What is XML?
XML is a way of adding intelligence to your documents. It lets you identify each element using meaningful tags and it lets you add information ("metadata") about each element.
XML is very much a part of the future of the Web, and part of the future for all electronic information.
XML is a syntax for marking up data and it works with many other technologies to display and process information. It looks and feels very much like HTML.
XML isn't going to replace everything else you've already learned; it complements it and extends it.
What's the Fuss About?
XML lets you make documents smarter, more portable, and more powerful -- that's the promise of XML and that's what all the fuss is about.
XML allows you to use your own tags to define parts of a document. You can do this because XML is a descriptive, not a procedural, language. That is, XML describes what something is rather than performing an action.
For example, take a look at the front page of a newspaper. You'll see different font sizes, different sections, and columns.
If you were to create a Web page for that newspaper--using the same formatting and styles--you would use HTML formatting tags to define the size and color of a large headline, or to italicize a word such as a byline, in order to distinguish it from the rest of the text.
But just try to write tags that actually explain that you've got a Headline and that the words "John Smith" make up a byline. HTML won't know what you're talking about if you create your own tags for a headline or a byline.
XML lets you define elements like these yourself, and other technologies such as CSS describe how to display them.
That means, in the future, when you're searching on the web for, say, a Barbie doll for your niece's birthday, you'll get Barbie the DOLL instead of some other type of Barbie, because the Barbie doll page might be marked up with tags that say the product is a doll.
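Here is a minimal sketch of that idea in Python, using the standard library's xml.etree.ElementTree module. The tag names (catalog, item, name, category) and the sample data are hypothetical, made up for this illustration. The point is only that a search can match on what an element is, not just on the word "Barbie".

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names and the data are invented for this sketch.
PAGES = """
<catalog>
  <item>
    <name>Barbie</name>
    <category>doll</category>
  </item>
  <item>
    <name>Barbie</name>
    <category>barbecue grill</category>
  </item>
</catalog>
"""

def find_items(xml_text, name, category):
    """Return the <item> elements whose name and category both match."""
    root = ET.fromstring(xml_text)
    return [
        item
        for item in root.iter("item")
        if item.findtext("name") == name
        and item.findtext("category") == category
    ]

dolls = find_items(PAGES, "Barbie", "doll")
print(len(dolls))
```

Both items share the same name, but only one of them is tagged as a doll, so only that one comes back.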
Pretty cool, huh?
XML documents can be moved to any format on any platform -- without the elements losing their meaning. That means you can publish the same information to a web browser, a PDA, or a network-enabled bread machine and each device would use the information appropriately.
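As a rough sketch of what "publish the same information to different devices" can mean, the Python below turns one small XML story into HTML for a browser and into plain text for a text-only device. The tag names (story, headline, byline) and the headline text are assumptions made for this example, and the two output styles are just one possible choice.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical markup for a newspaper story; the tag names are illustrative.
STORY = (
    "<story>"
    "<headline>Local Team Wins</headline>"
    "<byline>John Smith</byline>"
    "</story>"
)

def to_html(xml_text):
    """Render the story for a web browser."""
    story = ET.fromstring(xml_text)
    headline = escape(story.findtext("headline", default=""))
    byline = escape(story.findtext("byline", default=""))
    return f"<h1>{headline}</h1><p><i>{byline}</i></p>"

def to_plain_text(xml_text):
    """Render the same story for a text-only device."""
    story = ET.fromstring(xml_text)
    headline = story.findtext("headline", default="")
    byline = story.findtext("byline", default="")
    return f"{headline.upper()}\nby {byline}"

print(to_html(STORY))
print(to_plain_text(STORY))
```

The XML says only what each piece is. Each function decides for itself how a headline or a byline should appear on its device.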
The most important thing to remember about XML, though, is that it doesn't stand alone. It needs other technologies, like CSS, for you to see its results.
If all of this seems like a pain, and you don't want to mess with XML, it's OK. You don't need it to make a great web page. But you never know when organization will come in handy.
Where Did XML Come From?
XML is a simplified version of SGML and a cousin of HTML. It was developed by members of the W3C and released as a recommendation by the W3C in February 1998.
SGML, the parent of XML, is an international standard that has been in use as a markup language primarily for technical documentation and government applications since the early 1980s. It was developed to standardize the production process for large document sets. Think: Medical records. Company databases. Aircraft parts catalogs. Other really huge documents.
Marking up documents in SGML allows them to be passed from one system to the next without losing information. With databases marked up in SGML you can see what Widget A is all about and go check to see if Widget A is in stock.
Early on, people thought that SGML would be useful for the Web. In fact, HTML is really a very basic application of SGML! But HTML quickly became used for visual layout, so a group of people returned to the basics, determined to create something that had the strengths of SGML without being so difficult to implement -- and had the ease of use of HTML, but with more structural power. The result was XML.
The design goals of XML, taken from the XML Specification, are:
XML shall be straightforwardly usable over the Internet.
XML shall support a wide variety of applications.
XML shall be compatible with SGML.
It shall be easy to write programs which process XML documents.
The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
XML documents should be human-legible and reasonably clear.
The XML design should be prepared quickly.
The design of XML shall be formal and concise.
XML documents shall be easy to create.
Terseness in XML markup is of minimal importance.
In other words, XML is easy to create, easy to read, and designed for use over the Internet. What more could a Web designer ask for?
What Does XML Look Like?
If you've ever used HTML, XML is going to look very familiar!
When you view the source of a document written in XML, the first thing you'll see is the XML declaration, which in its simplest form looks like this:
<?xml version="1.0"?>
Then, in the body of the document, you'll see a lot of tags. The tags look familiar at first -- they start with the usual less-than sign (<) and end with the usual greater-than sign (>).
But then you'll notice that the tags might not be quite the names you've come to expect! You'll see tags that seem to have made-up names. In fact, if you view the source of an XML document, you'll see tags surrounding lots of words, maybe every word in the document. These tags define exactly what the content is. And the creator of the document had the power to create his or her own specific set of tags.
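To make that concrete, here is a small Python sketch that reads a tiny XML document and lists the tag names its author chose. The document itself, including the tag names recipe, title, and ingredient, is a made-up example; only the XML declaration on its first line follows a fixed form.

```python
import xml.etree.ElementTree as ET

# A tiny document: the XML declaration first, then author-defined tags.
# The tag names (recipe, title, ingredient) are hypothetical.
DOCUMENT = b"""<?xml version="1.0"?>
<recipe>
  <title>Toast</title>
  <ingredient>bread</ingredient>
  <ingredient>butter</ingredient>
</recipe>"""

def tag_names(xml_bytes):
    """Return each distinct tag name, in the order it first appears."""
    names = []
    for element in ET.fromstring(xml_bytes).iter():
        if element.tag not in names:
            names.append(element.tag)
    return names

print(tag_names(DOCUMENT))
```

None of these names is built into XML. The parser accepts them because the document is well formed, not because it recognizes them.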
Suppose you're looking at a Web page marked up in XML on The Canterbury Tales by Chaucer. You're looking specifically at lines 282-286 of "The Physician's Tale." The document source for that section might look like this:
The Physician's Tale
That no man woot therof but God and he.
For be he lewed man, or ellis lered,
He noot how soone that he shal been afered.
Therfore I rede yow this conseil take --
Forsaketh synne, er synne yow forsake.
The tags simply define that:
1) This document is the Canterbury Tales.
2) This section is the Physician's Tale.
3) Each line of the Physician's Tale is defined.
4) Each line ends, and the Physician's Tale and The Canterbury Tales end.
If the entire document were marked up like this, you could easily jump to a certain line or section. The entire document is annotated for easy reference and searching, and instead of viewing the entire document, users could request only specific sections of a document--simply by asking for the specific tags they want. Oh, and we don't recommend that you manually type out each line in the Canterbury Tales. Get a computer to count the lines for you.
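One way to let the computer do the counting is sketched below in Python. It wraps each of the five quoted lines in a numbered element and then fetches a single line by its number. The tag and attribute names (canterbury-tales, tale, title, line, number) are assumptions chosen for this sketch, not the markup of any particular edition.

```python
import xml.etree.ElementTree as ET

# The five lines quoted above, numbered 282-286 in the tale.
LINES = [
    "That no man woot therof but God and he.",
    "For be he lewed man, or ellis lered,",
    "He noot how soone that he shal been afered.",
    "Therfore I rede yow this conseil take --",
    "Forsaketh synne, er synne yow forsake.",
]

def build_tale(title, lines, first_number):
    """Wrap each line in a numbered <line> element (hypothetical tag names)."""
    root = ET.Element("canterbury-tales")
    tale = ET.SubElement(root, "tale")
    ET.SubElement(tale, "title").text = title
    # enumerate() does the line counting so nobody has to type the numbers.
    for number, text in enumerate(lines, start=first_number):
        line = ET.SubElement(tale, "line", number=str(number))
        line.text = text
    return root

def get_line(root, number):
    """Return the text of the line with the given number, or None."""
    # ElementTree supports a limited XPath subset, including attribute tests.
    line = root.find(f".//line[@number='{number}']")
    return None if line is None else line.text

tales = build_tale("The Physician's Tale", LINES, first_number=282)
print(get_line(tales, 286))
```

Asking for line 286 returns just that line, which is the kind of targeted request the markup makes possible.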