There are thousands of formally established XML applications from the W3C and other standards bodies, such as OASIS and the Object Management Group. There are even more informal, unstandardized applications from individuals and corporations, such as Microsoft's Channel Definition Format and John Guajardo's Mind Reading Markup Language. This book cannot cover them all, any more than a book on Java could discuss every program that has ever been or might ever be written in Java. This book focuses primarily on XML itself. It covers the fundamental rules that all XML documents and authors must adhere to, from a web designer who uses SMIL to add animations to web pages to a C++ programmer who uses SOAP to exchange serialized objects with a remote database.
This book also covers generic supporting technologies that have been layered on top of XML and are used across a wide range of XML applications. These technologies include:
XLink
An attribute-based syntax for hyperlinks between XML and non-XML documents that provides the simple, one-directional links familiar from HTML, multidirectional links between many documents, and links between documents to which you don't have write access.
XSLT
An XML application that describes transformations from one document to another in either the same or different XML vocabularies.
XPointer
A syntax for URI fragment identifiers that selects particular parts of the XML document referred to by the URI; it is often used in conjunction with an XLink.
XPath
A non-XML syntax used by both XPointer and XSLT for identifying particular pieces of XML documents. For example, an XPath can locate the third address element in the document or all elements with an email attribute whose value is elharo@metalab.unc.edu.
XInclude
A means of assembling large XML documents by combining other complete documents and document fragments.
Namespaces
A means of distinguishing between elements and attributes from different XML vocabularies that have the same name; for instance, the title of a book and the title of a web page in a web page about books.
Schemas
An XML vocabulary for describing the permissible contents of XML documents from other XML vocabularies.
SAX
The Simple API for XML, an event-based application programming interface implemented by many XML parsers.
DOM
The Document Object Model, a language-neutral, tree-oriented API that treats an XML document as a set of nested objects with various properties.
XHTML
An XMLized version of HTML that can be extended with other XML applications, such as MathML and SVG.
RDDL
The Resource Directory Description Language, an XML application based on XHTML for documents placed at the end of namespace URLs.
All these technologies, whether defined in XML (XLink, XSLT, namespaces, schemas, XHTML, XInclude, and RDDL) or in another syntax (XPointer, XPath, SAX, and DOM), are used in many different XML applications.
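To make two of these entries concrete, here is a minimal sketch of the XPath and namespace examples mentioned above, written with Python's standard xml.etree.ElementTree module. The sample document, the contacts root element, the bk and html prefixes, and the http://example.com/ns/book namespace URI are all invented for illustration. ElementTree implements only a subset of XPath 1.0, so this shows the idea rather than the full language.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Only the "address" element, the "email"
# attribute, and the e-mail value come from the text; the rest is assumed.
DOC = """\
<contacts xmlns:bk="http://example.com/ns/book"
          xmlns:html="http://www.w3.org/1999/xhtml">
  <address email="a@example.com">First</address>
  <address email="elharo@metalab.unc.edu">Second</address>
  <address email="c@example.com">Third</address>
  <bk:title>XML in a Nutshell</bk:title>
  <html:title>Books about XML</html:title>
</contacts>
"""

root = ET.fromstring(DOC)

# XPath: the third address child of the root element (positions start at 1).
third = root.find("address[3]")

# XPath: every descendant element whose email attribute has the given value.
matches = root.findall(".//*[@email='elharo@metalab.unc.edu']")

# Namespaces: both elements have the local name "title", but each belongs
# to a different namespace, so they can be told apart.
ns = {
    "bk": "http://example.com/ns/book",      # assumed URI for a book vocabulary
    "html": "http://www.w3.org/1999/xhtml",  # the XHTML namespace
}
book_title = root.find("bk:title", ns)
page_title = root.find("html:title", ns)

print(third.text)
print([m.text for m in matches])
print(book_title.text, "|", page_title.text)
```

In this sample, all the address elements are children of the root, so the positional test address[3] picks out the third one in the document. In a deeper tree, a positional predicate counts siblings under each parent, so a different expression would be needed.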
This book does not provide in-depth coverage of XML applications that are relevant to only some users of XML, such as:
SVG
Scalable Vector Graphics, a W3C-endorsed standard XML encoding of line art.
MathML
The Mathematical Markup Language, a W3C-endorsed standard XML application used for embedding equations in web pages and other documents.
RDF
The Resource Description Framework, a W3C-standard XML application used for describing resources, with a particular focus on the sort of metadata one might find in a library card catalog.
Occasionally we use one or more of these applications in an example, but we do not cover all aspects of the relevant vocabulary in depth. While interesting and important, these applications (and thousands more like them) are intended primarily for use with special software that knows their formats intimately. For instance, most graphic designers do not work directly with SVG. Instead, they use their customary tools, such as Adobe Illustrator, to create SVG documents. They may not even know they're using XML.
This book focuses on standards that are relevant to almost all developers working with XML. We investigate XML technologies that span a wide range of XML applications, not those that are relevant only within a few restricted domains.
What's New in the Third Edition
XML has not stood still in the two years since the second edition of XML in a Nutshell was published. The single most obvious change is that this edition now covers XML 1.1. However, the genuine changes in XML 1.1 are not as large as a .1 version number increase would imply. In fact, if you don't speak Mongolian, Burmese, Amharic, Cambodian, or a few other less common languages, there's very little new material of interest in XML 1.1. In almost every way that practically matters, XML 1.0 and 1.1 are the same. Certainly there's a lot less difference between XML 1.0 and XML 1.1 than there was between Java 1.0 and Java 1.1. Therefore, we will mostly discuss XML in this book as one unified thing, and only refer specifically to XML 1.1 on those rare occasions where the two versions are in fact different. Probably about 98% of this book applies equally well to both XML 1.0 and XML 1.1.
We have also added a new chapter covering XInclude, a recent W3C invention for assembling large documents out of smaller documents and pieces thereof. Elliotte is responsible for almost half of the early implementations of XInclude, as well as having written possibly the first book that used XInclude as an integral part of the production process, so it's a subject of particular interest to us. Other chapters throughout the book have been rewritten to reflect the impact of XML 1.1 on their subject matter, as well as independent changes their technologies have undergone in the last two years. Many topics have been upgraded to the latest versions of various specifications, including:
SAX 2.0.1
Namespaces 1.1
DOM Level 3
XPointer 1.0
Unicode 4.0.1
Finally, many small errors and omissions were corrected throughout the book.