
 
 

XML: The Universal Publishing Format


 
Jon Bosak
Online Information Technology Architect
Sun Microsystems
901 San Antonio Rd
MPK17-101
Palo Alto, California 94303, USA
Phone: 001 650 786 6820
Fax: 001 650 786 5727
Email: Jon.Bosak@eng.Sun.COM
 
Biographical notice:
 
Jon Bosak
 
Jon Bosak is the Online Information Technology Architect for Sun Microsystems. He started and organized the XML Working Group of the World Wide Web Consortium (W3C) in 1996 and has served as Chairman of the XML WG since its inception. He is also a member of the W3C XSL Working Group, the W3C Hypertext Coordination Group, and the W3C Metadata Coordination Group. He is Sun's representative to ISO/IEC JTC1/WG4, which is the international standards group responsible for SGML, HyTime, and DSSSL, and is Sun's representative to its U.S. national counterpart, NCITS V1. He is a member of the International World Wide Web Conference Committee, a founding member of SGML Open (now the Organization for the Advancement of Structured Information Standards), and was for several years a sponsor of the Davenport Group, which maintains the industry-standard DocBook markup language for software documentation used by SunSoft and a number of other software vendors. Jon was a primary contributor to the SGML-based "AnswerBook2" Web strategy used for the distribution of Solaris documentation. Before joining Sun, he was responsible for architecting the SGML-based delivery system used by Novell to put its documentation on CDs and later on the World Wide Web.
 
ABSTRACT:
 
Lost in the excitement over the technical vistas opened by XML is its revolutionary potential to free users from the tyranny of proprietary publishing formats. XML can do for data what Java has done for programs: make it platform- and vendor-independent. And in conjunction with XSL, XML can replace the Babel of incompatible binary word processing formats with open, human-readable standards that may radically change both electronic and print publishing.
 
Much has been made recently of the role that XML will play in moving data across the World Wide Web. And there can be no doubt that XML will foster an explosion of new applications in areas such as metadata, interprocess communication, object serialization, and database exchange. XML is now considered by the World Wide Web Consortium to be the foundation syntax for many of its current and future specifications in fields ranging from content ratings to database schemas and the representation of mathematical formulae.
 
But lost in the excitement over the vistas opened by XML in a data-centric online world are the implications of the coming widespread availability of what is, after all, just a streamlined version of a time-tested and well-understood document representation technology. The fact that XML has suddenly been found useful by the online database community in no way lessens its ability to perform the publishing function for which it was originally designed -- it just ensures that this technology will be distributed throughout a population that is orders of magnitude larger than when this same technology was marketed as SGML. In other words, the traditional technical and social agenda of SGML can now be realized on a world-wide scale. And what many people in the traditional SGML community already know, but the world at large has yet to understand, is that the SGML agenda now being carried forward by XML is profoundly revolutionary.
 
Consider what SGML is all about from the user's standpoint. SGML is basically about the ownership of content. SGML says that content belongs to its creators, not to the makers of document creation tools. The alternative that SGML offers has always been very clear: either create your content in a proprietary word processing or desktop publishing format and bind yourself to a perpetual upgrade relationship with a particular vendor, or choose the SGML road and work with vendors whose business case is built on interoperability. XML does nothing to change this basic choice. What it does is to guarantee that the user-empowering, vendor-independent approach will relatively quickly go from a position espoused by a few true believers to the foundation upon which the world will be building its online communication infrastructure. It means that doing the right thing will become the majority view. And this means a fundamental shift in the relationship between the producers and consumers of document creation software.
 
This is a bold proposition, but it flows from what I believe to be an inevitable development in the progress of XML. In itself, XML, like SGML, conveys no semantic information. In particular -- unlike a specific tag set such as HTML -- the XML metalanguage does not and cannot convey information about presentation. All that an XML document can convey in isolation is structure and character data. For presentation, or indeed any other kind of behavior to occur, an XML document must be accompanied by something else that provides meaning for its markup. The "something else" can take many forms -- a custom program or script, an industry agreement about semantics, or a set of Java beans, to take some obvious examples. In publishing, the "something else" will typically be a stylesheet. So when we talk about the use of XML in publishing, we typically mean XML used in conjunction with a stylesheet language that instructs a formatter or display environment in how to treat the elements of the document. Such a language must be equally capable of meeting the requirements of both print and online publishing. The stylesheet language that is being developed specifically for this purpose is XSL.
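
The separation described above can be illustrated with a minimal Python sketch. It is not XSL and does not claim to show how an XSL formatter works; the tag names (memo, title, para, emph), the two style tables, and the render function are all hypothetical, invented only for this illustration. The point is that the same document, which carries nothing but structure and character data, yields different presentations depending on which "something else" accompanies it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the markup names structure only, not appearance.
DOC = "<memo><title>Status</title><para>All systems <emph>nominal</emph>.</para></memo>"

# Two hypothetical "stylesheets" (assumptions for this sketch, not XSL).
# Each maps an element name to the text emitted before and after its content.
PLAIN_STYLE = {"title": ("== ", " ==\n"), "para": ("", "\n"), "emph": ("*", "*")}
HTML_STYLE = {"title": ("<h1>", "</h1>"), "para": ("<p>", "</p>"), "emph": ("<em>", "</em>")}


def render(element, style):
    """Walk the element tree and wrap each element as the style table directs."""
    prefix, suffix = style.get(element.tag, ("", ""))
    parts = [prefix, element.text or ""]
    for child in element:
        parts.append(render(child, style))
        # Character data that follows the child, still inside this element.
        parts.append(child.tail or "")
    parts.append(suffix)
    return "".join(parts)


root = ET.fromstring(DOC)
plain = render(root, PLAIN_STYLE)  # one presentation of the content
html = render(root, HTML_STYLE)    # another presentation of the same content
```

Neither rendering is implied by the document itself. Swapping the style table changes the output without touching the content, which is the property the argument in this paper depends on.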
 
The observation on which I base my position is simply this: The combination of XML and XSL can replace most existing proprietary word processing and desktop publishing formats with a single set of open standards for document content and style. The implication is equally simple: in a world where all popular document content and management tools use standardized interoperable document formats, users will no longer be held hostage to the upgrade strategies of particular vendors. And a world in which users are free to change vendors without putting their existing document base at risk is one with a very different dynamic from the one we have come to know in the publishing industry.
 
While such a development would obviously be very much to the advantage of commercial publishers and everyone else who produces documents on a large scale, it would equally obviously be to the disadvantage of vendors who rely on proprietary formats to lock in a user base as an essential part of their marketing strategy. It is only reasonable to expect such commercial interests to attempt to subvert the basically open nature of the XML family of standards and re-establish proprietary control within the new framework by introducing nonstandard features or by limiting the power of standardized output mechanisms. Such efforts must be anticipated and resisted if we as users are ultimately to win the struggle to own our own data.
