| DuCharme, Bob Moody's Investors Service New York | Bob DuCharme |
| Assistant Vice President |
| Moody's Investors Service |
| 99 Church St.
New York
(New York)
(10007)
Email: bob@snee.com |
Beyond HTML "Links" |
| XLink | XLink offers some of the most exciting yet confusing possibilities for XML applications of the future. While most XML-related standards give us the XML way to perform tasks that people have performed with other software and standards for years (for example, XSL gives us ways to rearrange and format structured data), XLink brings a new range of capabilities that many people can't even picture how to use. While many of these features have been available in the HyTime and hypertext research communities for years, XLink will make them available on such a scale that we can look forward to completely new products built on these capabilities. Where will these products come from and what will they do? Clues and inspiration are around us if we look hard enough. |
| HTML linking | First, to really understand XLink's power, we have to move beyond the narrow definition of "link" in most Web users' minds: "a string of text or an image that, when clicked, displays a different Web document or a different portion of the same document." Beyond this definition, the possibilities start to open up. |
| One advantage in particular confuses many people about XLink: the ability to create a link between two resources that you can't edit. Why would you want to do that? What value would such a link provide? For a clue, we can look to the relational database world, which has been doing it for years. |
Relational Databases and Links |
| If you're designing a database to keep track of a mail-order catalog's orders, you wouldn't set up a single table to store the customer's first name, last name, address, credit card number, the date of the order, the stock number of the item, the quantity of the item, the color of the item, and the item's supplier. After Meryl X. Customer ordered more than one item, you'd have multiple copies of her address and other fields, which wastes storage space and makes updates more difficult. |
| Instead, you'd store the customer's name, address, credit card number, and customer number in one table, the item's stock number, supplier number, and available sizes in another table, and the supplier number and relevant supplier information in a third table. When Meryl (customer MXC4255) orders her two pairs of jeans (stock number 81664), you'd add a new record to a fourth table, the more concise "orders" table that merely stores the customer number, the stock number, the quantity, and the date. |
| An invoicing program could then use this record to look up customer MXC4255's name and stock number 81664's name and then print the customer name and item name on the same invoice, even though they were stored separately. The new record in the orders table that makes this possible defines a relationship between customer MXC4255 and stock number 81664. In other words, it links them. |
| out-of-line links | This order record demonstrates two important things about a link. First, we don't need to edit either of the linked resources to create the link between them: creating the link didn't require any changes to the customer table or stock table. XLink calls such links "out-of-line" links to distinguish them from in-line links, which include linking information in one end of the link (for example, in the HREF attribute of an HTML A element). Second, the link created by the new record in the orders table shows how a link can add useful new information about the relationship, in this case the quantity ordered and the order date. |
| In some ways, this link is more important than the information it links. It's important for a company to keep track of its inventory and potential customers, but orders bring in revenue. A collection of order "links" adds further value by making it possible to analyze buying patterns based on other information such as which items were purchased at which times of year. |
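| The orders-table-as-link idea can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration: the table names, column names, item name, and order date are assumptions, while the customer number, stock number, and quantity come from the example above. |

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_no TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE items (stock_no INTEGER PRIMARY KEY, name TEXT);
    -- The orders table plays the role of the out-of-line link: it points
    -- at both resources and carries data of its own (quantity, date).
    CREATE TABLE orders (
        customer_no TEXT REFERENCES customers(customer_no),
        stock_no INTEGER REFERENCES items(stock_no),
        quantity INTEGER,
        order_date TEXT
    );
""")
conn.execute("INSERT INTO customers VALUES ('MXC4255', 'Meryl X. Customer')")
conn.execute("INSERT INTO items VALUES (81664, 'Jeans')")

# Creating the link changes neither the customers nor the items table.
# The order date is a made-up value for illustration.
conn.execute("INSERT INTO orders VALUES ('MXC4255', 81664, 2, '1999-10-13')")

# An invoicing program traverses the link to reach both linked records.
invoice_lines = conn.execute("""
    SELECT c.name, i.name, o.quantity, o.order_date
    FROM orders AS o
    JOIN customers AS c ON c.customer_no = o.customer_no
    JOIN items AS i ON i.stock_no = o.stock_no
""").fetchall()
print(invoice_lines)
```

| The join is the link traversal: the invoice combines data from two tables that were never edited when the order was recorded. |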
Out-of-Line Link Opportunities and Documents |
| This all seems obvious enough with relational databases. How can out-of-line links add value to XML documents, or for that matter, to HTML documents? Consider the kinds of professions that derive value from perceiving previously unnoticed relationships. For example, a financial analyst, reporter, or investor might base a recommendation to buy, a news story, or an investment on some interesting connection between a press release, a quarterly report, and a product description. Identifying these relationships creates useful information that people—for example, reporters, financial analysts, and especially investors—would pay for. |
| Existing collections of vague, unfocused Web links offer opportunities for the creation of more tightly defined and easily navigable link collections. Web rings, for example, connect multiple sites devoted to the same topic by merely adding a "here's one more site on the same topic" link to each site. Continuing to click this link on each site circles the entire "ring" of Web sites. While you may not find any new value to add to the over two hundred Web rings currently listed at http://www.webring.com for the Backstreet Boys, the same Web site lists Web ring categories ranging from electronic commerce to real estate to medicine—all fertile areas for locating diffuse collections of information that would provide more value if they were linked using XLink's power to impose coherency on a link collection. |
| Another inspiration is to consider information that an organization doesn't want linked, and why. A software company probably doesn't want links from its on-line software documentation to bug reports on each feature, but anyone reading the documentation would love to know about the reported bugs associated with a particular feature before using it. The following example links Microsoft bug reports with the corresponding Microsoft documentation, even though I certainly have no write access to any of these Microsoft Web pages. (The title and href values were all real as of the uriChecked date; I didn't make them up.) |
<?xml version="1.0"?>
<!DOCTYPE buglinks [
<!ELEMENT buglinks (buglink+)>
<!-- Declared so that the namespace declaration on the root is valid. -->
<!ATTLIST buglinks xmlns:xlink CDATA #FIXED "http://www.w3.org/XML/XLink/0.9">
<!ELEMENT buglink (doc,bug)>
<!ELEMENT doc (title)>
<!ATTLIST doc xlink:type CDATA #FIXED "extended"
              xlink:href CDATA #REQUIRED
              uriChecked CDATA #IMPLIED>
<!ELEMENT bug (title)>
<!ATTLIST bug xlink:type CDATA #FIXED "extended"
              xlink:href CDATA #REQUIRED
              uriChecked CDATA #IMPLIED>
<!ELEMENT title (#PCDATA)>
]>
<buglinks xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <buglink>
    <doc uriChecked="19991013"
         xlink:href="http://support.microsoft.com/support/serviceware/word/wrd97/E9KMAZE48.ASP">
      <title>How to: Add bullets or numbers to an existing list in Word 97.</title>
    </doc>
    <bug uriChecked="19991013"
         xlink:href="http://support.microsoft.com/support/kb/articles/Q109/1/60.asp">
      <title>WD: Numbering Command Renumbers Lines That Begin with Numbers</title>
    </bug>
  </buglink>
  <buglink>
    <doc uriChecked="19991013"
         xlink:href="http://msdn.microsoft.com/scripting/default.htm?/scripting/VBScript/doc/vsfctTimeValue.htm">
      <title>TimeValue Function</title>
    </doc>
    <bug uriChecked="19991013"
         xlink:href="http://support.microsoft.com/support/kb/articles/Q117/6/99.asp">
      <title>WD: TimeValue() Function Returns Incorrect Time Value</title>
    </bug>
  </buglink>
</buglinks>
|
| How will such an out-of-line link be implemented? There aren't many choices now, but there will be eventually. The simplest implementation would be to convert this to an HTML Web page titled "Documentation/Bug Report Links" that shows the pairs of links using plain old A HREF one-way links. A more sophisticated approach would be a client- or server-side program that performs two steps: |
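| 1. Look up the requested documentation page in the buglinks document to find any bug reports linked to it. |
| 2. Display that page with a marker for each linked bug report that leads to the report itself. |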
| Such a program wouldn't be challenging to write in Java, Perl, or Python. To use it, Meryl X. Customer might enter http://support.microsoft.com/support/serviceware/word/wrd97/E9KMAZE48.ASP into the appropriate field of your MS-BugCheck Web page and then see the "How to: Add bullets or numbers to an existing list in Word 97" documentation page appear with a bug picture inserted at the top. When she passes her cursor over the bug picture, the text "WD: Numbering Command Renumbers Lines That Begin with Numbers (Oct. 13, 1999)" pops up; when she clicks the bug picture, she jumps right to that bug report. |
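| A minimal Python sketch of such a program follows, under the assumption that it first looks up the requested URL in the link collection and then builds the markup to insert at the top of the page. The function names and the bug.gif image are hypothetical, the date is shown as stored rather than reformatted, and fetching and rewriting the documentation page itself is left out. The lookup accepts the href attribute with or without the xlink prefix. |

```python
import html
import xml.etree.ElementTree as ET

# Namespace URI taken from the buglinks example.
XLINK_HREF = "{http://www.w3.org/XML/XLink/0.9}href"

BUGLINKS = """<buglinks xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <buglink>
    <doc uriChecked="19991013"
         xlink:href="http://support.microsoft.com/support/serviceware/word/wrd97/E9KMAZE48.ASP">
      <title>How to: Add bullets or numbers to an existing list in Word 97.</title>
    </doc>
    <bug uriChecked="19991013"
         xlink:href="http://support.microsoft.com/support/kb/articles/Q109/1/60.asp">
      <title>WD: Numbering Command Renumbers Lines That Begin with Numbers</title>
    </bug>
  </buglink>
</buglinks>"""


def href_of(element):
    """Return the link target, with or without the xlink: prefix."""
    return element.get(XLINK_HREF) or element.get("href")


def bugs_for(doc_url, root):
    """Find (href, title, date checked) for every bug linked to doc_url."""
    found = []
    for buglink in root.iter("buglink"):
        doc, bug = buglink.find("doc"), buglink.find("bug")
        if doc is not None and bug is not None and href_of(doc) == doc_url:
            found.append((href_of(bug),
                          bug.findtext("title", default=""),
                          bug.get("uriChecked", "")))
    return found


def bug_banner(bugs):
    """Build the HTML to insert at the top of the documentation page."""
    return "\n".join(
        '<a href="{0}" title="{1} (checked {2})">'
        '<img src="bug.gif" alt="bug"></a>'.format(
            html.escape(href), html.escape(title), html.escape(checked))
        for href, title, checked in bugs)


root = ET.fromstring(BUGLINKS)
doc_url = ("http://support.microsoft.com/support/serviceware/"
           "word/wrd97/E9KMAZE48.ASP")
bugs = bugs_for(doc_url, root)
print(bug_banner(bugs))
```

| The title attribute supplies the pop-up text and the href supplies the jump to the bug report, so the one-way HTML anchor is generated from the out-of-line link rather than written by hand. |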
| Future browsers, whether on PCs or PDAs, will add more widgets that offer more possibilities when implementing links that take advantage of the XLink spec, and we can then build a more sophisticated application around the same buglinks document type. Ultimately, it's not XLink's job to worry about these implementation issues. Just as XML lets you identify certain words as glossary keywords and others as emphasized text with no concern about the fonts used to eventually render these elements, XLink lets you encode relationships independently of implementation details so that the same data can take advantage of future implementation innovations as well as current possibilities. |
Multi-way Links and Multiple Link Ends |
| In HTML, simulated two-way links are possible; two A elements can both have NAME attributes identifying themselves and HREF attributes pointing at each other. This isn't really a two-way link; it's actually two coordinated one-way links, but the end-user won't know the difference. |
| You'll certainly know the difference if you try to scale it up. If an on-line clothing catalog has links between shirts, ties, and pants that go well together, each link could be a three-ended link. (That is, while the buglink linking element listed two resources to be linked, an outfit linking element could list three: pointers to a tie, a shirt, and pants.) Certain resources would take part in multiple three-way links; a white shirt and a pair of khaki pants can work with a lot of different ties. Every possible linkage must be traversable in both directions, because you want someone who just bought a robin's egg blue shirt to find the red tie with robin's egg blue stripes just as easily as you want someone who just ordered that tie to traverse the same link in the opposite direction to find the robin's egg blue shirt. |
| Implementing every link in every direction as an individual A HREF HTML element by hand would be a nightmare, especially when a particular item was added to or removed from the catalog. For now, when such links can only be implemented as a network of A HREF links, it still makes sense to store them as XLink extended links, because XLink-friendly software can automate the creation of all that HTML markup. With the relationships encoded as XLink elements, you could then take advantage of new implementation opportunities once they become available. |
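| To give a sense of what automating that markup might involve, here is a small Python sketch that expands three-ended outfit links into every one-way link needed to traverse them in all directions. The page paths, the link text, and the function names are invented for illustration, and the outfits are plain dictionaries rather than parsed XLink elements. |

```python
from itertools import permutations

# Hypothetical three-ended "outfit" links; all paths are made up.
outfits = [
    {"shirt": "/shirts/white.html",
     "tie": "/ties/red-stripe.html",
     "pants": "/pants/khaki.html"},
    {"shirt": "/shirts/white.html",
     "tie": "/ties/navy.html",
     "pants": "/pants/khaki.html"},
]


def one_way_links(outfits):
    """Expand each multi-ended link into every ordered (source, target) pair."""
    pairs = set()
    for outfit in outfits:
        # A three-ended link yields 3 * 2 = 6 one-way links.
        pairs.update(permutations(outfit.values(), 2))
    return pairs


def anchors_for(page, pairs):
    """HTML anchors to place on one page, sorted for stable output."""
    return ['<a href="{}">goes with</a>'.format(target)
            for source, target in sorted(pairs) if source == page]


pairs = one_way_links(outfits)
print(len(pairs), "one-way links generated")
for anchor in anchors_for("/pants/khaki.html", pairs):
    print(anchor)
```

| Adding or removing an item then means editing one outfit entry and regenerating the anchors, rather than hunting down every page that points at it. |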
Adding Value with Link Typing |
| link typing | The catalog example demonstrates another new advantage of XLink: one-to-many links. A single resource, such as a Web page describing a pair of khaki pants, can be linked to multiple destinations in a single link element. But wouldn't all the possible shirt and tie links from the khaki pants page be confusing? Traversing to all thirty Web pages for the various shirts that go with the khaki pants would be tiresome. |
| Fortunately, XLink provides a mechanism to make it easy for users to navigate such a wide choice of links: typed links. The xlink:role attribute lets you assign any category you wish to any link, and an application can use this information to give the user clues about the potential usefulness of each link. For example, if the clothing catalog links had possible role values of "summer," "fall," "winter," and "spring," the application could display links from the khaki pants with icons to help the user narrow down exactly which shirts will best fill his needs. (For an excellent example of the use of link typing to aid navigation, see Tim Bray's on-line annotated XML specification at http://www.xml.com. His spec link elements each have link types of Using, History, Tech, Misc, or Example to identify the purpose of each annotation; his HTML version created from these links shows the different purpose of each link using red bitmaps of the single circled letters U, H, T, M, or E.) |
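| As a rough illustration of how an application might use link typing, the following Python sketch groups the targets of a one-to-many link by their xlink:role values. The goeswith and item element names and the page paths are assumptions made up for this example; only the xlink:role attribute and the seasonal role values come from the discussion above. |

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XLINK = "http://www.w3.org/XML/XLink/0.9"  # namespace URI from the buglinks example

# Hypothetical one-to-many link from a pants page to several shirt pages.
CATALOG_LINK = """<goeswith xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <item xlink:href="/pants/khaki.html"/>
  <item xlink:href="/shirts/linen.html" xlink:role="summer"/>
  <item xlink:href="/shirts/flannel.html" xlink:role="winter"/>
  <item xlink:href="/shirts/polo.html" xlink:role="summer"/>
</goeswith>"""


def targets_by_role(link_xml):
    """Group link targets by role so each group can get its own icon."""
    grouped = defaultdict(list)
    for item in ET.fromstring(link_xml).iter("item"):
        role = item.get("{%s}role" % XLINK)
        if role is not None:  # the untyped item is the starting resource
            grouped[role].append(item.get("{%s}href" % XLINK))
    return dict(grouped)


grouped = targets_by_role(CATALOG_LINK)
for role, targets in sorted(grouped.items()):
    print(role, targets)
```

| A display layer could then attach one icon per role, much as the annotated specification uses one letter per link type. |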
| In addition to XLink's xlink:role attribute, you can add any other attributes and subelements you want to your linking elements. They are, after all, XML element types that you're declaring yourself. XLink's ability to define more structure for these increasingly complex links makes them easier to control and, as with XML itself, leads to greater possibilities for automated processing into electronic products. |
| By easing this automated processing, the added structure makes it easier to scale up to larger systems and easier to maintain those systems—for example, to identify and handle broken links on a regular schedule, thus ensuring a more robust, higher quality product. Beyond system maintenance, more creative applications of XLink's features will encourage the development of new services and new products to build around these services. For example, added information in each linking element can provide filtering opportunities that let you charge different rates for different categories of access. Customized link sets related to specific content or subscription levels could be sold as different products from the same link collection, so that when Meryl X. Customer wants to upgrade from basic service to extended service, a simple change in her customer record could grant her access to all links (and the information they link to) instead of the subset that the marketing department selected as the basic access set. |
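| The subscription idea could be sketched in Python as a filter over one link collection. Everything here is hypothetical: the Link class, the tier values, the sample paths, and the ACCESS table are stand-ins for whatever extra attributes a real linking element and customer record would carry. |

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    tier: str  # hypothetical extra attribute: "basic" or "extended"


# One link collection serves every product level; paths are made up.
LINKS = [
    Link("/docs/feature-a.html", "/bugs/101.html", "basic"),
    Link("/docs/feature-a.html", "/bugs/102.html", "extended"),
    Link("/docs/feature-b.html", "/bugs/201.html", "extended"),
]

# Which link tiers each service level may see.
ACCESS = {
    "basic": {"basic"},
    "extended": {"basic", "extended"},
}


def visible_links(links, service_level):
    """Return only the links that the customer's service level allows."""
    allowed = ACCESS[service_level]
    return [link for link in links if link.tier in allowed]


print(len(visible_links(LINKS, "basic")), "links at basic level")
print(len(visible_links(LINKS, "extended")), "links at extended level")
```

| Upgrading a customer changes only the service level passed to the filter; the link collection itself stays the same. |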
| Before the Web, many businesses made money selling publications with free information that, while not copyrighted, was still worth the money because of the convenience of having that information handy. Now that most of this information (for example, television listings, stock prices, or the U.S. tax code) is available for free in convenient form on the Web, an old-fashioned source of revenue is dwindling away. However, new technologies are creating possibilities for previously unheard-of revenue sources, and XLink is the most wide-open of these technologies. Once the W3C standard reaches Recommendation status, XLink will bring exciting new capabilities to the vast majority of us with little background in HyTime and hypertext research. Its power will let us take distributed hypertext to the next level of creativity and commerce. |