
 

The Document as Application: Issues and Implications

Tony Stewart
Director of Consulting
RivCom
945 West End Avenue
New York, New York 10025, USA
Phone: +1 212 222 4332
Fax: +1 212 662 6800
Email: tony.stewart@rivcom.com
Web: www.rivcom.com
 
Biographical notice:
 
Tony Stewart is Director of Consulting of RivCom, a publishing services company based in New York City and Swindon, England, that specialises in developing and delivering XML-based interactive documents. RivCom was the first company to demonstrate XML documents being displayed in an industry-standard browser (at the WWW6 conference in 1997), and it continues to focus on developing innovative document-centric applications of XML technology.
 
While Tony has worked with XML for two years, he brings twenty years of prior experience that is directly relevant to these emerging technologies. Prior to joining RivCom, Tony spent ten years developing database application software, including five years as Vice President for Software Development at Riverside Software. There he led the development of a Windows-based database-management application for the administration of philanthropic activities that is used by two of the five largest charities in the world. Before that, he spent ten years as an award-winning documentary and corporate filmmaker. Tony graduated from Yale University in 1977 with a degree in English Literature.
 
ABSTRACT:
 
The XML (Extensible Markup Language) family of standards is changing the ways in which people interact with documents. Where traditionally documents have served as static snapshots of information, the combination of XML, XSL (Extensible Style Language) and XLink (Extensible Linking Language) will allow us to deliver documents that look and behave very much like interactive software programs. A document designer now has to consider not only how the document should look, but also how it should behave, and how it will fit into the information architecture of which it is a part. Similarly, an information architect who is designing a system must consider which goals of the project are best served by traditional software development, and which should be filled by XML-based interactive documents.
 
This presentation will introduce these issues at an abstract level, then discuss them in concrete terms by examining a project RivCom recently completed for an e-commerce client. This application allows merchants who are setting up a new on-line store to specify the products they wish to sell, and to customise the presentation of those products and the overall look of their store. Although the application behaves in many ways like a traditional software application, its architecture is based on a set of XML documents and style sheets that are assembled and styled on the fly in the user's browser. Thus, it demonstrates both the utility and flexibility of XML plus styling and linking languages, and also the range of issues that will face document and information designers as they adopt XML.
 

Introduction

 
The XML family of standards is changing the ways in which people interact with documents. Where traditionally documents have served as static snapshots of information, the combination of XML, XSL and XLink will allow us to deliver documents that look and behave very much like interactive software programs.
 

A document designer now has to consider not only how the document should look, but also how it should behave, and how it will fit into the information architecture of which it is a part. Similarly, an information architect who is designing a system must consider which goals of the project are best served by traditional software development, and which should be filled by XML-based interactive documents.
 

At the WWW6 Conference in Santa Clara in April 1997, RivCom gave the first public demonstration of XML content being presented through the use of stylesheets in an industry-standard browser. RivComet™, the technology used in that demo, has now been enhanced and is available as an ActiveX control running under Internet Explorer 4.0 and later.
 
RivComet has been used to deliver a number of XML applications for Shell International and other companies. One of these, code-named the "Storefront Demo", is the basis of this presentation.
 
This paper describes the application in action, then looks more closely at the XML structures on which it is based. Finally, it examines the issues that this kind of project raises, starting with the concrete "why did they do it this way?" and moving outwards to a generalised discussion of the strengths and weaknesses (and costs and benefits) of using XML-based interactive documents in an information architecture.
 

Application overview

 
RivCom developed the Storefront Demo for a Silicon Valley startup company that was seeking its first round of major financing. They had already assembled the usual slide show and business plan, but they felt that these in themselves were insufficiently persuasive. They asked us to build a prototype application that would demonstrate the feasibility and usefulness of the technology they planned to develop. Because they were still in a fund-raising stage, the demo needed to be put together quickly and inexpensively, and as a proof-of-concept prototype it did not need to resolve every technical problem. However, it did need to be sufficiently complete that a savvy viewer could see how the same core technologies, applied on a larger scale with sufficient funding, could realistically address the entire range of underlying issues.
 
As of this writing, both the name of the startup company and some aspects of their business plan are confidential. In order to explain this demo I will describe an imaginary business and explain how the Storefront Demo fits the needs of this imaginary business. In some ways my imaginary example doesn't quite make sense. However, please don't let that distract you from the purpose of this paper, which is to explore the software rather than the business that commissioned it.
 

The Hypothetical Business

 
"Sell-It" is a large retail company with stores in many different regions. Because shopping habits vary from region to region, Sell-It's managers have been encouraged to customise their stores' inventories, prices and aisle displays to suit the expectations of shoppers in that region. They can even invent different names for their stores. However, all Sell-It stores actually receive their inventory from the same central warehouse, where a single database keeps track of what items are available for sale at any given time.
 
Sell-It has just decided to establish a Web presence. In order to carry its business model onto the Web, Sell-It's central management will encourage each regional store to maintain its own Web site. Each regional manager will have the same flexibility with their web sites that they do with their physical stores: they can select the inventory, set pricing for each item, and customise the look of the web site in various ways. However, the core information displayed in each site will be drawn from the company's central inventory. Thus, each site will combine inventory information and a general layout that is shared by all sites, with customisations to both content and formatting that apply only to that specific site.
 
Luckily for Sell-It, their inventory information is held in a database that can export information in the form of XML structures. Thus, it is possible to build a Web application that will receive inventory information in the form of XML documents, and allow managers to customise the presentation of this information by recording additional XML information about it.
 

The Storefront Demo

 
In order to finance their expansion onto the web, Sell-It has decided to approach venture capitalists. The company will need to demonstrate the feasibility of their approach by building a Storefront Demo application. This will be a prototype designed to demonstrate that it will be possible, within a short time frame, to build a Web application that allows each store manager to:
  •  View the available inventory, including a rich set of information about each item (description, specifications, photo, distributor pricing, sub-components, etc.)
  •  Select the brands, distributors and specific items that will be sold in the storefront
  •  For each selected item, specify which types of information will be shown to the consumer
  •  Specify the default pricing policy for the storefront, and then optionally override the default price on an item-by-item basis
  •  Specify the colour and font scheme that will apply to the site as a whole, as well as the name of the store and its logo
  •  Display a mock-up of the resulting web site, and instantly see what will happen to the web site when they change any of the above settings.
 
Because multiple managers will be setting up web sites, the demo must show how several of them can log in and create customised storefronts simultaneously. And because the VCs (Venture Capitalists) will be eager to invest in a hot new technology like XML, it would be helpful to use XML structures both for communicating with the central database, and as a publishing medium from which HTML (HyperText Markup Language) Web pages can be efficiently generated.
 

These requirements make RivComet an ideal solution. RivComet is an ActiveX control that runs within IE (Microsoft Internet Explorer) 4 or later. (It is also available as a plug-in for Netscape Navigator 3 or later.) Its primary function is to receive XML and stylesheets, and generate HTML on the fly that can be displayed in the host browser:

RivComet at a Glance

 
 

RivComet uses XML to store data, and stylesheets to govern the presentation of the data and the interactions available to the user. One or more XML files downloaded from the server can be presented in multiple ways and combined with data stored locally or entered interactively by the user. The result is a fully-fledged application that runs on the client, with network traffic kept to a minimum. When the user is ready, these changed XML structures can then be uploaded to the server for processing.
 

Functionality

 
The Storefront Demo's opening page presents a menu divided into three main sections:
  •  Merchant Functions
  •  Shopper Functions
  •  Supplier Functions

The Main Menu

 
 
These sections correspond to the three types of people who will use the system. For the purposes of this talk, we will focus almost entirely on the Merchant functions - that is, the operations performed by the regional store managers who need to set up customised web sites.
 
Working through the menu items, the merchant performs the following actions:
  1.  Log into the system, thus identifying which store is going to be customised. (Each manager is associated with a specific store.)
  2.  Set the default pricing-markup for items sold in that store.
  3.  Specify the name of the store, its logo and the visual style that will be applied to it.
  4.  Drill into the product hierarchy available in the inventory database, and select which types of products will be sold.
  5.  For each type of product, specify which brands and distributors will be carried.
  6.  Display the available products and select the ones that should be sold.
  7.  Optionally, for each product, display a mock-up screen showing its full information and Include/Exclude categories of information on that screen.
  8.  Optionally override the product's selling price by specifying a different amount or percent markup to be applied to the wholesale price.
  9.  At any time, view a mock-up of the resulting HTML storefront that shows the result of all the above choices. This mock-up can be left open while working, in which case it changes dynamically as appropriate.
  10.  Save the changes back to an XML database on the server, so that they will be available for future working sessions.
 
All this functionality results from different presentational styles being applied to structured XML information. The inventory data and storefront configurations are stored in XML documents that are dynamically linked, manipulated and displayed in each phase of the demo according to both the user's actions and the style that is applied to them.
 
The stylesheets and formatting rules are written in a proprietary syntax developed by RivCom, as there is currently no W3C (World Wide Web Consortium) Recommendation for the application of presentation and behaviour to XML content. However, RivCom is represented on the XSL Working Group and will adopt XSL syntax as soon as it has been approved as a W3C Recommendation. (This may still not resolve the problem of how to describe behaviour in a standardised fashion, but at least it will be a useful first step.)
 

XML Data Structures

 
The bulk of this paper will examine each feature of the demo in turn, and will discuss the XML that enables these features. Before doing so, it's useful to understand the relationships between the various XML files.
 
The information contained in the demo falls into two categories: inventory information that describes every product that is potentially available; and storefront configuration information that pertains only to the store whose appearance and inventory are being customised by the current user. There are three stores available for customisation. The user toggles between them via the login process as described below.
 
The general inventory information is stored in four XML files, which are shared by all three stores. (Although the files used in the demo have the extension "rpb", they are actually well-formed XML documents.)
  •  Brands.rpb - Contains a list of the brands and brand names in the inventory
  •  Distributors.rpb - Contains a list of the wholesale distributors, including some general information about each distributor
  •  CategoryList.rpb - Contains the product category hierarchy of every product in the inventory
  •  Catalog.rpb - Contains the entire available inventory, with pointers to entries in the other three files as appropriate.
 
The information specific to each storefront is contained in three more files:
  •  StoreCatalog.rpb - Contains the merchant's customisations to the Catalogue entries, such as the pricing markup that should be applied, and attributes specifying which portions of the product description should be displayed.
  •  StoreCategoryList.rpb - Contains the merchant's customisations of the category list, in particular, attributes indicating which categories have been selected for sale in this store.
  •  StoreConfiguration.rpb - Contains the merchant's customisations that apply to his entire store, for example, the default pricing markup and the visual style that should be used.
 
For each store, there is a separate set of these three files. These all use the same file names, so they are held in subdirectories called "Store1", "Store2" and "Store3", respectively. When the user logs into a particular store, the system retrieves that store's three configuration files and loads them into memory. (In fact the system holds all nine storefront configuration files in memory at once, and toggles between them internally. But the result is essentially the same as if it retrieved them on demand, except that the user does not have to save his work when moving between stores.)
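The file layout just described can be sketched in a few lines of Python. This is only an illustration: the file and directory names come from the text above, while the `root` directory and the helper function names are assumptions introduced for the example.

```python
from pathlib import PurePosixPath

# File names are taken from the paper; the layout under `root` and the
# helper names below are assumptions made for this sketch.
SHARED_FILES = ["Brands.rpb", "Distributors.rpb", "CategoryList.rpb", "Catalog.rpb"]
PER_STORE_FILES = ["StoreCatalog.rpb", "StoreCategoryList.rpb", "StoreConfiguration.rpb"]
STORES = ["Store1", "Store2", "Store3"]


def shared_paths(root):
    """Paths of the inventory files shared by every store."""
    return [PurePosixPath(root) / name for name in SHARED_FILES]


def store_paths(root, store):
    """Paths of the three configuration files held in one store's subdirectory."""
    if store not in STORES:
        raise ValueError(f"unknown store: {store}")
    return [PurePosixPath(root) / store / name for name in PER_STORE_FILES]


def active_paths(root, store):
    """Files in play while `store` is the logged-in store."""
    return shared_paths(root) + store_paths(root, store)


# Every file the demo holds: the shared set plus one set per store.
all_files = shared_paths("demo") + [p for s in STORES for p in store_paths("demo", s)]
```

Counting the paths gives four shared files plus three for each of the three stores, with the shared four and one store's three in play after a login.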
 
Thus, there are a total of 13 files used in the demo, of which seven are active at any given time:

Files in the Storefront Demo

 
  Note:
When RivComet loads an XML file, it parses it and converts it into a tree structure in memory. The file is then accessed and manipulated using functionality analogous to that specified in the XML DOM (Document Object Model). Only when the user activates the "save" function is this internal representation converted back into an XML string that is then uploaded to the server for storage.
 

Log In and Store Selection

 
System login is performed through a simple HTML form.

The "Merchant Log-on" Window

 
 
The only permitted merchants are "Store1", "Store2" and "Store3". The system signals an error if any other value is entered. The password field is ignored.
 
While this form is hardly production quality, it does enable the system to associate a particular person with a storefront. Once the storefront has been specified (in this example, store 2), the relevant configuration files are loaded into memory, and the name of the currently active store is displayed on the main menu.
 

Set Default Pricing

 
This is another simple HTML form. The first field allows the user to set the default markup that should be applied to all products sold in the current store. The other values on this form are not used in the demo:

The "Set Pricing Defaults" Window

 
 
When the user changes the Default price markup, the value is dynamically written into an attribute of the root element of the XML document that contains configuration information for the current store:
 
<storeConfiguration defaultMarkupPercentage="20"
ID="storeConfiguration_store2">
 
From this location the system can quickly retrieve the default markup percentage for this store - a fact that will be very useful later on, as the default markup is used repeatedly in various calculations.
  Note:
The XML examples shown in this paper appear to indicate that operations are performed directly on XML character strings. In fact, as mentioned earlier, all of the XML documents are parsed at the start of the demo, and manipulated from then on as tree structures in memory. Thus, the value that is changed in the above example is actually (internally) an attribute of a node of the tree. The documents, including any changes made by the user, are serialised back into XML when the user presses the "Save" button as described below.
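The parse-once, edit-in-memory, serialise-on-save cycle can be illustrated with a small Python sketch. It uses the standard `xml.etree.ElementTree` module as a stand-in for RivComet's internal tree, which the paper does not describe in detail; the function names are assumptions, and only the root element and its two attributes come from the sample above.

```python
import xml.etree.ElementTree as ET

# A trimmed stand-in for StoreConfiguration.rpb. Only the root element and
# its attributes come from the paper; the empty body is an assumption.
SOURCE = '<storeConfiguration defaultMarkupPercentage="20" ID="storeConfiguration_store2"/>'

# Parse once, at start-up; from here on we work with the tree, not the string.
root = ET.fromstring(SOURCE)


def set_default_markup(config_root, percent):
    """Update the in-memory tree only; nothing is sent to the server."""
    config_root.set("defaultMarkupPercentage", str(percent))


def get_default_markup(config_root):
    """Read the markup straight from the root element's attribute."""
    return float(config_root.get("defaultMarkupPercentage"))


def save(config_root):
    """Serialise the tree back to an XML string, as a Save action would."""
    return ET.tostring(config_root, encoding="unicode")


set_default_markup(root, 25)
xml_text = save(root)
```

Keeping the value on the root element means a lookup is a single attribute read, with no need to search the body of the document.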
  Note:
Although the information in this system is now stored in and generated from an XML database, it was originally hand-typed, because the prototype had to be largely built before the database was ready. We therefore adopted the practice of using long, meaningful values for the ID attributes. This way we could not only be sure that the IDs and IDREFs would be unique, but also match their values by eye. This makes the XML samples in this paper considerably longer than they would be with system-generated algorithmic IDs, but also more useful to the developer.
 

Specify Name, Logo and Visual Style

 
This screen contains three sections. At the top the user can enter the name of their store. In the centre, the user selects from one of six supplied logos. At the bottom, the user selects one of three visual styles.

The "Storefront Options" Window

 
 
The user's choices are recorded in attributes within the same storefront configuration file that holds the default pricing-markup for the store. However, because these values do not need to be retrieved as frequently as the default pricing-markup, they have been placed in an element at a more conventional location within the body of the document:
 
<storefrontDesignSettings
storefrontStyle_ID="storefrontStyle_2"
shopName="My store" merchantLogo="images/logo_toystore.gif"
storestyle="style3"/>
 
The result of these settings can be seen in the Shopping Functions section of the demo, which displays a portion of the Web site that will result from the user's choices. The Shopping functions can be open in another window while the user customises their site, and any customisation changes instantly take effect in the Shopping window. Here is a snapshot of the Shopping window's Home Page, reflecting the user's choices of "The Toy Store" logo and "Crazy tie-dye purple" colour scheme. (But note that the user's logo and naming choices clash with the inventory of the store, which consists entirely of computers. Just because we give users the ability to configure their Web site doesn't mean that they will make sensible decisions.)

The "Storefront" Window

 
 
As the user changes their logo or colour scheme choices, this window is refreshed accordingly. If the window is visible while the user is making the changes, the results are shown instantly.
 
The visual changes affect not just this Shopping home page, but also all of the Shopping pages that are launched from it. When the user switches visual styles, all open windows are updated with the new style.
 

The ability to switch between pre-set designs is implemented using CSS (Cascading Style Sheets) style sheets. The system includes three CSS sheets with identical structures and class names but different visual values. Each HTML Shopping page generated by RivComet references one of these CSS style sheets via a <Link> element (per the standard CSS mechanism). At any given time, all the HTML Shopping pages reference a single CSS style sheet - the one that contains the style selected by the user. When the user selects a different style, the HTML containing the <Link> element is regenerated in order to reference the appropriate new CSS style sheet. RivComet keeps track of which windows are currently open, and refreshes all of them dynamically as the user switches between styles.
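The style-switching mechanism can be sketched in Python as follows. This is not RivComet's stylesheet code, which is proprietary. The CSS file names, the `open_windows` dictionary and the function names are all assumptions made for the example; only the `storefrontDesignSettings` element comes from the sample shown earlier.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical mapping from the storestyle attribute to CSS files. The paper
# says there are three interchangeable sheets but does not name them.
STYLE_SHEETS = {
    "style1": "css/style1.css",
    "style2": "css/style2.css",
    "style3": "css/style3.css",
}

SETTINGS = ET.fromstring(
    '<storefrontDesignSettings storefrontStyle_ID="storefrontStyle_2" '
    'shopName="My store" merchantLogo="images/logo_toystore.gif" '
    'storestyle="style3"/>'
)


def link_element(settings):
    """Build the link element that points at the currently selected sheet."""
    href = STYLE_SHEETS[settings.get("storestyle")]
    return f'<link rel="stylesheet" type="text/css" href="{escape(href, quote=True)}">'


def page_head(settings):
    """Generate the head shared by every Shopping page."""
    title = escape(settings.get("shopName"))
    return f"<head><title>{title}</title>{link_element(settings)}</head>"


def switch_style(settings, new_style, open_windows):
    """Record the new style, then regenerate the head of every open window.

    open_windows maps a window name to its current head HTML (an assumed
    stand-in for the window tracking the paper attributes to RivComet).
    """
    if new_style not in STYLE_SHEETS:
        raise KeyError(new_style)
    settings.set("storestyle", new_style)
    for name in open_windows:
        open_windows[name] = page_head(settings)
```

Because the three sheets share structure and class names, only the `href` changes; the body of each page can stay exactly as it was generated.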
 

Expand the Product Hierarchy and Select Categories

 
This function allows the merchant to navigate his store's available inventory in the form of a product hierarchy, which is displayed like an outline control with some extra features:

The "Select Product Categories" Window

 
 
This illustration shows the Computers portion of the inventory fully expanded, while the Sporting Goods at the bottom are largely collapsed. The hierarchy is in three levels: master categories, such as Computers, have buttons with a right-arrow next to them; intermediate categories, such as Desktops, have no special functionality; while the lowest-level categories, such as Systems, have a checkbox to their right. The user is meant to set the checkbox for each category of goods that he wishes to sell in his store. He should also click on the appropriate right-arrow button to specify the Brands and Distributors that he plans to carry. (This will be explained in the next section.)
 
The category hierarchy is stored in a set of mirrored XML structures that parallel the hierarchical information shown on screen. There is a master Category List which is shared by all stores, plus a version of it for each store that has an additional "selected" attribute attached to many of the elements. The master category list contains the text strings that are displayed on screen ("Computers", "Tennis", etc.); the per-store category lists contain the "selected" attributes that correspond to the user's check marks. Information from the two documents is merged by the controlling stylesheet at runtime using ID/IDREF matching.
 

The style sheet also contains routines that create the outline behaviour by selectively expanding and collapsing portions of the information, based on the expand/collapse buttons the user has clicked. In effect, the style sheet binding rules associate presentational styles not only with the inherent nature of the XML information (its "context", in SGML (Standard Generalized Markup Language) terms), but also with the actions that have been taken by the user during this session.
 
Here is a small portion of the XML that generated the list shown above. First, the master category list:
 
<storefrontCategoryList>
<masterCategory ID="cat_computers" name="Computers">
<category ID="cat_computers_desktops" name="Desktops">
<category ID="cat_computers_desktops_systems" name="Systems"/>
<category ID="cat_computers_desktops_accessories" name="Accessories"/>
</category>
...
</masterCategory>
...
</storefrontCategoryList>
 
Next, the matching portion of the individual store's category list, showing the "selected" attribute:
 
<storeCategoryList ID="storeCategoryList_Store1">
<storeMasterCategory ID="cat_computers" selected="yes">
<storeCategory ID="cat_computers_desktops" selected="yes" >
<storeCategory ID="cat_computers_desktops_systems" selected="yes"/>
<storeCategory ID="cat_computers_desktops_accessories" selected="no"/>
</storeCategory>
...
</storeMasterCategory>
...
</storeCategoryList>
 
At runtime, HTML is generated by combining the Name values from the master document with the Selected values from the secondary document to create a third, logical document that is displayed on screen.
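A minimal Python sketch of this merge, assuming the two documents are matched purely on their ID values. The `merge` function, the `mergedCategory` element names and the choice to treat a missing "selected" attribute as "no" are assumptions for the example; the paper's own merge is performed by stylesheet rules in RivCom's proprietary syntax.

```python
import xml.etree.ElementTree as ET

MASTER = ET.fromstring("""
<storefrontCategoryList>
  <masterCategory ID="cat_computers" name="Computers">
    <category ID="cat_computers_desktops" name="Desktops">
      <category ID="cat_computers_desktops_systems" name="Systems"/>
      <category ID="cat_computers_desktops_accessories" name="Accessories"/>
    </category>
  </masterCategory>
</storefrontCategoryList>
""")

STORE = ET.fromstring("""
<storeCategoryList ID="storeCategoryList_Store1">
  <storeMasterCategory ID="cat_computers" selected="yes">
    <storeCategory ID="cat_computers_desktops" selected="yes">
      <storeCategory ID="cat_computers_desktops_systems" selected="yes"/>
      <storeCategory ID="cat_computers_desktops_accessories" selected="no"/>
    </storeCategory>
  </storeMasterCategory>
</storeCategoryList>
""")


def merge(master, store):
    """Build a third, logical tree: master names plus per-store 'selected'."""
    # Index the per-store document by ID so each lookup is a dictionary hit.
    selected = {
        el.get("ID"): el.get("selected", "no")
        for el in store.iter()
        if el.get("ID")
    }

    def copy(node):
        merged = ET.Element("mergedCategory", {
            "ID": node.get("ID"),
            "name": node.get("name"),
            # Assumption: a category the store file does not mention is unselected.
            "selected": selected.get(node.get("ID"), "no"),
        })
        merged.extend([copy(child) for child in node])
        return merged

    root = ET.Element("mergedCategoryList")
    root.extend([copy(child) for child in master])
    return root


merged = merge(MASTER, STORE)
```

Neither source document is modified; the merged tree exists only for display, which matches the idea of a "logical document".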
 
Also at runtime, the Storefront home page (in the Shopping section of the demo) is regenerated to reflect the product hierarchy that the user has selected. For example, here is the Storefront home page that results when the user has (for some odd reason) set the "selected" check-boxes next to both Computers and Tennis Racquets:

The "Storefront" Window

 
 
The Storefront window is generated from the same XML structures that are used to generate the Select Product Categories window discussed above. However, where the style rule for Select Product Categories is to display all levels of the available hierarchy and use a check-box to indicate which ones have been selected for sale, the style rule for the Storefront is to display only those branches of the hierarchy where Selected is "yes". Thus, Computers and Sporting Goods are displayed on the Storefront, but not the third master category, Toys.
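The idea of two style rules over one set of data can be sketched in Python. The merged tree below is illustrative sample data, and `outline`, `merchant_rule` and `storefront_rule` are hypothetical names; the sketch also simplifies the merchant view by showing a check mark at every level, whereas the real window shows checkboxes only on the lowest-level categories.

```python
import xml.etree.ElementTree as ET

# A merged category tree of the kind described above. The element layout
# and the sample categories are assumptions made for this sketch.
MERGED = ET.fromstring("""
<categories>
  <category name="Computers" selected="yes">
    <category name="Desktops" selected="yes">
      <category name="Systems" selected="yes"/>
      <category name="Accessories" selected="no"/>
    </category>
  </category>
  <category name="Sporting Goods" selected="yes">
    <category name="Tennis" selected="yes"/>
  </category>
  <category name="Toys" selected="no"/>
</categories>
""")


def outline(node, rule, depth=0):
    """Walk the tree, letting `rule` decide how (and whether) to show each node."""
    lines = []
    for child in node:
        line = rule(child)
        if line is None:
            continue  # the rule suppresses this node and its whole branch
        lines.append("  " * depth + line)
        lines.extend(outline(child, rule, depth + 1))
    return lines


def merchant_rule(cat):
    """Select Product Categories view: show everything, with a check mark."""
    mark = "[x]" if cat.get("selected") == "yes" else "[ ]"
    return f"{mark} {cat.get('name')}"


def storefront_rule(cat):
    """Storefront view: show only branches whose selected value is 'yes'."""
    return cat.get("name") if cat.get("selected") == "yes" else None


merchant_view = outline(MERGED, merchant_rule)
storefront_view = outline(MERGED, storefront_rule)
```

The data and the tree walk are identical in both cases; only the rule passed in differs, which is the point the two windows are meant to demonstrate.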
 
Again, if the Storefront window is open, it is dynamically regenerated whenever the user changes the values in the category hierarchy window.
 

Specify Brands and Distributors

 
When the user clicks on the right-arrow button next to a master category, a dialog appears containing all of the brands and distributors associated with products in this master category. The user has the option to click on those brands and distributors whose products he wants to carry in his store.
 
For example, here is the window which appears when the user clicks on the arrow to the right of the Computers master category:

The "Brands/Distributors" Window

 
 
The values entered in these checkboxes are used by the system when generating the Shopping web site. In the screen in which the shopper selects products for purchase, products associated with brands or distributors not checked in this dialog have their "buy me" button disabled. (In a production system, such items would probably not be displayed at all.)
 
The values entered by the user are stored in the same per-store configuration file that contained the "selected" attributes shown in the previous section. Here is the relevant portion of that file. The values in question are contained within the <storeDistributorSet> and <storeBrandSet> elements:
 
<storeCategoryList ID="storeCategoryList_Store1">
<storeMasterCategory selected="yes" ID="cat_computers">
<storeDistributorSet ID="cat_computers_distributors">
<storeDistributorPointer selected="yes"
	 distributor_ID="dist_computersUnlimited"/>
<storeDistributorPointer selected="yes"
	 distributor_ID="dist_digitaliaInc"/>
</storeDistributorSet>
<storeBrandSet ID="cat_computers_brands">
<storeBrandPointer selected="yes" brand_ID="brand_apple"/>
<storeBrandPointer selected="yes" brand_ID="brand_ibm"/>
<storeBrandPointer selected="yes" brand_ID="brand_hewlettPackard"/>
<storeBrandPointer selected="no" brand_ID="brand_dell"/>
<storeBrandPointer selected="no" brand_ID="brand_hayes"/>
...
</storeBrandSet>
<storeCategory selected="yes" ID="cat_computers_desktops">
...
</storeCategory>
...
</storeMasterCategory>
...
</storeCategoryList>
 
Following the same pattern established above, the user's information is stored in elements that contain a "selected" attribute and a pointer to master information contained in another file. In this case, there are two relevant master files: a list of distributors and a list of brands.
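As a rough Python illustration of how these pointers could drive the "buy me" button, the sketch below collects the selected brand and distributor IDs and enables the button only when both are checked. The function names are assumptions, and for brevity the sketch gathers pointers from the whole file rather than scoping them to one master category.

```python
import xml.etree.ElementTree as ET

# An abbreviated copy of the per-store sample shown above.
STORE_CATEGORIES = ET.fromstring("""
<storeCategoryList ID="storeCategoryList_Store1">
  <storeMasterCategory selected="yes" ID="cat_computers">
    <storeDistributorSet ID="cat_computers_distributors">
      <storeDistributorPointer selected="yes" distributor_ID="dist_computersUnlimited"/>
      <storeDistributorPointer selected="yes" distributor_ID="dist_digitaliaInc"/>
    </storeDistributorSet>
    <storeBrandSet ID="cat_computers_brands">
      <storeBrandPointer selected="yes" brand_ID="brand_apple"/>
      <storeBrandPointer selected="no" brand_ID="brand_dell"/>
    </storeBrandSet>
  </storeMasterCategory>
</storeCategoryList>
""")


def selected_ids(root, pointer_tag, id_attr):
    """Collect the IDs of every pointer whose 'selected' attribute is 'yes'."""
    return {
        pointer.get(id_attr)
        for pointer in root.iter(pointer_tag)
        if pointer.get("selected") == "yes"
    }


def buy_button_enabled(brand_id, distributor_id, store_root):
    """Enable the button only if both the brand and the distributor are checked."""
    brands = selected_ids(store_root, "storeBrandPointer", "brand_ID")
    distributors = selected_ids(store_root, "storeDistributorPointer", "distributor_ID")
    return brand_id in brands and distributor_id in distributors
```

A production system, as noted earlier, would more likely omit such products altogether than show them with a disabled button.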
 

Data Modelling Optimisations

 
In theory, we could have modelled the information so that the per-store configuration file only contained pointers to those brands and distributors that the user had actually selected. This would be the usual choice when deciding how to model information for storage, because it allows better long-term maintenance of the data. Had we followed such a "normalised" model, when another distributor became available, that distributor would only have needed to be added to one XML file, the master list of distributors, in order to be available for selection.
 
In practice, we decided to violate traditional storage modelling principles and optimise our model instead to facilitate data manipulation and display. We did this by redundantly including an element for each available distributor and brand in both the master file and each of the per-store selection files. This allowed the stylesheet routines that build the HTML to "drive" the process by walking through the per-store file, pulling in the names of both distributors and brands from the master files as necessary. The result is better performance and easier style-sheet building for the demo (which was our goal), but a system whose data representation is not optimal for storage of the information. In the case of this demo, we actually chose to store our data in this non-normalised fashion, because we did not want to deal with having to transform it when transferring it between the database and the Storefront. In a production system, however, it would be necessary to resolve this issue more robustly, probably by building more sophisticated transformations into the routines that export and import the data.
 
This is one instance of a similar choice that we repeated again and again throughout the system: we included redundant information in multiple files in order to facilitate development of the style sheets, at the expense of long-term data maintainability and/or ease of XML generation on the server side.
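The maintenance cost of this redundancy can be made concrete with a small Python sketch. The shape of the master distributor file, the distributor names and both function names are hypothetical, since the paper does not show Distributors.rpb. The first function is the kind of import-time transformation a production system might need; the second mimics a stylesheet "driving" from the per-store file.

```python
import xml.etree.ElementTree as ET

# Hypothetical shape for the master distributor list; not shown in the paper.
MASTER = ET.fromstring("""
<distributorList>
  <distributor ID="dist_computersUnlimited" name="Computers Unlimited"/>
  <distributor ID="dist_digitaliaInc" name="Digitalia Inc"/>
  <distributor ID="dist_newcomer" name="Newcomer Ltd"/>
</distributorList>
""")

STORE_SET = ET.fromstring("""
<storeDistributorSet ID="cat_computers_distributors">
  <storeDistributorPointer selected="yes" distributor_ID="dist_computersUnlimited"/>
  <storeDistributorPointer selected="yes" distributor_ID="dist_digitaliaInc"/>
</storeDistributorSet>
""")


def add_missing_pointers(master, store_set):
    """Append an unselected pointer for each master distributor the store lacks.

    In a normalised model this step would be unnecessary; here it must be
    repeated for every per-store file whenever the master list grows.
    """
    known = {pointer.get("distributor_ID") for pointer in store_set}
    added = []
    for dist in master.iter("distributor"):
        if dist.get("ID") not in known:
            ET.SubElement(store_set, "storeDistributorPointer",
                          {"selected": "no", "distributor_ID": dist.get("ID")})
            added.append(dist.get("ID"))
    return added


def display_rows(master, store_set):
    """Walk the per-store file and pull each name in from the master file."""
    names = {d.get("ID"): d.get("name") for d in master.iter("distributor")}
    return [(names[p.get("distributor_ID")], p.get("selected")) for p in store_set]
```

Once the pointers are in place, the display routine never has to ask which distributors exist; it simply iterates over the per-store file, which is the convenience the redundant model buys.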
 

Select Products

 
When the user clicks on a leaf-node category name within the list of categories, the system displays a window containing a list of products in the category that are available for sale. Here's the list of Tennis Racquets that are available for sale:

The "Select Products" Window

 
 
Within this list, the user can select the "Sell?" checkbox to indicate that he wishes to sell a particular product within his store. Note that each product occurs once for each distributor that carries it; in this example, the first tennis racquet is carried by two distributors while the second is carried by just one.
 
Other features in this screen include the ability to:
  •  Click on a brand/product name to see detailed information about that product in a pop-up window
  •  Click on a distributor name to see general information about that distributor
  •  Click on a distributor's "special" offer to see the terms of that offer
  •  Click on a retail price to see how the price was calculated for that product, and optionally to change the price calculation method
  •  Click on the "Show with Pictures" link at the top of the screen to toggle the visual display of the entire screen.
 
The information in this screen is assembled from a variety of XML documents, using the same techniques discussed earlier. For example, the Retail Price is calculated based on the warehouse price contained in the master catalogue, combined with the Default Markup entered on an earlier screen, plus price calculation information contained in the per-store catalogue. (This will be discussed in more detail below.) Names of brands and distributors are likewise pulled in from their master files.
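The price calculation can be sketched in Python under stated assumptions. The `retail_price` function and the shape of its `override` argument are hypothetical, as the paper has not yet shown how the per-store catalogue records an override; in particular, the sketch assumes that an "amount" override is added to the warehouse price rather than replacing it.

```python
from decimal import Decimal, ROUND_HALF_UP


def retail_price(warehouse_price, default_markup_percent, override=None):
    """Compute a retail price from the warehouse price.

    override is None (use the store default), ("percent", value) or
    ("amount", value). These shapes are assumptions for this sketch.
    """
    base = Decimal(str(warehouse_price))
    if override is None:
        # No per-product override: apply the store-wide default markup.
        price = base * (1 + Decimal(str(default_markup_percent)) / 100)
    else:
        kind, value = override
        value = Decimal(str(value))
        if kind == "percent":
            price = base * (1 + value / 100)
        elif kind == "amount":
            # Assumption: a fixed amount is added to the warehouse price.
            price = base + value
        else:
            raise ValueError(f"unknown override kind: {kind}")
    # Round to whole cents for display.
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For instance, a warehouse price of 100.00 with the default markup of 20 percent yields 120.00, while a per-product override of ("percent", 35) yields 135.00.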
 
Two style sheets are available for this screen. The user can dynamically toggle between them by clicking on the "Show with Pictures" link at the top. Here is what happens if the above window is shown "with pictures".

The "Show Pictures" Mode

 
 
In the second illustration I have manually re-sized the window in order to display more information. Other than that, all we've done is assign a different style to the same assemblage of information and redisplay the window. This style sheet uses a different table format and displays more information about each product. The purpose here is to demonstrate to the target Venture Capitalist that with XML plus style sheets, we can perform useful transformations of the information entirely in the browser, without having to call back to the server to regenerate the page.
  Note:
This feature is similar to an XSL technology demonstration that is available on the Microsoft XML Web site. We've repeated that demonstration within the Storefront Demo because a) it's impressive and b) it's not likely that our target Venture Capitalist has visited the Microsoft site.
 

Customise Display of Product Information

 
When the user clicks on a product in the "Select products" window, another window appears containing detailed information about the product:

The "Product Details" Window

 
 
This screen repeats many of the features that were available in the Select Products window, and adds a series of Include and Exclude buttons (shown with purple backgrounds in the screen shot). These buttons allow the user to customise how this particular product will be displayed. For example, here's how the screen changes if I press the "Exclude" button next to the Photo and Features information in the left-hand column, and the "Include" button next to the Consumer Options in the right-hand column:

"Product Details" After Customisation

 
 

The technique here is similar to the one that was used to generate the category hierarchy outliner format on an earlier screen. When the user clicks on an Include or Exclude button, the value of an attribute associated with that button is changed, and the screen is re-displayed. Since the style rules governing display of this screen take into account the values of the Include/Exclude attributes, the information shown on the screen changes dynamically.
 
As with the other screens, the information displayed here is combined from at least two sources: an entry in the shared XML Catalogue of all available products, plus customisation information entered in the per-store entry for this same product. Here is the relevant portion of the customisation information for the Wilson Tennis Racquet shown above. Note the attributes whose names begin with "show" (showPhoto, showFeatures and so on), which correspond to the Include/Exclude buttons on the screen:
 
<storeProduct ID="product_wilsonOversizeRacquet"
showSpecifications="yes" showFeatures="no" showPhoto="no"
showWarranty="yes" showAdditionalDescription="yes"
showShippingWeight="yes" showMonitorDetails="no"
showDimensions="yes" showProductDescription="yes"
showPrinterDetails="no" showModemDetails="no"
showHardDiskDetails="no">
<storeProductDistributor
ID="prodDist_product_wilsonOversizeRacquet_dist_abcLimited"
selected="no">...</storeProductDistributor>
<storeProductDistributor
ID="prodDist_product_wilsonOversizeRacquet_dist_theSportsCenter"
selected="yes">...</storeProductDistributor>
<additionalDescription label="Our Comments"
text="This is the greatest racquet we've ever seen."/>
<Comment userOrReviewer="user"
text="I tried using this racquet but I always lost.
Avoid it at all costs."/>
</storeProduct>
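
To make the role of these attributes concrete, here is a minimal Python sketch of the same idea. It is not RivComet's style language, which is not reproduced in this paper; the helper names visible_sections and toggle are hypothetical, and the fragment is trimmed to a few of the attributes shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the per-store fragment shown above (only a few attributes kept).
STORE_PRODUCT = """
<storeProduct ID="product_wilsonOversizeRacquet"
  showSpecifications="yes" showFeatures="no" showPhoto="no"
  showWarranty="yes"/>
"""

def visible_sections(store_product):
    """Return the section names whose show* attribute is "yes" (hypothetical helper)."""
    return sorted(
        name[len("show"):]
        for name, value in store_product.attrib.items()
        if name.startswith("show") and value == "yes"
    )

def toggle(store_product, section):
    """Flip one show* attribute, as an Include/Exclude button would (hypothetical helper)."""
    key = "show" + section
    store_product.set(key, "no" if store_product.get(key) == "yes" else "yes")

product = ET.fromstring(STORE_PRODUCT)
print(visible_sections(product))  # ['Specifications', 'Warranty']
toggle(product, "Photo")          # the merchant presses "Include" next to the photo
print(visible_sections(product))  # ['Photo', 'Specifications', 'Warranty']
```

Because the display is derived from the attribute values each time, changing one attribute and re-rendering is enough to update the screen.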
 
Based on the customisation selections made here, the system also generates a Shopping screen for the same product. In the Shopper's Product Details screen, Excluded items disappear completely, with no placeholders showing where they would have been:

The Shopper's "Product Details" Window

 
 
As usual, if the user has the Shopper's Product Details window open at the same time that they are customising the display of that product's details, the Shopping window is dynamically redisplayed as they make each change. Thus, they can instantly see what the result of their changes will be.
 

Override the Selling Price

 
Several of the merchant screens display both the distributor's Wholesale price and the store's Retail price. As I noted earlier, the selling price is actually calculated by the stylesheet, based on the wholesale price from the inventory database, plus the default pricing markup entered by the user, plus any price override that they may have recorded for a particular product.
 
In order to change the price calculations for a product, the user clicks on that product's Retail Price in any of the Merchant screens that display that price. Here's the window that appears if I click on the $89.70 selling price for the Wilson Tennis Racquet as supplied by ABC Distributors in the examples shown above:

The "Edit Price Markup" Window

 
 
In this window the merchant can see at a glance how the price was calculated, and optionally change that calculation. The values within this window change dynamically as the user clicks on the three possible pricing mechanisms (percent markup, amount markup, or default percentage markup). Here's what this same screen looks like if I click on the "Default" radio button, thus choosing to price this product according to the default pricing percentage, which was entered on an earlier screen:

"Edit Price Markup" After Changes

 
 
As you can see, the markup has changed to 20%, and the selling price was recalculated. When the user presses Close, the displayed values will be transferred to the product's XML element, and all screens that display pricing information for this product will be refreshed to show the current price.
 
The data underlying this feature consists of three components: the warehouse price, the default store markup, and the per-store catalogue customisation file. We saw in one of our earliest examples that the default pricing-markup for the store is kept in an attribute on the root element of the per-store configuration file:
 
<storeConfiguration
defaultMarkupPercentage="20" ID="storeConfiguration_store2">
 
The distributor's price for each product is stored in the shared inventory catalogue file. Here is the relevant portion of the Wilson Tennis Racquet entry, somewhat simplified, with the pricing information in the <price> element near the bottom:
 
<product ID="product_wilsonOversizeRacquet"
componentType="product" MSRP="100.00">
<productDistributor ID="xxx"
distributor_ID="dist_abcLimited" productAvailability="In stock">
<price basedOnQuantity="1" warehousePrice="78.00"/>
...
</productDistributor>
...
</product>
 
This XML fragment shows that within the Wilson racquet's product element, there is at least one <productDistributor> element, which indicates that the racquet is available from ABC Limited at a warehouse price of $78.00. If the product were available from other distributors, they would be represented by additional <productDistributor> elements in this same section.
 
The merchant's selling price calculation is recorded in the per-store Catalogue file. We've seen this file before, because it contains the Include/Exclude settings that govern display of information about each product. For any given product, this file also contains a <storeProductDistributor> element that holds the price calculation settings for that product when sold by that distributor in the current store. Here is what the settings for the Wilson Tennis Racquet would have looked like in the first example above. Note that the markup Type is "percent" and the markup amount is "15":
 
<storeProductDistributor
ID="prodDist_product_wilsonOversizeRacquet_dist_abcLimited"
selected="yes">
<spdprice basedOnQuantity="1" markup="15"
markupType="percent" retailPrice="xx"/>
</storeProductDistributor>
 
And here is how the values would appear after the user switched to the default pricing-markup, as shown in the second example:
 
<spdprice basedOnQuantity="1"
markup="15" markupType="default" retailPrice="xx"/>
 
The price calculations are performed by routines in the style sheet that act on the user's entries. Although we included a "retailPrice" attribute in the data files, we decided to ignore it in the actual implementation, so it contains only a dummy "xx" value. However, the style sheet could easily have recorded the calculated selling price in that attribute.
 
Similarly, any values not needed by the current calculation method are ignored. Thus, when the markup type is "default", the locally-entered markup value (in this case, 15) is ignored in favour of the default markup (20) that is retrieved from the master storefront configuration file.
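
The calculation itself can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the routine in our style sheet: the function name retail_price is hypothetical, "amount" is an assumed markupType value for the amount-markup option (only "percent" and "default" appear in the fragments above), and rounding to two decimal places is also an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

def retail_price(warehouse_price, markup_type, markup, default_markup_percentage):
    """Hypothetical helper mirroring the three pricing mechanisms described above."""
    cost = Decimal(warehouse_price)
    if markup_type == "percent":
        price = cost * (1 + Decimal(markup) / 100)
    elif markup_type == "amount":  # assumed attribute value, not shown in the paper
        price = cost + Decimal(markup)
    elif markup_type == "default":
        # The locally entered markup is ignored in favour of the store default.
        price = cost * (1 + Decimal(default_markup_percentage) / 100)
    else:
        raise ValueError(f"unknown markupType: {markup_type!r}")
    return price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Values taken from the fragments above: warehousePrice="78.00", markup="15",
# defaultMarkupPercentage="20".
print(retail_price("78.00", "percent", "15", "20"))  # 89.70
print(retail_price("78.00", "default", "15", "20"))  # 93.60
```

The first result matches the $89.70 selling price shown in the Edit Price Markup window. The second is simply what this sketch computes for a 20% markup.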
 

View Storefront Mockup

 
As I have noted several times above, all of the actions taken by the Merchant that affect either the content or the presentation of the shopping experience are echoed in the screens of the Shopper section of the demo. If any of the Shopper screens are open when their configuration is changed, they are dynamically updated. This is one of the most powerful capabilities of the demo, and it is only feasible because we process the XML and generate the resulting HTML entirely in the browser, rather than having to call back to the server for each new HTML page.
 

Save Changes

 
The demo as described above is basically a stand-alone process, which runs well even when disconnected from the Internet. However, a critical component of its real-world utility will be our ability to upload the user's changes back to a database running on the server, and conversely, to reload files from the server when information upstream has changed.
 
For the purposes of the demo, we've implemented these two capabilities via the "Save changes" button on the demo home page, and the built-in Refresh capabilities of the browser. We also supply two versions of the style sheet, one that loads the files from a remote source, and another that loads from the directory in which it was started. This allows us to run the demo on-line when an Internet connection is available, and off-line for demonstrations such as conference presentations.
 
The "Save changes" feature was implemented via our equivalent of rubber bands and chewing gum: that is, technology just powerful enough to prove that this feature can be accomplished, without incurring the development costs of true two-way connectivity. There are two aspects to our implementation:
  •  Posting the changed files to the server, and
  •  Informing the database which files have changed so that it can re-import them into its system.
 

When the user clicks the "Save changes" button, we first write out all changed files to disk, replacing the files that we originally read when loading the system. This is accomplished by maintaining an internal "changed-but-not-flushed" flag for each file that we loaded into memory. It is then a relatively simple matter to query the status of each XML file in memory, and if it has changed since the last time it was flushed to disk, serialise it back into an XML string and write it back to disk. When working remotely we use standard FTP (File Transfer Protocol) services to perform the upload, having first set up appropriate access rights to the server. Each file is written back to exactly where it came from, overwriting the older version of that file.
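
The flag-and-flush logic might look roughly like the following Python sketch. It covers only the local-disk case (the FTP upload is omitted); the class LoadedFile and the function save_changes are hypothetical names, and the file names are placeholders.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

class LoadedFile:
    """One XML file held in memory with a changed-but-not-flushed flag (hypothetical class)."""

    def __init__(self, path):
        self.path = path
        self.tree = ET.parse(path)
        self.dirty = False

    def set_attribute(self, name, value):
        """Change an attribute on the root element and mark the file as changed."""
        self.tree.getroot().set(name, value)
        self.dirty = True

    def flush(self):
        """Serialise the tree back to where it came from, but only if it has changed."""
        if not self.dirty:
            return False
        self.tree.write(self.path, encoding="utf-8", xml_declaration=True)
        self.dirty = False
        return True

def save_changes(files):
    """Flush every changed file and return the paths that were written."""
    return [f.path for f in files if f.flush()]

# Usage with two throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for name in ("store.xml", "catalogue.xml"):
        path = os.path.join(folder, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write('<storeConfiguration defaultMarkupPercentage="20"/>')
        paths.append(path)
    files = [LoadedFile(p) for p in paths]
    files[0].set_attribute("defaultMarkupPercentage", "25")
    written = save_changes(files)
    print([os.path.basename(p) for p in written])  # ['store.xml']
```

The list of written paths is exactly what the second step needs in order to tell the server which files have changed.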
 

The second step is to inform the database which files have changed so that it can re-import them and update the data store accordingly. We do this with an HTTP (HyperText Transfer Protocol) "Get" message to a dummy URL (Uniform Resource Locator) on the server. An ASP (Active Server Pages) script associated with that URL parses our URL string, which largely consists of name-value pairs indicating which files have just been written to disk. The ASP page passes the file names to the XML database application, which retrieves the changed files and loads them back into the database. The ASP script then returns a "page" to us that is actually just the string "success" or "failure". This is written to a virtual XML "document" in memory (that is, a tree structure for which there is no equivalent serialised XML file). We then apply a style to this virtual document in order to display a confirmation message to the user.
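
As a rough sketch of this notification step, the Python below builds such a URL, shows how a server-side script could recover the file names, and wraps the returned string in an in-memory element. The parameter names (file1, file2, ...), the example.com address, the file names and the saveResult element are all assumptions made for illustration; the paper does not specify the actual URL scheme, and no network request is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit

def build_notification_url(base_url, changed_files):
    """Encode the changed file names as name-value pairs on a GET URL (hypothetical scheme)."""
    pairs = [(f"file{i}", name) for i, name in enumerate(changed_files, start=1)]
    return base_url + "?" + urlencode(pairs)

def parse_notification_url(url):
    """What a server-side script might do: recover the file names from the query string."""
    return [value for _, value in parse_qsl(urlsplit(url).query)]

def response_to_document(body):
    """Wrap the returned "success"/"failure" string in a virtual XML document (hypothetical shape)."""
    root = ET.Element("saveResult")
    root.set("status", body.strip())
    return root

url = build_notification_url(
    "http://example.com/notify.asp",  # placeholder address
    ["storeConfiguration.xml", "storeCatalogue.xml"],
)
print(url)
# http://example.com/notify.asp?file1=storeConfiguration.xml&file2=storeCatalogue.xml
print(parse_notification_url(url))
# ['storeConfiguration.xml', 'storeCatalogue.xml']
print(response_to_document("success").get("status"))  # success
```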
 
Obviously a production version will require a more robust mechanism, especially taking into account the needs of multiple users and the fact that data on the server might change while the user is working on it. It would also be useful to implement multiple automatic or semiautomatic "save" points, rather than trusting the user to save manually before exiting.
 
Having said that, this "cheap and cheerful" save implementation has been quite successful at meeting the needs of our client. When changes are made to data in the XML database, the database generates a new set of XML files, which are loaded into the browser when the demonstrator presses the "Refresh" button. (Thus we show the effect of a server push, without demonstrating the notification message that would actually be required.) Conversely, changes made by someone running the demo can be saved back to the server via the "Save" feature, where they are stored persistently within the database and then retrieved the next time the user logs in and runs the demo.
 

Runtime Technical Architecture

 
Except for HTML fragments embedded in the style sheet, the text files used in this application contain no HTML. All of the HTML required for rendering the pages is generated on the fly in the browser by the RivComet ActiveX control. Here is an outline of how RivComet generates the required HTML pages:
  1.  The initial HTML page contains RivComet, a small piece of JavaScript, and some HTML consisting of a single empty <DIV> element.
  2.  IE builds a DOM of the HTML, which is virtually empty. However, this is an important step because RivComet will make extensive use of the DOM.
  3.  The JavaScript tells RivComet the name of the first file that should be retrieved and displayed.
  4.  RivComet uses Windows services (the WinInet API (Application Programming Interface)) to retrieve the specified file.
  5.  The file contains some XML content, a set of stylesheets that associate formatting rules with the XML elements in that content, and the formatting rules themselves, using RivCom's own declarative style language.
  6.  RivComet parses the XML document and attached stylesheets and formatting rules, applying the rules to the XML in order to generate HTML that it writes dynamically into what was the empty <DIV> element in the DOM.
  7.  IE then re-renders the page with the now fully populated DOM, and a full HTML page appears on the screen.
  8.  The resulting page may contain buttons or hotspots which, when clicked, cause RivComet to launch additional windows. Here again, RivComet generates new HTML content dynamically, and inserts it into the DOM of the new window. As in the main window, this HTML content is determined by the application of style rules to the structured XML content in the initial file (or in combination with other files).
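
The core of the steps above (apply rules to XML, then write the resulting HTML into an empty container) can be sketched in Python as follows. This is only an analogy: RivComet is an ActiveX control with its own declarative style language, whereas here the "rules" are ordinary functions in a hypothetical table called STYLE_RULES, and the page is a plain string rather than a browser DOM.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified content, based on the product fragment shown earlier.
CONTENT = """
<product ID="product_wilsonOversizeRacquet">
  <productDistributor distributor_ID="dist_abcLimited">
    <price basedOnQuantity="1" warehousePrice="78.00"/>
  </productDistributor>
</product>
"""

# The initial page: an HTML shell with a single empty DIV.
PAGE = '<html><body><div id="content"></div></body></html>'

def style_product(element, inner):
    """Rule for <product>: a titled block wrapping whatever its children produced."""
    return f'<div class="product"><h2>{escape(element.get("ID", ""))}</h2>{inner}</div>'

def style_price(element, inner):
    """Rule for <price>: show the warehouse price as a paragraph."""
    return f'<p>Warehouse price: {escape(element.get("warehousePrice", ""))}</p>'

# Hypothetical rule table: element name -> function producing HTML.
STYLE_RULES = {"product": style_product, "price": style_price}

def render(element):
    """Apply the matching rule; elements without a rule pass their children's output through."""
    inner = "".join(render(child) for child in element)
    rule = STYLE_RULES.get(element.tag)
    return rule(element, inner) if rule else inner

def populate(page, html):
    """Write the generated HTML into what was the empty DIV."""
    return page.replace('<div id="content"></div>', f'<div id="content">{html}</div>')

print(populate(PAGE, render(ET.fromstring(CONTENT))))
```

Swapping in a different rule table while keeping the same content is the analogue of toggling between the two style sheets in the Select Products window.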
 
Not mentioned above, but quite apparent in the Storefront demo, is that the XML in question may be not one but a set of related XML documents. These can be retrieved at any point during the process based on the combination of XML content and the style rules that apply to that content. (Thus, a style rule can fire the equivalent of an XLink/XPointer combination, causing RivComet to request additional content from the server.)
 

The above architecture is extremely flexible and powerful. Of course, it uses a non-standard style language and an ActiveX control to achieve the desired results. But I believe (or at least hope) that within a few years XSL will support this level of interactivity, and the major industry-standard browsers will support interactive XSL style sheets. At that point we will be able to deliver the functionality embodied in the Storefront via a standardised syntax and non-proprietary tools. Until then, at least the data structures are in non-proprietary XML.
 

Issues and Conclusions

 

Why Did We Use This Approach?

 
This project came on the heels of several more "traditional" publishing projects in which RivCom had used XML and style sheets to publish on-line versions of various corporate documents and business and engineering models. Therefore, at the time that the client approached us about creating the Storefront, the RivComet technology was hot off the presses and fresh in everyone's mind. This "mind share" phenomenon must not be overlooked when evaluating the causes of any technology adoption (including, most obviously, XML itself).
 
The client's requirements were that the demo should:
  •  be data-driven
  •  feature an easy mechanism to modify the underlying XML structures
  •  interface with an XML database
  •  allow the user to see the results of his work on the fly
  •  mimic many of the features of a traditional HTML e-commerce web site.
 
After a short analysis, the client decided to try the document-based approach because:
  •  The underlying technology had already been used in our other projects, so the incremental software development effort would be quite modest (though this turned out not to be as true as we had anticipated)
  •  RivCom had extensive experience applying style to XML structures
  •  An alternative approach, such as developing a Java application, could not have been implemented by RivCom within the time and budget available. This would have required the client to find another vendor, and no obvious candidates existed.
 

Would Traditional Software Development Have Succeeded?

 
Given the available time and the client's budget, I don't think that traditional software development could have delivered anywhere near this much functionality. However, if another company had been able to offer an existing Java toolkit for applying styling and behaviour to XML structures, then it would have been a more interesting question.
 

Was the Project a Success?

 
Very much so. As of this writing, the client has just received their second round of funding and we are discussing how RivCom can help them build the production version of this technology.
 

What Were the Major "Compromises"?

 
Inherent to this project was a need to make substantial compromises in robustness, scalability and maintainability in order to deliver the demo on time and on budget. In fact the project ran late, because even with these compromises it raised considerable technical issues that had to be solved in order to deliver the required functionality.
 
Many of the compromises have already been mentioned above. From an overview perspective, the most important were:
  •  A simplified product category hierarchy, such that each product occurs in just one leaf node of the tree. This is not the case in the real world, where products can be categorised in multiple ways. The product hierarchy should reflect all those possible real-world categorisations.
  •  Redundant information in the various XML files. In particular, we ensured that an element for each product appears in each store's catalogue file even if it is not carried by that store. This greatly improved performance when assembling information from multiple files into a composite display, because we could "drive" the assembly process from either XML file, and we could assume that for each product in either XML file we would find a matching entry in the other one.
    (In fact the main "compromise" here was not the design of the XML used in the demo, but rather that we chose to store the XML in the database using this schema that had been optimised for display purposes. In a production system we are likely to use two different schemas, one optimised for storage and one for display, and implement appropriate transformation routines to move the data between them.)
  •  A pragmatic modelling style that is highly influenced by RivComet's strengths and weaknesses. In particular, we made extensive use of attributes rather than elements to hold the information. Also, when we knew that a given type of information would be retrieved frequently, we placed it in an element that had an easily referenced ID and/or was near the top of its document hierarchy, in order to minimise the amount of tree-walking that would be required.
  •  Dedicated attributes for the kinds of information actually contained in the demo data set, without consideration of how this approach would scale when other kinds of data were added to the inventory. For example, the demo data contains attributes with names like "showModemDetails" and "showPrinterDetails" which are obviously irrelevant when the object in question is a tennis racquet. These names were invented by a human being based on an observation of the kinds of data being manipulated. Going forward, we will need an approach that uses a smaller number of generically named attributes, so that the system can automatically handle types of data that were not envisioned when it was first set up.
  •  A simple "Save changes" mechanism that did not attempt to post changed files directly into the XML database, but instead used FTP to park them on the server and HTTP to notify the database that it should retrieve them. This allowed the RivCom development team to work independently of the client's server development team, with only minimal co-ordination between the two.
  •  A decision to represent the entire inventory in a single XML file that is held in memory throughout the demo. This works for the hundred or so products in our demo, but is obviously insufficient for a real world system that would contain many thousands of inventory items.
  •  No consideration of multi-user concurrency issues.
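
The redundancy compromise in the list above is easier to see in code. The Python sketch below assembles a composite view from a master catalogue and a per-store catalogue by matching IDs. The root element names (catalogue, storeCatalogue), the second product and the default of "yes" for a missing entry are all hypothetical; they show the extra lookup and fallback handling that the demo avoided by guaranteeing a matching entry in every file.

```python
import xml.etree.ElementTree as ET

# Hypothetical master catalogue: two products.
MASTER = """<catalogue>
  <product ID="product_wilsonOversizeRacquet" MSRP="100.00"/>
  <product ID="product_otherRacquet" MSRP="80.00"/>
</catalogue>"""

# Hypothetical per-store catalogue WITHOUT the redundant entry for the second product.
STORE = """<storeCatalogue>
  <storeProduct ID="product_wilsonOversizeRacquet" showPhoto="no"/>
</storeCatalogue>"""

def assemble(master_root, store_root):
    """Join the two files on ID, driving the process from the master catalogue."""
    store_by_id = {sp.get("ID"): sp for sp in store_root.iter("storeProduct")}
    combined = []
    for product in master_root.iter("product"):
        store_entry = store_by_id.get(product.get("ID"))
        # Without the redundant entries, a missing match must be handled explicitly.
        # The fallback value "yes" is an assumption made for this sketch.
        show_photo = store_entry.get("showPhoto", "yes") if store_entry is not None else "yes"
        combined.append((product.get("ID"), product.get("MSRP"), show_photo))
    return combined

for row in assemble(ET.fromstring(MASTER), ET.fromstring(STORE)):
    print(row)
```

With a guaranteed matching entry in each file, the fallback branch disappears and the join can be driven from either side, which is the simplification the demo relied on.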
 
Obviously, all of these compromises will have to be addressed when we move forward into a production version.
 

Issues and Implications

 
Moving beyond this particular project to a more general discussion, there are a number of key factors that come into play when deciding whether or not to use a document-based approach in a given application.
 

Costs and Benefits

 

One must start with a traditional analysis of project costs compared to projected benefits. The document paradigm may be more or less expensive to develop than Java or VB (Visual Basic) applications, and this difference must be quantified and factored into the decision.
 

Available Infrastructure

 

How will the application be distributed to its users? Can their desktops support the applications we develop? This is less of a factor within an enterprise that has standardised on a software desktop and has put mechanisms in place to distribute Java apps, ActiveX controls, plug-ins and the like. It is more of an issue when distributing to end-users across the web, where about the only thing we may be able to rely on is that they have a web browser. (But which version of web browser, and what support it contains for the required technology, is of course an important issue.)
 

Client Attitude

 
The client may be inherently more comfortable with one of the paradigms. Given the complexities of any development project, this comfort factor can be as important as the hard data points.
 

Suitability

 
Even taking into account the benefits of XML as an information interchange and transformation medium, there are overheads involved in converting information from one form to another. The more closely the source information matches the XML format, the easier it will be to use a document-based architecture.
 
Similarly, not all data lends itself to a document-like or browser interface. It's important not to force the document paradigm onto data that would be best displayed via another tool or in another medium.
 

User Attitudes and Expectations

 
The support within browsers for interfaces that are similar or identical to more traditional software applications is rapidly improving. But to the extent that a document-based architecture results in a different interface than what users are accustomed to, one must evaluate not only the interface's absolute fitness-for-purpose, but also the user community's likely response to it. (Like the earlier point about client expectations, this is a subjective factor that can be just as important to a project's success as other, more quantifiable considerations.)
 

Security

 

The security model available in traditional software applications is quite different from (and generally more robust than) the browser-based security model. It is important to consider what level of security is required for the application, and whether it can be reliably implemented within the browser environment.
 

Links to Other Information

 
On the other hand, URL-based hyperlinking is an extremely useful and easy-to-implement mechanism for linking the information in an application to other corporate information. Here the fact that we are running in a browser with access to the Net gives document-based applications a real edge over other forms of application delivery. Links can be implemented in the initial delivery of the application, or added later by revising the style sheet and adding another layer of XML documents that map the content in the application to relevant external resources.
 

Benefits of Going Off-Line

 
Finally, one of the key benefits of the document-as-application approach is that the user is able to interact with his information in a richly structured way while disconnected from the data source. This is especially useful for dispersed enterprises, where it may be appropriate to send mini applications to vendors or suppliers who do not have security rights to access the enterprise's databases (or even their Intranet). The package of XML information and a governing style sheet can be sent by email. The recipient then interacts with it in his browser just as if it were a traditional software application. When finished, the application serialises the data that the user has just modified as another XML document, which is emailed back to the sending organisation where it can be parsed, interpreted and loaded back into the central database. This is an extremely powerful processing model that hasn't existed in any meaningful way until now.
 

Conclusion

 
The world of publishing is changing rapidly. Soon, it will be common to deal with interactive documents that behave just like the applications that were written in Visual Basic or Java just a few years ago. Whether or not this is a good idea for a given project is not clear-cut. The decision depends in part on the nature of the information being published, in part on the facilities and constraints of the project's infrastructure, and in part on the attitudes of the clients and users.
 
As the XML-related standards gel, as the built-in support for XML-based interactivity in browsers increases, and as users become more accustomed to this software paradigm, the boundary between documents and software applications will blur to the point of disappearing. RivCom's Storefront Demo may be one of the first XML-based interactive document-as-applications that we know of, but it surely won't be the last.
