Metadata deployment for publishing environments   Table of contents   Indexes   Validation Is Good

 

The student and the mechanic - how XML enables architectures to solve real-life document delivery requirements

Colin Mackenzie
Projects Business Unit Manager
Database Publishing Systems Ltd
608 Delta Business Park
Swindon, United Kingdom, SN5 7XF
Phone: +44 01793 512515
Fax: +44 01793 512516
Email: Colin.Mackenzie@dpsl.co.uk
 
Biographical notice:
 
Colin Mackenzie manages the Projects Business Unit at DPSL of Swindon, taking overall responsibility for all projects undertaken. Colin has recently been working with the UK Ministry of Defence as part of a project to help define the future strategy for web-based delivery of XML for IETMs (Interactive Electronic Technical Manuals). Colin was previously employed by DPSL as a project manager/consultant working on Intranet and CD-ROM document delivery and integration projects.
 
Prior to working at DPSL, Colin worked for many years as a programmer, then development manager for Miles 33. He has extensive experience in the field of technical documentation, newspaper publishing, desktop publishing, electronic publishing, and pre-press systems with expertise in search, printing, and Internet technologies.
 
ABSTRACT:
 
This paper will illustrate how the use of XML on both the client and server in a web environment can enable publication managers to meet the delivery requirements of different users. Specifically, the presentation focuses on the use of XML in an n-tier architecture, and whether the XML is best placed on the server or delivered to the client. The presentation at XML Europe 99 will include demonstrations of XML delivery applications to illustrate the points raised in this paper.
 

Introduction

 
In order to examine the arguments for and against XML being delivered to the client, we will consider two example scenarios of presenting data in a Web environment. The first scenario is of a student carrying out research using an on-line reference book and the second is of an engineer assigned the job of carrying out maintenance on aircraft equipment. The two users have different requirements in terms of how they access the data they need and how that data is presented. This paper will also discuss the potential system architectures necessary to support the different requirements of the users and of the published source data.
 

System Architecture

 
Before we investigate the user requirements, it is important to define the architecture of the systems being discussed. Recent development methodologies dictate that complex systems should be developed in three tiers.

Three Tier Architecture

 
 
An example of a three-tier business system would be a stock system separated into the following layers:
 
  •  Data Layer - a relational database on a server containing stock information
  •  Business Process Layer - components on the client and server performing arithmetic calculations on the stock information and integrating with e-commerce systems
  •  Presentation Layer - Graphical User Interface on the client used to enter and present the stock information
 
This multi-layer approach is intended to create a system that is scalable, supporting multiple servers and databases with the maximum flexibility. Typically, the business process layer would support most of the system functionality, enabling developers to create both light (e.g. a web client) and fat (e.g. a Visual Basic or C++ client) front-ends in the presentation layer.
 
A more detailed analysis of the Business Process Layer would show that some of the components within the layer would be focused on servicing the data in the Data Layer, while other components would be focused on servicing the GUI (Graphical User Interface) in the Presentation Layer. This effectively creates an "N" tier architecture.

"N" Tier Architecture

 
 
This paper will now discuss where the components providing the required functionality are placed in the various tiers for an XML web delivery system, and where the use of XML directly enables these components. XML provides many advantages in delivering documentation, but also may require additional development in the various tiers, including:
 
  •  Data Layer - traditional relational databases do not suit deep structures well: only metadata can be captured and manipulated successfully in relational fields, while the main XML is stored as a BLOB (Binary Large Object). New features are being added to relational databases, and new object-orientated databases are maturing, allowing direct addressing of the XML data held on the server (providing data layer support for linking mechanisms and the extraction of document fragments). Until these databases are widely available and implemented, extra processing will be required in the data focused components of the Business Processing Layer.
  •  Business Processing Layer - an example of functionality supported by both data and GUI focused components would be the resolution of XLinks (these components may be distributed between the client and the server). Building "dynamic" documents from document fragments extracted from the Data Layer would also be contained in this layer.
  •  Presentation Layer - the web browsers currently in use in both academic and corporate environments do not natively support the rendering of XML. This means that either the XML must be transformed to HTML by GUI focused components in the Business Processing Layer or the web browser must be enhanced to directly support the XML (effectively migrating the GUI focused components to the client). The choice of approach to take to this problem depends on a number of factors and this debate forms the core content of this paper.

Distribution Of Components Between Client and Server

 
 

User Scenarios

 
Having described the system architecture and some of the general issues raised when supporting XML, this paper will now examine the functional requirements of two distinct classes of users and the effect of those requirements on the use of XML on the client and the server.
 

Academic User

 
In this scenario, a long reference book is stored on a commercial publisher's server in XML and is accessed by a student in a university. One potential architectural constraint is that the publisher cannot dictate the client machine configuration, browser type or browser version of the end user (e.g. PC/Mac, Internet Explorer/Netscape). The student wishes to conduct a search based on words that occur in a chapter title or a summary, or perhaps on chapters written by a particular author. In addition, they wish to browse sections relating to a certain classification (for example, all sections relating to 17th Century philosophers). In terms of presentation, the user requires that a table of contents for the entire reference book is presented in a separate frame to provide easy navigation to the entire content. Furthermore, links to other sections enable the student to call up related information. Having traversed a link, the student is given further orientation support by the table of contents, which shows their new location in the content.
 
For this scenario, an architecture that concentrates the processing on the server could be considered.

Server Focused Architecture For Light Client

 
 
With this architecture, the XML will remain on the server and will be transformed by GUI focused components prior to delivery to the client. When the student requests a chapter or section, the resulting XML can be combined with an XSL stylesheet to create HTML. This processing could be accomplished using a transformation engine like James Clark's XT, the Microsoft XML/XSL control, or by using a transformation scripting language (e.g. Balise, OmniMark, Perl). Using a scripting language will allow complex manipulations of the XML to provide the student with additional functionality (e.g. gathering all references found in the chapter into a reference list). As only HTML is being delivered to the student's web browser (possibly supported by some JavaScript or small Java applets), the publisher can be sure that the application will perform predictably in institutions with different client configurations. Minor differences in HTML support in various browsers could even be dealt with by applying different stylesheets or transformation scripts.
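The server-side transformation described above can be sketched as follows. This is a minimal illustration only, using Python's standard library in place of the XSL engines and scripting languages named in the text; the element names (chapter, title, para, ref) and the function chapter_to_html are assumptions made for the example and are not taken from any DTD discussed in this paper.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical chapter markup; the element names are assumptions for illustration.
CHAPTER_XML = """
<chapter>
  <title>Rationalism</title>
  <para>Descartes argued for methodical doubt <ref>Meditations, 1641</ref>.</para>
  <para>Spinoza took a different route <ref>Ethics, 1677</ref>.</para>
</chapter>
"""

def chapter_to_html(xml_text: str) -> str:
    """Transform a chapter to HTML and append a gathered reference list."""
    chapter = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    parts.append(f"<h1>{escape(chapter.findtext('title', default=''))}</h1>")
    for para in chapter.findall("para"):
        # itertext() flattens the paragraph, including text inside <ref>.
        text = "".join(para.itertext())
        parts.append(f"<p>{escape(text)}</p>")
    # Gather every reference in document order into one list.
    refs = [ref.text or "" for ref in chapter.iter("ref")]
    if refs:
        parts.append("<h2>References</h2><ol>")
        parts.extend(f"<li>{escape(r)}</li>" for r in refs)
        parts.append("</ol>")
    parts.append("</body></html>")
    return "".join(parts)

# Only this HTML string would be sent to the student's browser.
html_page = chapter_to_html(CHAPTER_XML)
```

Because the client receives plain HTML, a variant of the function (or a different stylesheet) could be selected per browser without touching the stored XML.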
 
The student would enter queries into a simple HTML form. The query would be sent to the server to be processed by a data focused component, which would query the database and return the appropriate XML result set to the GUI focused components. As the XML elements and attributes have been retained on the server, searches can be restricted to areas of the document to enable the student to locate the relevant content quickly and easily. One example would be to find all summary elements containing the word "Religion" in the summary title where the summary is in a section relating to 17th Century philosophers. As mentioned previously, most mainstream databases and search engines do not currently support XML structured searching so, unless a more specialist search engine or database is used, additional processing of search results would have to be performed by the data focused components of the Business Process Layer. For the student application, this additional processing could include the post-processing of a SQL search on relational fields holding document metadata (e.g. SELECT content FROM sectiontable WHERE century = '17'). After the relevant sections are returned from the first search, the result set would then be processed again to find the word "Religion" in the title element. Obviously, an XML aware database would make this process much more efficient. The resulting data set could then be combined into one compound XML document prior to being transformed to HTML and delivered to the client.
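The two-stage search described above might look like the sketch below. It assumes an in-memory SQLite database standing in for the publisher's relational database; the table name sectiontable and the column century come from the example query in the text, while the id column, the sample rows, the element names and the search function are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: metadata in relational columns, section XML stored whole.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sectiontable (id INTEGER PRIMARY KEY, century TEXT, content TEXT)"
)
sections = [
    ("17", "<section><summary><title>Religion and Reason</title></summary></section>"),
    ("17", "<section><summary><title>Optics</title></summary></section>"),
    ("18", "<section><summary><title>Religion in the Enlightenment</title></summary></section>"),
]
conn.executemany("INSERT INTO sectiontable (century, content) VALUES (?, ?)", sections)

def search(century: str, word: str) -> ET.Element:
    """Stage 1: SQL on the metadata. Stage 2: structural filter on the XML."""
    rows = conn.execute(
        "SELECT content FROM sectiontable WHERE century = ? ORDER BY id", (century,)
    )
    # Compound document that would be handed on for transformation to HTML.
    results = ET.Element("results")
    for (content,) in rows:
        section = ET.fromstring(content)
        titles = section.findall("./summary/title")
        # Keep the section only if the word occurs in a summary title.
        if any(word.lower() in (t.text or "").lower() for t in titles):
            results.append(section)
    return results

hits = search("17", "Religion")
```

The second stage is exactly the extra work that an XML aware database would make unnecessary, since the structural condition could then be part of the query itself.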
 
The last requirement to be addressed is for a table of contents for the entire reference book allowing easy navigation to the entire content across chapters and sections. This requirement illustrates where server-side processing is required due to data volume (even when a fat client with XML awareness is available). In the case of the long reference book example described in this scenario, the table of contents for the whole book would have to be created on the server as the whole book cannot be downloaded to the client due to bandwidth restrictions. Further, if the book is "chunked" into chapters or sections prior to being loaded into the server database, the table of contents would have to be created prior to (or in parallel with) the "chunking" process. The table of contents would be created by a transformation script applying a stylesheet to the whole document to produce a document containing only the chapter/section titles with links to the relevant chapter/section chunks. This table of contents could be transformed to Dynamic HTML immediately or to XML then to HTML at run-time according to requirements. When the user clicks on a table of contents link, a request would be sent to the data components in the Business Processing Layer on the server (via the GUI components) which would return the relevant XML section chunk from the database. When a user follows a link in the content, the table of contents view would be updated to reflect the user's current position. This could be accomplished by a simple piece of client-side JavaScript that is triggered by the event of clicking on the link or on the loading of the new document.
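As a rough sketch of building the table of contents in parallel with the "chunking" process, the example below walks the whole book once, emitting a titles-only document with links and a dictionary of section chunks. The markup (book, chapter, section, title, id attributes), the toc/entry output format and the function build_toc_and_chunks are all assumptions made for illustration; a production system would more likely use a stylesheet, as the text describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical book markup; element and attribute names are assumptions.
BOOK_XML = """
<book>
  <chapter id="c1"><title>Rationalism</title>
    <section id="c1s1"><title>Descartes</title><para>Text.</para></section>
    <section id="c1s2"><title>Spinoza</title><para>Text.</para></section>
  </chapter>
  <chapter id="c2"><title>Empiricism</title>
    <section id="c2s1"><title>Locke</title><para>Text.</para></section>
  </chapter>
</book>
"""

def build_toc_and_chunks(book_xml: str):
    """Return (toc, chunks): a titles-only document and section chunks keyed by id."""
    book = ET.fromstring(book_xml)
    toc = ET.Element("toc")
    chunks = {}
    for chapter in book.findall("chapter"):
        entry = ET.SubElement(toc, "entry", href=chapter.get("id"))
        entry.text = chapter.findtext("title")
        for section in chapter.findall("section"):
            sub = ET.SubElement(entry, "entry", href=section.get("id"))
            sub.text = section.findtext("title")
            # Each section becomes a separately stored chunk; strip() drops
            # the trailing layout whitespace serialised with the element.
            chunks[section.get("id")] = ET.tostring(section, encoding="unicode").strip()
    return toc, chunks

toc, chunks = build_toc_and_chunks(BOOK_XML)
```

The href values are the keys under which the chunks would be stored, so a click on a table of contents entry maps directly onto a request for one chunk.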
 

Engineering User

 
A very different scenario is provided by the case of an aircraft maintenance mechanic who has been assigned the task of carrying out routine maintenance on an aircraft part. Here, the data is stored as a collection of short data modules on a local Intranet server. These data modules do not have a "book" structure of chapter by chapter and can be accessed in any order according to the user's requirements. Links are provided across data modules and between data modules and graphics (including graphic hotspots). He needs his IETP (Interactive Electronic Technical Publication) to provide all of the information required for his given task. This information would include descriptive, parts, and procedural data accessed by traversing through a structure representing the aircraft's systems and sub-systems. In order to complete the task safely, the engineer needs the IETP to present warnings, cautions and notes in such a way that he has to acknowledge that he has read them before carrying on to the next step. The content structure, style, layout, and behaviour have all been defined as a standard that must be followed. Traditionally, these standards have been met by using proprietary fat clients accessing data on CD-ROMs. Web technology allows the data to be updated and distributed in a more timely fashion but the standards set for the data and user interface cannot be relaxed.
 
For this scenario, an architecture that concentrates the document processing on the client could be considered.

Client Focused Architecture For Fat Client

 
 
In order to provide the level of interactivity that the user requires and that the IETM (Interactive Electronic Technical Manual) standards mandate, the XML structure must be supplied to the fat client application. Fortunately for the publications manager, the mechanic is working within a corporate IT infrastructure where the client configuration will be standard for all users. This means that the user requirements, not the need to support a variety of client hardware and software versions, define the system architecture.
 
The source data for the mechanic is held as a large number of small data modules containing different content types (descriptive, procedural, etc.), each with its own variation of the core DTD. This modular nature of the data provides many advantages and some challenges for the system developer. As each data module is of a small size there is no need to "chunk" the content or to pre-process the data. In this architecture, each data module is extracted from the database on request. The data module is then delivered to the client application as XML where it is loaded into a DOM (Document Object Model) to provide an efficient way for the client application to interact with, and manipulate, the document's structure and content.
 
As the data modules contain independent information and can be combined together in many ways, there is no concept of a book-like table of contents across data modules. This means that the small table of contents for each data module can be generated at run-time on the client by processing the DOM built from the data module (exposing all title elements via an XSL stylesheet). If user requirements demand further tables of contents (e.g. table of figures, table of last changes), then they can easily be generated by reprocessing the DOM.
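A minimal sketch of generating these tables at run-time from a loaded data module is given below. It walks the DOM directly with Python's xml.dom.minidom rather than applying an XSL stylesheet as the text suggests, and the markup (dmodule, figure, procedure, step, title) and the helper functions text_of and titles are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical data module; element names are assumptions, not taken from any DTD.
MODULE_XML = """<dmodule>
  <title>Hydraulic pump removal</title>
  <figure id="f1"><title>Pump assembly</title></figure>
  <procedure>
    <title>Removal</title>
    <step>Isolate the hydraulic supply.</step>
  </procedure>
  <figure id="f2"><title>Mounting bolts</title></figure>
</dmodule>"""

def text_of(node) -> str:
    """Concatenate the text nodes directly under an element."""
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)

def titles(dom, parent_tag=None):
    """List title text, optionally only titles whose parent is parent_tag."""
    return [
        text_of(t)
        for t in dom.getElementsByTagName("title")
        if parent_tag is None or t.parentNode.tagName == parent_tag
    ]

dom = parseString(MODULE_XML)      # built once when the data module is loaded
contents = titles(dom)             # table of contents for the module
figures = titles(dom, "figure")    # table of figures from the same DOM
```

Both tables come from the same in-memory DOM, so no further request to the server is needed to produce them.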
 
Once a data module has been loaded into the DOM, the client-side application can render the whole data module or any element(s) contained within the data module into the relevant application panes. Further, the client can capture user events such as scrolling or "stepping" through the content and perform the relevant action. For the mechanic, this would mean that the application would trap the events generated as the user steps through a procedure containing step by step information to perform a maintenance task. The application can then re-style and extract information from the DOM dynamically to perform the relevant GUI actions including:
 
  1.  highlighting the current step in red in a frame showing all steps, to show the context of the action that the user must perform (see figure MAC-011)
  2.  rendering the current step in a separate frame, using a style with a larger point size, to ensure that the mechanic performs the current step (see figure MAC-011)
  3.  extracting the title of the graphic relevant to the current step from the DOM and painting it into a title pane above the graphic (see figure MAC-011)
  4.  popping up a window containing the styled contents of a warning or caution element that requires acknowledgement when the mechanic enters a step describing a potentially hazardous task (see figure MAC-012)

GUI Rendering Steps and Graphic Title from DOM

 

GUI Rendering Caution from DOM

 
 
The complex functionality described above is mandated by the relevant standards body, and ensures that the mechanic can complete his task efficiently and safely. This complex functionality can only be provided by delivering a structured document to the client application.
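The stepping behaviour described above can be sketched with a small client-side component that holds the procedure in a DOM and refuses to advance until each warning or caution has been acknowledged. This is an illustration under stated assumptions only: the markup (procedure, step, para, warning, caution), the StepViewer class and the acknowledge callback (standing in for the pop-up window) are hypothetical and are not taken from any IETM standard.

```python
from xml.dom.minidom import parseString

# Hypothetical procedure markup; element names are assumptions for illustration.
PROCEDURE_XML = """<procedure>
  <step><para>Release the system pressure.</para></step>
  <step><warning>Fluid may be hot.</warning><para>Disconnect the supply line.</para></step>
  <step><para>Remove the pump.</para></step>
</procedure>"""

class StepViewer:
    """Tracks the current step and blocks progress on unacknowledged notices."""

    def __init__(self, xml_text, acknowledge):
        self.steps = parseString(xml_text).getElementsByTagName("step")
        self.acknowledge = acknowledge  # callback standing in for the pop-up window
        self.current = -1               # no step entered yet

    def _text(self, step, tag):
        """Text of every element named tag inside the given step."""
        return [
            "".join(n.data for n in e.childNodes if n.nodeType == n.TEXT_NODE)
            for e in step.getElementsByTagName(tag)
        ]

    def next_step(self):
        """Advance one step; return its text, or None if blocked or finished."""
        if self.current + 1 >= len(self.steps):
            return None
        step = self.steps[self.current + 1]
        for notice in self._text(step, "warning") + self._text(step, "caution"):
            if not self.acknowledge(notice):
                return None             # stay on the current step
        self.current += 1
        return " ".join(self._text(step, "para"))

shown = []  # records each notice presented to the user
viewer = StepViewer(PROCEDURE_XML, acknowledge=lambda text: shown.append(text) or True)
first = viewer.next_step()
second = viewer.next_step()
```

In a real client the value of current would also drive the highlighting of the step list and the selection of the graphic title; only the acknowledgement logic is shown here.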
 

Summary of Client versus Server Influencing Factors

 
When defining an architecture for your XML delivery system you should consider the following factors:
 
  1.  Evaluate Hardware and Software Limitations (memory, operating systems, browser versions, network bandwidth)
     Do your users have similar client configurations, with a client environment modern and powerful enough to run a complex client application? If they do not, and you do not have the ability to mandate a change to their configuration (as would be the case in most commercial publishing projects), you should consider using an architecture with a light client. As this paper has demonstrated, even with server-side processing it is still advisable to store and manipulate the source data in XML even though HTML is delivered to the client.
     You should also consider the number of users that will be accessing your server concurrently. If the client limitations mentioned in the previous paragraph mean that you cannot distribute your processing, you should ensure that the processing load can be handled by your server (or servers if required).
  2.  Examine the Data Requirements
     Does your data consist of one (or more) large documents or chapters? If the data is too large to download in an acceptable time, you may have to process the data on the server, either to create smaller chunks or to create a "dynamic" document comprised of small chunks from various large documents. This process could be static (documents are pre-processed and chunked once according to requirements) or performed every time the data is accessed. The choice between static and run-time processing may be influenced by the frequency of the data updates and by the number of users. This choice will also be influenced by the functional requirements.
     If your users have light functional requirements, you may wish to transform from XML to HTML on the server. This still has many advantages over storing your data as HTML, as the structure provided by the XML allows you to manipulate, combine, and style documents according to user requirements.
  3.  Examine the Functional Requirements of Your Users
     Does your user expect a high degree of interactivity in the client browser? Complex functionality including user interactions with the data being displayed (e.g. warnings or graphics popping up when the relevant content scrolls into view) can best be supported by developing components which interact with user events and the structure of the information via an XML DOM. These components would then generate the HTML required for rendering the content according to the user's requirements. Even once Internet browsers support the rendering of XML using XSL natively, a rich functional layer on the client utilising and manipulating the XML structure will still be necessary to provide the sort of advanced functionality described in the mechanic user scenario.
 

Conclusion

 
By developing your solution using an n-tier architecture containing a series of components with well defined interfaces, it should be possible to replace components, or migrate components from the server to the client. This migration could occur as your user requirements change, or as the need for custom developed components to process XML diminishes due to increased support natively within browsers.
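One way to picture such a well defined interface is sketched below: the delivery code depends only on a small rendering interface, so the GUI focused component can be swapped without changing its caller. The names Renderer, ServerHtmlRenderer, PassThroughRenderer and deliver are hypothetical and chosen purely for illustration.

```python
from typing import Protocol
import xml.etree.ElementTree as ET
from html import escape

class Renderer(Protocol):
    """Interface shared by every GUI focused rendering component."""
    def render(self, xml_text: str) -> str: ...

class ServerHtmlRenderer:
    """Server-side component: the client receives HTML only."""
    def render(self, xml_text: str) -> str:
        root = ET.fromstring(xml_text)
        return f"<h1>{escape(root.findtext('title', default=''))}</h1>"

class PassThroughRenderer:
    """Used once the client can process XML itself: deliver the XML unchanged."""
    def render(self, xml_text: str) -> str:
        return xml_text

def deliver(xml_text: str, renderer: Renderer) -> str:
    """The delivery code depends only on the interface, not on the component."""
    return renderer.render(xml_text)

doc = "<section><title>Locke</title></section>"
as_html = deliver(doc, ServerHtmlRenderer())
as_xml = deliver(doc, PassThroughRenderer())
```

Migrating rendering to the client then amounts to choosing a different component behind the same interface, which is the kind of replacement the paragraph above has in mind.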
 
XML provides the structural backbone for all of the potential architectures defined in this paper. Whether you provide data to students or to mechanics, by choosing XML, you enable the possibility of developing a highly functional, fat client for your web applications either now or in the future.
