Technical theory &, practice   Table of contents   Indexes   From markup to object model

 Processing Model 
Real-Time
Software Engineering
 

An extensible model for real-time XML processing

Rosen, Dan
 
 Dan  Rosen
 Brown University
 Providence  
Rhode Island
 USA 
Brown University,  Department of Computer Science,  115 Waterman St., 4. Fl.
Providence  Rhode Island  02906 USA
Phone: +1 401 863 7600 Fax: +1 401 863 7657 email: dr@cs.brown.edu web site: www.cs.brown.edu/people/dr
 Biography
 Dan Rosen — At the time of this paper's writing, Dan Rosen is a senior at Brown University, and a candidate for honors from the Department of Computer Science in May, 2000. His senior thesis work (including this paper) focuses on technical and theoretical issues arising from XML's use in software engineering, and conversely, software's use in markup design. In his thesis papers, he covers processing models for XML, an application framework using those models, and a real-time XSLT implementation constructed on that framework.
OEB
 
During his vacations from Brown, he has served as a developer for Cambridge Information Network (San Francisco), creating an XML-based web application system in JavaScript to interoperate with the legacy systems in place; and for Yomu (San Francisco), as lead architect for their XML-basedOEB systems.
 Abstract
Separation of syntax from semantics
 XHTML 
 
One of the uniquely advantageous aspects of XML is the separation of XML syntax from the semantics of XML documents. Standards such asXHTML
 MathML 
 
 http://www.w3.org/TR/xhtml1
,MathML
 XSLT 
 
 http://www.w3.org/TR/REC-MathML
,XSLT
 XLink 
 
 http://www.w3.org/TR/xslt
,XLink
 MathML 
 XHTML 
 XLink 
 XSLT 
 
 http://www.w3.org/TR/xlink
and others all rely on this separation. The XML specification intentionally leaves this open, so that any language described in XML (thus inheriting the common XML syntax) may have its own unique semantics, behavior, or functionality. Most XML applications, such as those mentioned, require special processing: for example,XHTML elements are rendered,MathML expressions are rendered or evaluated,XSLT transformations are executed, andXLink s are reified or embedded. This processing is performed by software programs conformant to each particular XML application.
Problems with current software engineering techniques
 SVG 
 XHTML 
 
With the recent growth in the number of XML applications and conformant software, many common problems have begun to present themselves. Most software applications (web browsers, for example, which are capable of processingXHTML ) use home-grown software engineering techniques to translate from XML content to program behavior. The lack of uniformity between these various techniques is only a minor problem, however. The major problem is that, since all XML applications share the same syntax by definition, there is no fundamental difference between XML applications visible to a processor withoutprior knowledge of each particular application. It's safe to say that a processor of one XML application will have knowledge about that application, and it's relatively safe to assume that a processor of two (possibly related) XML applications will have knowledge about both applications, as well as well-defined interactions between them (for example, web browsers will soon be able to handleXHTML with embeddedSVG
 SVG 
 XHTML 
 XLink 
 XSLT 
 
 http://www.w3.org/TR/SVG
). But if a processor of an arbitrary number of XML applications must assume prior knowledge ofall the interactions between each (for example,XHTML withSVG , embedded throughXLink s, transformed throughXSLT ), the programmers of such a processor would have a nightmarish job. Conformance to one application has proven a difficult enough task for programmers in the past; conformance to several in the same software would prove impossible given current techniques.
XML processing models
 
This paper will detail current work onXML processing models , a new approach to this problem, combining conventional wisdom from both the markup and the software engineering communities. Variations on abstract processing models based on principles of semistructured data will be discussed, as well as various techniques to implement these models in software. The concept of anextensible, real-time processing model will be introduced, and finally, as this is a very young field of research combining two historically distinct disciplines, many open questions will be posed.
 

Introduction

Convergence of markup and software engineering fields
 
When two different communities or fields merge, as the markup and software engineering communities have, there is always an extended enlightenment period prior to the solidification of the new field, as each learn each other's principles and begin to apply them. This enlightenment period is generally accompanied by a significant amount of frustration, as the vocabularies, concepts, and axioms of each group collide. In recognition of this (and for this paper to make any sense to its audience), some standard vocabulary and other preliminaries must be assumed.
 

Vocabulary

 
Application "Application" refers specifically to an XML document type. For example, in theMathML specification: "MathML is an XMLapplication for describing mathematical notation."
 MathML 
 
Software Application "Software Application" will represent our use of the word "application" in the XML specification: "It is assumed that an XML processor is doing its work on behalf of a module, called theapplication ." This distinction is necessary to disambiguate between XML document types (or, more abstractly, modes of employment), and specific programs which handle these document types.
 
Semantics "Semantics" is a well-defined term already, but for clarity, it is the set of intendedmeanings andbehaviors of a type of document and its constituent pieces. For example, the semantics of anXHTML  <p> element include the basic ramifications of just being an XML element, the rendering behavior in the context of different browsers and stylesheets, the significance of being a paragraph, and so forth. The semantics of XHTML as a whole are defined in these terms.
 XHTML 
 
Interpretation Likewise, "interpretation" of semantics simply refers to the correct processing of behaviors or recognition of meanings in the semantics of an application. In other words, a conformant software application is one which correctlyinterprets the semantics (correctly rendering XHTML, for example).
 
Object Model An "object model" is a representation of data, which may be either conceptual or in software. An example of a conceptual object model for XML is theInfoset
 DOM, Document Object Model 
 Infoset 
 
 http://www.w3.org/TR/xml-infoset
, and an example of a software object model is theDOM
  http://www.w3.org/TR/REC-DOM-Level-1
.
 
Processing Model Finally, a "processing model" is a scheme, either conceptual or in software, for interpreting the intended semantics of one or more XML applications. A processing model uses an object model to interpret XML data. This is, of course, a broad definition in many ways, and leaves a lot up in the air. A lot of definitions need to be filled in. When multiple applications are present in a single document (XHTML with embeddedSVG , for example), is that considered a unique application? When two applications could be processed together in several different ways, i.e. when there is ambiguity, how is that ambiguity resolvable? These are the type of questions which will be approached by this paper.
 SVG 
 XHTML 
 

Outline

 This paper will make a stark distinction between abstract (conceptual or theoretical) processing models, and models implemented in software. In reality, the implementations are tied directly to the theoretical models, one being of little use without the other; but until more vocabulary and experience become common to both the markup and software disciplines, both will be easier to comprehend if a certain "safe distance" is maintained between each.
 
  1. Abstract processing models will be discussed first, covering the circumstances in which they came about, the evolution of basic models, and the difficulties in creating truly general models for XML processing.
  2. Implementations of processing models will then be discussed, covering the object-oriented design techniques used to represent the concepts presented in abstract models in software.
  3. Finally, the "Extensible Real-Time Processing Model" will be introduced, and future directions of markup technology and software engineering will be discussed.
 

Abstract processing models

 In recent months, a significant amount of thought has been put into the development of abstract processing models. It is useful to consider abstractions in general when we don't want to let ourselves be bogged down in the details and quirks of concrete implementations, when we want to force ourselves to the rigor which theory and mathematics demands, or when we're bored. All three are valid reasons (although the latter is more commonly solved in front of the television), however, the development of abstract processing models for XML is out of necessity. The range of real-world applications of XML is becoming so broad, and the complexity so high, that the theoretical limits of XML processing are being pressure-tested.
 

Motives for developing abstract models

 XSLT 
 
The complexity ofXSLT is the main motive for developing abstract processing models. It is aTuring-complete language, meaning it is of the highest computational complexity we know of. This has many implications, but there are two main, concrete concerns:
 
  • XSLT is able tobreak or mangle the semantics and syntax of any XML. This is theW3C 's main concern as it develops applications such asXLink (XLink s should, by definition, never be broken, as they represent important semantic information about document/text relations).
  •  W3C 
     XLink 
     XSLT 
     
  • XSLT is often used as as stylesheet language for documents, but in this process, documents may be rearranged and text may be generated or disappear. Given an XML applicationX , two stylesheetsa andb may exist to transform documentx of typeX intoXHTML andXSL-FO
     XHTML 
    XSL-FO
     XSLT 
     
     http://www.w3.org/TR/xsl
    , respectively. A user of the transformed documenta(x) may have a very different view ofx from a user of the transformed documentb(x) . For the two users to share information aboutx , they must be able toreverse the effects of their transformations to extract information about the original document. Since XSLT is Turing-complete, this is an impossible problem in the absence of an intelligent work-around.
  •  XSLT 
     

    XSLT transformation reversal

    Transformation reversal
     XSLT 
     
    The concept of reversing, or tracing backXSLT transformations turns out to be the solution toboth concerns. The problem of comparing the same document with different transformations yields this solution, but the problem of identifying semantic and syntactic breakage can also be solved in this way, by tracing transformations back to the source, to compare and see whether anything in the original source became broken in the result. And although the computational complexity ofXSLT does not permit us to perform this operation given only the result and transformation documents, we can be successful by keeping a bit of extra information around.
     XSLT 
     
    At every step ofXSLT 's processing, there is acontext node in the source document, and atemplate rule which is applied to that context node to produce aresult tree fragment containing one or more nodes. Where the data appearing in the result tree fragment (whether it be directly from the context node, from a combination of nodes, or entirely generated) is entirely irrelevant; the point is that given the combination of context node, template rule, and node in a result tree fragment, the three are unambiguously related. This relation may be recorded in a variety of ways, one of which will be described shortly, but the key point is that with this mapping, a node in a source tree may be re-located given only the node in the result tree.
     

    Preserving transformation traces

    STG
     
    Steven DeRose, of the Brown UniversitySTG , and the author, propose a method of usingXLink s to preserve the transformation relationships. AsXSLT is processed, the object model stores the relationships and then serializes them in this form:
     XLink 
     XSLT 
     
    <!ELEMENT transform-triple (source, template, result)>
    <!ATTLIST transform-triple
    xlink:type (extended) #FIXED "extended"
    >
    
    <!ELEMENT source EMPTY>
    <!ATTLIST source
    xlink:type   (arc)     #FIXED      "arc"
    xlink:role   NMTOKEN   #FIXED      "source"
    xlink:to     CDATA     #REQUIRED   <!-- URI reference to node,      -->
    <!-- possibly including XPointer -->
    >
    
    <!-- (template and result elements defined similar to source element) -->
    
     For example:
     
    <transform-triple>
    <source xlink:to="source.xml#sourcenode"/>
    <transform xlink:to="transform.xsl#templatenode"/>
    <result xlink:to="result.xml#resultnode"/>
    </transform-triple>
    
     XLink 
     
    This is a simple, lossless serialization of the relevant information, but is a bit naive. Although it inherits the robustness of theXLink model, it requires a lot of processing to be of any later use. Also, template parameters must be associated with the template role (they may be traceable from the context node in the source, but could also be specified explicitly using something like <!ELEMENT template (param*)> in the definition), and transformations may result in zero or more result tree nodes (which may be represented by "null" or by the root of the fragment, but this represents additional complexity). This is, however, useful as a proof of concept. Also, extensions could be envisioned, whereby any number of template-result pairs beyond the first two could be added to signify transformation chains.
     More complex, and potentially more efficient solutions exist to preserve information about transformation processing. These solutions are typically geared towards specific applications, but some scale surprisingly well. One notable such solution, proposed by Chris Maden and Deborah Hooker of Yomu, is the creation of a new XPath extension function:
     
    result-node(URI-reference-to-stylesheet-node[, param-name, param-value]*)
    
     This leverages pre-existing implementations without significant overhead, without the complexities of defining a new XML application, and so forth. This solution is described in further detail at http://www.yomu.com/xtech2000 (URL subject to change).
     Detailed treatment of this and related methods for preserving transformation trace information will be omitted, but it is important to note the role of transformation tracing in the development of abstract processing models.
     

    First attempts at processing models

     Given the ability to trace documents across transforms (however we choose to implement it), thus solving the major problems mentioned, we have the basis for a simple processing model:
     
    1. Start with documentx0 which is known to be syntactically and semantically valid, with respect to the semantics of the application.
    2. Given documentxn , apply transformationtn to yieldxn + 1 .
    3. Verify that no syntax or semantics have broken, using theXSLT back-tracing described in section . Repeat transformation and checking for all transformationst0 ... k .
    4. Interpret the semantics present in the result document.
     XSLT 
     
    Although only a first attempt, this processing model has some obvious advantages. The most clear advantage is that the problem of semantic breakage has been entirely solved; we need only specify the semantics to be aware of during the checking phase, and each application's semantics may be specified independently of one another with no worries of unforeseen interactions.
     XSLT 
     
    Taking this basic model further, we notice thatXSLT isn't the only transformer of documents. XLinks are also capable of transforming documents to a degree. XLinks of type show="embed" may transform documents during processing as well. Whether this should truly be considered a transformation in a sense similar toXSLT is a question of much contention, but if desired, we can accommodate this:
     
    1. Start with valid documentx0 .
    2. Given documentxn , apply transformation or content embeddingtn to yieldxn + 1 .
    3. Verify that no syntax or semantics have broken, usingXSLT back-tracing, and check embedded content as well. Repeat transformation and checking for all transformations or embeddingst0 ... k .
    4. Interpret the semantics present in the result document.
     XSLT 
     
    This manages to close up some loopholes in our basic model, but opens up some further ones. Most importantly, in which order should processing take place? Should content embedding occur beforeXSLT transformations, or after? Should embedded content undergo processing in the context of the original document after it is embedded, or in its own context before embedding? The most sophisticated answer we currently have is:
     
    1. Start with valid documentx0 .
    2. Givenxn , identify allXLink s, and processXSLT transformationtn to yieldxn + 1 , maintaining the identity of the links across the transformation using back-tracing. Repeat for allt0 ... k .
    3. Identify survivingXLink s requesting embed or other processing of targets, and process each target in its own context.
    4. Interpret all semantics in all documents.
     XLink 
     XSLT 
     
    This is an incredibly robust solution for the combination of XSLT, XLinks, and any other XML applications not involving transformations; since it builds on the basic advantages of even our most naive abstract processing model with some useful knowledge about the specifics ofXLink . Paradoxically, it turns out to be a somewhat weak solution for combinations not falling into this convenient category. In the cases where other mechanisms are transforming documents, this model is still too ambiguous and does not suffice.
     XLink 
     XSLT 
     
    But what other mechanisms do we have presently to transform XML, besidesXSLT andXLink ? None, as far as XML applications go. But there are a growing number of software applications which let users directly or indirectly manipulate (effectively, transform) XML documents in real-time. And more XML applications could certainly be envisioned to do arbitrary transformations as well. Can a theoretical model be applied to this, the most general case?
     

    The general abstract processing model

     XLink 
     XSLT 
     
    We can modify the wording of our previous processing models to use more general terminology. Neither areXSLT transformations particularly special, nor those associated withXLink embedding, nor any other transformation. They all fall under the category ofgeneral transformations of semistructured data , and there is a limited number of basic transformations (create, move, delete) which may be composed to arbitrary complexity.
     Since general transformations can be expressed in terms of rules, in the same way as XSLT transformations are expressed as template nodes, the same source-rule-result triplet may be maintained in the general case, and the same back-tracing may occur.
     The general abstract model, then, is as follows:
     
    1. Start with valid documentx0 .
    2. Givenxn , execute transformation functiontn (x) to yieldxn + 1 .
    3. Verify that nothing has broken, using transformation back-tracing. Repeat for all transformation functionst0 ... k (x) .
    4. For any transformation functions which pull content from other documents besidesx , transform and verify that document until finished, and then embed that content in the main document as necessary.
    5. Interpret all semantics in all documents.
     XLink 
     XSLT 
     
    This model, with perhaps a few tweaks, is theoretically the strongest processing model possible given the semistructured data model. The trouble is, though, that it is now too general. There is ambiguity in processing order: for example, if bothXLink embedding andXSLT transformations are regarded as transformation functionst(x) , which of the two is supposed to be processed first? The prior processing models assumedXLink s would be processed afterXSLT , but is that a safe assumption?
     

    Problems and open questions

     The answer is, of course, no assumptions are safe. The model is provably sound, and it accomplishes the goals intended of an abstract processing model; but to assume anything about the workings of the model in order to make it work for one case, is to break it for ten other cases. It will likely be determined necessary to define a specification language for the processing of XML applications, which will fill in the blanks in the processing model on a case-by-case basis. The need for this sort of extensibility should be of no surprise to those who understand the need for extensibility in XML, but the specifics must be left to further research.
     To summarize the open questions about abstract models:
     
  • What is the best, or most general way to represent and process transformation reversal and back-tracing? (section )
  •  
  • What other types of common transformations would we like to perform on XML data? (section )
  •  
  • How do we clarify the ambiguities which arise from maximal generalization of our models?
  •  

    Processing model implementations

     One of the key principles to modern software engineering is the zen ofobject-oriented design . Object-oriented design stresses breaking programs into discrete objects, each of which is responsible for a set of operations and a set of data. These objects are additionally responsible forprotecting (even hiding) their data, so that neither malicious users nor careless programmers may misuse it. Objects may represent physical objects, conceptual objects, encapsulate algorithms, aggregate other objects, act as intermediaries between other objects, and so forth.
     We will discuss implementations of xml and its models strictly using the object-oriented paradigm, since practically all implementations are expected to be either in object-oriented programming languages, or in other languages using an object-oriented discipline.
     UML 
     

    UML conventions

     UML 
     
    For those unfamiliar withUML , a visual language for diagramming object-oriented systems, the following conventions are used:
     
    named boxes object classes (italics imply an abstract class)
     
    lines relations or associations between classes
     
    triangles inheritance relations (class with triangle pointing to it is the parent)
     
    diamonds composition relations (class with the diamond pointing to it is required for the composition of the class it relates to)
     
    arrows represent traversability (in association or composition relationships)
     
    numbers represent cardinality of objects (instances in association or composition relationships)
     

    Object model

     Infoset 
     
    For the sake of simplicity, we will assume a very straightforward object model, based on the 20 December 1999 working draft of theInfoset . The only slight complexity in the model is thePosition abstraction, which expresses the commonality between all information bearing the notion of position within a text string.
     
    Infoset -based object model
     Infoset 
     

    Basic implementations

     Given a simple object model for XML, we can now begin to create an object-oriented framework for processing that XML. Note that this framework is independent of the steps or ordering in the abstract models posed, but is based on the concepts in those models. The challenge in designing this framework, as with the challenge in all object-oriented design, is to pick objects which will represent the concepts appropriately, while still allowing for flexibility.
     The simplest framework we can imagine is one where the document is represented in one object, and the semantics and behavior in another:
     
    Basic implementation design
     Although the concepts have all been covered by the design, we can pick out some immediate weaknesses due to its simplicity:
     
  • Since the Document class has no information pertaining to the type of document it is, the XMLSemantics class has to do a lot of work. Namely, it needs to provide all the accessor/mutator methods for the data in the document hierarchy which has special meaning to the XML application, as well as code to handle the behavior of the document in the software, and so forth. The XMLSemantics class needs to be split up.
  •  
  • Also, since the Document has no conception of what types of data should be protected, or the effects of modifications to that data, the document hierarchy could easily be broken. A wrapper around the Document class should be provided to remedy this.
  •  We can combine these two issues into a cleaner, safer, second version of our processing model framework:
     
    Basic implementation design, revisited
     We have now split the semantics of an XML application into two classes, which we call XMLProcess (to signify the operations on a document when processing a particular application type) and XMLBehavior (which codifies the interactions between the XML application and the rest of the software). The XMLProcess class also serves to protect the Document from malice or accident. In fact, this basic model is sufficiently robust to handle most of the complexities present in the general abstract model described previously.
     

    Advanced implementations

     SVG 
     XHTML 
     
    The basic implementation we now have serves as an excellent abstraction for the concepts in the general processing model. We can attach one or more "processes" to a document (representing, for example,XHTML with embeddedSVG ), and may attach zero or more "behaviors" to each process (representing the semantics of an application, as defined in ). There are, however, inefficiencies in the model in some contexts.
     Whenever the XMLProcess class attempts to extract information from the document tree, it must either do exhaustive searching through portions of the tree, or must store references to salient features of the tree. Either way, thespace-time tradeoffs just amount to gross inefficiencies: either a real-time application will be slowed down enormously by the tree traversals, or the target system will be required to have large amounts of memory.
     Taking the concepts from our basic models and applying some simple design patterns yields better results. We will go through a three-step process to convert the framework from a document basis to a node basis:
     First, we would like all elements in our object model to inherit some core properties from a generic XMLObject :
     
    Inheritance of nodes from XMLObject class
     We now re-create the abstraction of our basic implementation, with our new XMLObject as the core:
     
    Replacing the Document with XMLObject
     Finally, since it is inefficient to store an overabundance of XMLProcess and XMLBehavior objects in memory, we split these objects intoflyweights (separating the algorithms from the context):
     
    Final efficient implementation
     A variety of techniques may now be used to provide safety and organization to the resultant structure; the simplest would be to provide a subclass of XMLProcess to attach to the Document node, as before, which would marshall the entire process.
     

    Notes on implementations

     Although the final implementation we developed for real-time processing is efficient, and conceptually relatively simple (in that it contains very few object classes, all of which are predictable and flexible); it is obviously a challenge to properly implement, especially in the presence of a chosen abstract model.
     Fortunately, in some cases, where real-time processing is not a concern, or efficiency is decided not to be a concern, we can gracefully choose the basic model (see ).
     The most important factor in designing the implementation is that it remains independent, and does not restrict the algorithm specified in the abstract processing model. The algorithm itself should be specified in one class (or an encapsulated group of classes) and should theoretically be swappable; this is a basic concept in object-oriented design.
     Lastly, we have one question left open from this section:
     
  • What is "efficient enough"? What lengths do we have to go to, and what pains must we take to accomplish this level of efficiency?
  •  

    Future work, and the mythical, extensible real-time model

     Up to this point we have discussed, in great length, all sorts of models surrounding XML processing. We introduced abstract models for processing of certain types of XML, an abstract model for processing any type of XML, a simple model for implementing these algorithms, as well as a complex implementation. We have discussed the challenges associated with generality, and those associated with real-time processing. But we haven't found the magic bullet. We still have holes in our models, and questions unanswered.
     Fortunately, we can learn from our own questions. The key question we may be able to draw insight from is: given the general model we posed in section , how do we solve its ambiguities? It is the author's belief that an XML application (or other language, depending on mood) should be developed to specify the parameters or "bindings" for the general model. This constitutes an extensible solution; and combined with our software implementations, the mythical "Extensible Model for Real-Time XML Processing" can be realized.
     W3C 
     
    This has interesting ramifications for the future of XML application and software development, as well as for the future of computing and communication. When such a model is finally implemented, it could stand as the core forall XML-based applications. TheW3C , the Mozilla project ( http://www.mozilla.org ), Yomu, and others have all recognized the intrigue in this idea: namely, with the development of this model, software and markup will be inextricably linked. And the new world of applications born of this convergence will change the computing field profoundly.
     Acknowledgements
     I would like to deeply thankBen Trafford for laying the groundwork for this research, enlightening me of it over Christmas dinner, and giving me the opportunity to write and present it; andChris Maden andDeborah Hooker for being so painfully smart.
     I would also like to thank professorsSteven J. DeRose andSteven P. Reiss for their profound answers to my shallow questions.
     I would finally like to acknowledgeJimi Hendrix , without whom this paper would never have been completed.

    Technical theory &, practice   Table of contents   Indexes   From markup to object model