Groves For Applications

Graham  Moore

Programmer Analyst
Database Publishing Systems Ltd

608 Delta Business Park  Swindon  UK
SN5 7XF Phone:+44 01793 512515 Fax:+44 01793 512516 Email:

Biographical notice

Graham Moore is a Programmer/Analyst who joined DPSL after graduating from Southampton University with a 1st Class Honours Degree in Computer Science. Graham is currently designing and implementing architectures to support the creation and delivery of Web based IETMs.


This paper considers the existing use of groves and suggests that there is a missing application of this technology. The missing class is concerned with representing applications, programs with functional intent, and the states within applications as grove models. This paper presents the problems and requirements for representing applications as groves and what it means to link to a node in a grove.

Groves and Existing Usage

Groves are the graph representation of a set of nodes and their relationships. Groves typically manifest themselves as 'in-memory' object models.

Groves are the abstract model for a given notation. For example, with the SGML property set it is possible to create a grove for an SGML instance. Each notation, such as SGML, is processed by a notation processor, which 'knows' how to construct a grove from a given instance.

Once a notation is in the form of a grove it can be used as a resource in a general linking model such as HyTime (ISO 10744) or XLink (W3C Working Draft 19980303) and in transformation processes such as DSSSL (ISO 10179).

The benefit of this approach is to allow standard processes, such as linking, to work seamlessly with a variety of different notations.

A missing class of the use of groves

The behaviours above (linking and transformation) are general mechanisms specific to no particular notation. Until now, however, the grove paradigm has been applied only to data structure notations, e.g. SGML, CGM, Microsoft Word documents and Microsoft Excel documents. It is the position of this paper that a missing use of groves is the representation of identifiable states within a computer program.

Investigation into representing applications using the grove abstraction and the ability to link to application nodes has yielded several significant results:

  1. States within applications can be represented using the grove paradigm.

  2. There is a method for the construction of groves for notations that do not have an automated notation processor.

  3. There exists an answer to the question 'What does it mean to traverse to a node'.

  4. Using extended linking mechanisms it is possible to traverse to a given application state.

  5. Using the grove paradigm the above is applicable in a distributed environment given that a canonical representation of a grove is available.

Paper Outline

The following sections detail the aims and objectives behind building groves for applications in a distributed environment, how it is possible to do such a thing and what benefits can come out of it.

Aims and Objectives


This section details the motivation and objectives that led to the development of Groves for Applications (GFA).

Application Integration and Communication

The driving issue behind the development of GFA is the desire to effortlessly integrate applications in such a way that from a given application it is a trivial task to 'link to' or 'invoke' a known state within another application.

An extension to the above aim is that applications should be addressable in a distributed environment, and not constrained to only working on a single machine.

A refinement to the first aim is that the 'trivial' task of 'invoking' an application state should be realised by utilising existing linking mechanisms, such as XLink.

Groves For Applications

Solution Overview

This section describes the approach taken in providing an implementation of groves for applications. The solution comprises five key aspects.

  1. The definition of the property set for applications.

  2. The process for constructing the grove.

  3. The extension of a standard link engine, which rather than returning a target node invokes a method from a generic link interface.

  4. The functional binding of application specific logic with grove nodes to handle linking messages.

  5. The use of a standard link syntax for navigating to a node.

Property Sets for Applications

The role of the property set is to define which aspects of a thing are 'first class'. In the case of SGML the property set defines elements and attributes as being some of the first class objects in an SGML grove. The problem at hand requires that a property set for applications be defined. It can be seen that the property set for SGML is large and exhaustive. To define an exhaustive property set for applications is beyond the scope of this paper. This paper sets out to define a property set that makes the states of an application first class objects within the system. It is considered, at this stage, that the act of making the states and the application first class is more powerful and useful than attempting to create an exhaustive property set.

When deciding the property set for application groves the original aims were consulted, which were to allow linking and integration with application states using standard linking mechanisms. The generic property of an application that we have defined as being a first class object is the 'application state'. In many applications the possible states are not defined; where this is the case the state objects are emergent properties. In some cases the available states are clearly defined and these can be considered explicit properties. Regardless of whether the application states are explicitly defined or not, the grove paradigm brings all applications to the same abstract plane.

The following property set definition provides a mechanism for defining application states as being first class.

 <classdef rcsnm="application">
   <propdef rcsnm="name"
            datatype="string">
   <propdef rcsnm="proxymixin"
            datatype="string">
   <propdef rcsnm="states"
            appnm="application state"
            datatype="nmndlist"
            noderel="subnode"
            ac="state"
            acnmprop="id">
 <classdef rcsnm="state" appnm="an identifiable state within an application">
   <propdef rcsnm="id"
            datatype="string">
   <propdef rcsnm="name"
            datatype="string">
   <propdef rcsnm="traversalObject"
            datatype="node"
            ac="traversalStructure"
            noderel="subnode">
 <classdef rcsnm="traversalStructure"
           appnm="A structure that piggy backs when a traversal
           'arrives' at a node">
   <propdef rcsnm="type"
            datatype="string">
   <propdef rcsnm="description"
            datatype="string">

The above property set definition declares that each application can have a number of state nodes. Each state node in turn has a traversal object associated with it. The traversal object is discussed in more depth later; for now it is sufficient to know that it is a mechanism for passing information to a particular application state upon traversal. The traversal object definition has a data type of 'string' and should be regarded as typed, yet serialised, data.
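To make the shape of this property set more concrete, the sketch below mirrors the three class definitions as Python data classes. It is only an illustration: the Python class and field names are chosen for this example, and the values 'NotepadProxy', 'loadFile' and 'filename' are hypothetical rather than taken from the implementation described here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TraversalStructure:
    """Mirrors classdef 'traversalStructure': typed, serialised data."""
    type: str
    description: str


@dataclass
class State:
    """Mirrors classdef 'state'."""
    id: str
    name: str
    traversal_object: Optional[TraversalStructure] = None


@dataclass
class Application:
    """Mirrors classdef 'application'; states are keyed by their 'id'."""
    name: str
    proxymixin: str
    states: Dict[str, State] = field(default_factory=dict)

    def add_state(self, state: State) -> State:
        self.states[state.id] = state
        return state


# Hypothetical values, for illustration only.
notepad = Application(name="notepad", proxymixin="NotepadProxy")
notepad.add_state(State(id="newFile", name="New file"))
notepad.add_state(
    State(
        id="loadFile",
        name="Load file",
        traversal_object=TraversalStructure(
            type="filename", description="Path of the file to open"
        ),
    )
)
```

Keying the states by 'id' follows the acnmprop="id" declaration, which names the property used to address each state subnode.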

The most important aspect of the application property set is not that it defines an exhaustive set of properties but that it makes the states within the application first class.

How does the grove get built?

Groves are the abstract representation of an underlying notation, and the in-memory realisation is constructed using a notation processor. For example, the grove for an SGML instance is built by the SGML notation processor. This piece of software is aware of how to interrogate the underlying data structure and from it build an in-memory grove model that adheres to the SGML property set.

As the states within applications are less well defined than a Microsoft Word document or a CGM image it is not possible to construct a notation processor for all applications. There are certain classes of application for which it is possible to build a grove constructor. The characteristics of these are that they should be modular, self-documenting and allow a degree of introspection. An example of this kind of application would be an application constructed of Java beans.

In the general case, where an application notation processor is not available, it is suggested that application developers are responsible for writing the canonical form for the grove that represents their application.

The canonical grove representation, as defined in HyTime 2nd Edition Clause A.4.5, allows for the construction of an out-of-memory grove. In the case described, the application developer becomes the notation processor, the 'in-memory' grove is in the application developer's head and the canonical form of that grove is stored permanently in a file. Once a canonical form of the grove exists it can easily be restored to an in-memory grove, accessible and malleable like any other, because a generic grove builder can restore any canonical representation to an in-memory object model.
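The round trip from a stored form to an in-memory model can be sketched as follows. Note that the serialisation used here is a simplified, hypothetical XML stand-in and is not the canonical grove representation of HyTime Clause A.4.5; the function name `build_grove` and the 'loadFile' state are likewise assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Stand-in serialisation, NOT the HyTime canonical grove representation.
CANONICAL = """
<application name="notepad" proxymixin="NotepadProxy">
  <state id="newFile" name="New file"/>
  <state id="loadFile" name="Load file">
    <traversalStructure type="filename" description="Path of the file to open"/>
  </state>
</application>
"""


def build_grove(text):
    """Restore a nested-dict 'in-memory grove' from the serialised form."""
    root = ET.fromstring(text.strip())
    grove = {
        "class": "application",
        "name": root.get("name"),
        "proxymixin": root.get("proxymixin"),
        "states": {},
    }
    for state in root.findall("state"):
        node = {
            "class": "state",
            "id": state.get("id"),
            "name": state.get("name"),
            "traversalObject": None,
        }
        ts = state.find("traversalStructure")
        if ts is not None:
            node["traversalObject"] = {
                "class": "traversalStructure",
                "type": ts.get("type"),
                "description": ts.get("description"),
            }
        grove["states"][node["id"]] = node
    return grove


grove = build_grove(CANONICAL)
```

The builder knows nothing about notepad itself; it only follows the class and property names of the application property set, which is what makes it generic.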

What does it mean to traverse to a node?

The goal of groves for applications is that it should be possible to represent states within an application and that when a link is traversed to a given node the functionality associated with the state described is executed. The mechanism to 'invoke' such functionality is not clearly defined and this is one of the issues that this paper addresses.

We considered groves representing data structures and the way they are manipulated by link engines. Typically, a node in the grove is located by the engine and then returned to the application that initiated the traversal. It became apparent that it was possible to extend this default engine behaviour with a standard protocol. By introducing a protocol the act of traversal is made more accessible and more functional.

GFA requires this additional functionality due to the fact that applications are non-standard. For each application state within an application grove it is necessary to write some application specific code. The objective of this code is to talk to the application in order that the state traversed to is realised within the context of the application. This code is invoked upon traversing to the node.

In the same way that the application developer is tasked with defining the property set for a given application it is also their responsibility to write the thin code layer that will get invoked as a result of traversing to a node.

Defined below is an interface for use in traversal within any linking model. This interface is called the ITraverse interface and is defined in both Python and Java.


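 // Java
 public interface ITraverse {
   // Invoked by the link engine when a traversal arrives at this node.
   Node TraverseTo(String Tobj);
 }

 # Python
 class ITraverse:
   def TraverseTo(self, Tobj):
     """Invoked by the link engine when a traversal arrives at this node."""
     raise NotImplementedError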

An extended link engine will invoke the TraverseTo() method on the grove node instead of simply returning the node to the application that initiated the traversal. It is therefore necessary that any grove implementation can be dynamically extended to respond to the TraverseTo() method. More specifically it must be possible to dynamically bind the code written by the application developer for a given application state with the grove node that represents that state.
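The difference between the default and the extended engine behaviour can be sketched in a few lines. This is a minimal illustration under assumed names: `Node`, `NewFileNode` and `traverse` are hypothetical, and the grove is reduced to a dictionary keyed by node id.

```python
class Node:
    """Hypothetical plain grove node."""

    def __init__(self, node_id):
        self.node_id = node_id


class NewFileNode(Node):
    """A state node that also implements the TraverseTo() protocol."""

    def __init__(self, node_id):
        super().__init__(node_id)
        self.invocations = []

    def TraverseTo(self, Tobj):
        # Application-specific code would drive the real application here.
        self.invocations.append(Tobj)
        return self


def traverse(grove, node_id, Tobj=None):
    """Locate the node; invoke TraverseTo() if the node supports it."""
    node = grove[node_id]
    handler = getattr(node, "TraverseTo", None)
    if callable(handler):
        return handler(Tobj)
    return node  # conventional behaviour: just return the node


grove = {"newFile": NewFileNode("newFile"), "plain": Node("plain")}
result = traverse(grove, "newFile", "")
```

Nodes that do not implement the protocol are still returned unchanged, so such an engine would remain usable with ordinary data groves.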

The nodes in a grove already have an API and, while the API has not been standardised, it is hard to see where any large discrepancies could exist. This implementation of GFA uses PyGrove, a grove implementation developed by Paul Prescod, as its base grove implementation. Each node has a generic node interface, by which we mean the methods and attributes that are valid for any grove node. Methods and attributes of the interface include 'GroveRoot', 'SubNodePropertyNames' and 'getSubNode(nodeName)'.

Using the Python language it is possible to define generic delegation structures. These allow the basic grove node class to be dynamically extended in a non-intrusive manner. Using this mechanism it is possible to take the extended functionality of a node and combine it with the existing node protocol.
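One common way to build such a delegation structure in Python is a wrapper that forwards unknown attribute lookups to the wrapped object. The sketch below assumes this technique; it is not the mechanism used by PyGrove, and `GroveNode`, `TraversableNode` and `open_new_file` are hypothetical names.

```python
class GroveNode:
    """Hypothetical stand-in for a generic grove node (not the PyGrove API)."""

    def __init__(self, node_id):
        self.id = node_id

    def getSubNode(self, nodeName):
        return None  # toy node with no children


class TraversableNode:
    """Wraps a node, adding TraverseTo() and delegating everything else."""

    def __init__(self, node, handler):
        self._node = node
        self._handler = handler

    def TraverseTo(self, Tobj):
        return self._handler(self._node, Tobj)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so TraverseTo is not delegated.
        return getattr(self._node, name)


calls = []


def open_new_file(node, Tobj):
    # Application-specific code would talk to the application here.
    calls.append((node.id, Tobj))
    return node


wrapped = TraversableNode(GroveNode("newFile"), open_new_file)
```

The wrapped object still answers the generic node protocol (for example `wrapped.id` and `wrapped.getSubNode(...)`), while the node class itself is left unmodified.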

In addition, the single parameter, the traversal object, is a powerful mechanism used to piggyback information on the act of traversal. Application states are likely to need parameters sent to them in order to configure the particular state. The object passed through is defined in terms of a 'typed and serialised' data structure.
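The paper does not fix a wire format for the traversal object, so the following is only one possible reading of 'typed and serialised': a type name and a string payload packed into a single string, with a table of decoders keyed by type. The `type:payload` layout, the decoder table and the type names are all assumptions made for this sketch.

```python
# Hypothetical decoders, keyed by the traversal object's type name.
DECODERS = {
    "filename": str,
    "integer": int,
    "string-list": lambda text: text.split(",") if text else [],
}


def serialise(type_name, payload):
    """Pack a type name and serialised payload into one string."""
    if ":" in type_name:
        raise ValueError("type name must not contain ':'")
    return f"{type_name}:{payload}"


def deserialise(Tobj):
    """Recover a typed value from the serialised traversal object."""
    type_name, _, payload = Tobj.partition(":")
    if type_name not in DECODERS:
        raise ValueError(f"unknown traversal object type: {type_name!r}")
    return type_name, DECODERS[type_name](payload)


Tobj = serialise("filename", "C:\\docs\\readme.txt")
```

Because only the first colon separates the type from the payload, the payload itself may contain colons, as the Windows path above does.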

The ability to dynamically associate arbitrary functionality with a given node provides a mechanism for binding the application developer's code with generic grove functionality. A separate file contains information that the generic grove builder uses in order to associate functionality with nodes in the grove.
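The format of this separate binding file is not described here, so the sketch below assumes a simple INI-style file that maps state ids to handler names. The file layout, the handler names and the `bind` function are hypothetical.

```python
import configparser

# Hypothetical binding file contents: state id = handler name.
BINDINGS = """
[bindings]
newFile = new_file
loadFile = load_file
"""

log = []


def new_file(Tobj):
    log.append("new")


def load_file(Tobj):
    log.append("load " + Tobj)


# Handlers supplied by the application developer, looked up by name.
HANDLERS = {"new_file": new_file, "load_file": load_file}


def bind(state_ids, bindings_text, handlers):
    """Return a map from state id to handler for every bound state."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of state ids such as 'newFile'
    parser.read_string(bindings_text)
    table = parser["bindings"]
    return {sid: handlers[table[sid]] for sid in state_ids if sid in table}


bound = bind(["newFile", "loadFile", "print"], BINDINGS, HANDLERS)
bound["loadFile"]("notes.txt")
```

States without an entry in the file, such as 'print' above, are simply left unbound, so a grove can describe more states than currently have code behind them.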

While it is suggested that the application developer write the canonical form of the application grove and write the code associated with each application state this can be done by anyone who knows how to get the underlying application to do what is required for each state.

We have shown what it means to traverse to a node and introduced a generic mechanism to make the act of traversal more accessible and functional. We have used this degree of functionality to allow a thin layer of application specific code to be executed. This code is responsible for communicating with the underlying application in respect to the particular state. This thin layer is written by the application developer and dynamically associated with the appropriate node by the generic grove builder.

The GFA Extended Link Engine

The extended link engine is the key component in GFA. It is responsible for resolving the reference to the canonical form of the application grove, building the grove into an in-memory object model, binding in the application specific functional layer and invoking the TraverseTo() method. In reality the task of constructing the grove and performing the functional binding is delegated to a sub component, the 'generic grove builder'.

Because applications may reside in different locations and because in different cases the location of invoking the TraverseTo() method may vary, the extended link engine should be considered as a peer component. This means that it could be integrated with an application, be part of a local proxy HTTP server, be invoked locally from an application, run as a service on a given port or be combined as part of an HTTP server. Having the flexibility to use the engine in each of these roles enables GFA to be used in a distributed fashion.

Linking to an Application

One of the key aims of groves for applications was the desire to use a standard linking mechanism such as HyTime or XLink to navigate to a state within an application. To illustrate the mechanism we have used XLinks. XLinks are powerful enough for the job at hand, yet also represent a subset of the HyTime linking mechanism; it follows that this mechanism is also valid in a HyTime environment.

The basic mechanism for linking to a node in an application uses an XLink to reference the canonical form of the grove and an XPointer to reference a given node within the grove. The example below shows linking to the canonical representation of the grove for notepad.

<nplink xml:link="simple"
  fninvoke="local" href="http://localhost/notepadGrove.gfa#id(newFile)">
  Link to notepad</nplink>

The above example links to the 'newFile' state within the 'notepad' application. Because the link could be processed on either the client or the server, we need to indicate where processing should happen; the fninvoke attribute does this. It is comparable to the use of '#' or '|' in XLink to mean that processing should occur on the client or that no preference exists.

In this case the use of the GFA extended link engine could be as a proxy server running on the local machine or perhaps it is called from the application presenting the link.
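Before an engine can build the grove and locate the node, it has to split the href from the notepad example into the location of the canonical grove and the id named in the XPointer. A minimal sketch of that step follows; the function name `parse_gfa_href` is hypothetical and only the `id(...)` form of XPointer used above is handled.

```python
import re
from urllib.parse import urldefrag


def parse_gfa_href(href):
    """Split an href into (grove URL, state id) for an id(...) XPointer."""
    url, fragment = urldefrag(href)
    match = re.fullmatch(r"id\(([^()]+)\)", fragment)
    if match is None:
        raise ValueError(f"unsupported XPointer: {fragment!r}")
    return url, match.group(1)


grove_url, state_id = parse_gfa_href(
    "http://localhost/notepadGrove.gfa#id(newFile)"
)
```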

The next example illustrates traversing to a state within notepad with a file already opened. The state requires parameterising, the parameter being the name of the file to open. Below is the XLink structure required to traverse to the node and parameterise the state.

 <link ID="linknp2" travType="withObject" xml:link="simple"
  Link To Notepad Grove Load File State</link>

The above examples show how it is possible using GFA and standard linking syntax to link to a state within an application. While the application used above, notepad, is trivial, the same mechanism can in principle be used with any application for which a grove and the associated traversal code have been written.

Examples And Benefits

Example Application

Below are a few examples of GFA. It is believed that the diversity, even within a small sample of applications, illustrates the power and the ability of GFA to be applied to a number of problems.

  1. Using the existing linking mechanisms available in a wide variety of applications to link to states within associated but separate applications. This requires the authoring of new links but in no way would require any changes to any of the existing applications.

  2. Representing a transaction processing application using GFA, where the traversal object is the transaction to be updated, rolled back, archived etc., and the application states are roll back, commit etc. This kind of use of GFA regards states as being transitory nodes within a given application, but they can nonetheless be represented and utilised using GFA.

  3. Development using the GFA model. This approach would utilise the traversal object such that the traversal structure would consist of links back to nodes in the invoking application. This models a language that has an asynchronous callback mechanism.
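The callback style described in the third example can be sketched as follows. Everything here is hypothetical: the toy `Engine`, the link strings and the spell-checking scenario are invented for illustration, the traversal object is shown as an already decoded dictionary rather than serialised data, and the callback runs synchronously for simplicity.

```python
class Engine:
    """Toy link engine: maps link strings to handler functions."""

    def __init__(self):
        self.nodes = {}

    def register(self, link, handler):
        self.nodes[link] = handler

    def traverse(self, link, Tobj):
        return self.nodes[link](Tobj)


engine = Engine()
results = []


def spell_check(Tobj):
    # The traversal object carries the text and a link back to the caller.
    text, reply_to = Tobj["text"], Tobj["reply_to"]
    misspelt = [word for word in text.split() if word == "teh"]
    engine.traverse(reply_to, {"misspelt": misspelt})


def editor_receive(Tobj):
    results.append(Tobj["misspelt"])


engine.register("checker#id(check)", spell_check)
engine.register("editor#id(receiveResult)", editor_receive)
engine.traverse(
    "checker#id(check)",
    {"text": "teh cat", "reply_to": "editor#id(receiveResult)"},
)
```

The invoked state never returns a value directly; it reports back by traversing the link it was handed, which is what lets the caller and the callee live in different applications.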

Some benefits

It is believed that GFA provides a powerful mechanism for integration and communication. Outlined below are some of the more important benefits.

  1. Using GFA, it is possible to link to an application state using standard linking mechanisms.

  2. GFA makes it possible to link to states within applications as opposed to simply starting an application.

  3. GFA enables application developers to integrate with other applications without needing to know how to make intricate sets of API calls.

  4. GFA enables legacy systems with basic linking facilities to seamlessly integrate with other applications, for example the ability to integrate technical documentation with an illustrated parts catalogue or online ordering system.

  5. The canonical and serialised form of the grove acts as a description of services offered by an application, making it possible to browse available services before invoking them.

  6. The ITraverse interface allows developers to further extend the functionality and scope of the linking paradigm.

A uniform model for distributed processing?

This idea is alluded to in an above example but is worthy of a mention of its own. It is believed that using a simple traversal mechanism combined with the traversal object and the canonical representation of a grove it is possible to build highly sophisticated distributed applications. It will be the goal of further research to fully explore this avenue.


This paper has:

  1. Shown that states within applications can be represented using the grove paradigm via property sets.

  2. Provided a method for the construction of groves for notations that do not have an automated notation processor.

  3. Provided an answer to the question 'What does it mean to traverse to a node'.

  4. Defined and implemented an extended linking mechanism that makes it possible to traverse to a given application state.

  5. Shown how the grove paradigm is applicable in a distributed environment given that a canonical representation of a grove is available.

This paper has illustrated the power of groves within a new domain. It illustrates how a simple protocol and a well-defined architecture, the linking architecture, can be used to build powerful and integrated solutions. The success of this work should prompt one to consider other domains in which the grove paradigm could be applied.

  • ISO 10744 HyTime 2nd Edition
  • XLink Working Draft 19980303