| Information Management - Who gets the benefits | Table of contents | Indexes | Session chairs: | |||
Combining Architectures to Lower the Lifecycle Cost of Interactive Documents with Substantive Behaviours |
|
Steven R. Newcomb |
| President |
| TechnoTeacher, Inc. 3615 Tanner Lane Richardson Texas USA 75082-2618 Phone: +1 972 231 4098 Fax: +1 972 994 0087 Email: srn@techno.com Web: http://www.techno.com |
Biographical notice: |
Steven R. Newcomb, Ph.D. |
Keywords: HyTime, IETM, IETP, ISO 10744, Java, MID, Metafile for Interactive Documents, SGML Extended Facilities, XLL, XML, architecture, content management, inheritance, simulation |
It is expensive to maintain long-lived documents that include behaviours that are part of their substance. By representing such documents using inherited architectures for program logic, etc., validation can be facilitated, reliability improved, and costs minimised. |
Introduction |
|
This paper proposes an approach whereby we may hope to extend SGML's "management umbrella" to expressions of interactive document behavior, thus allowing a more fully integrated approach to the problem of creating, maintaining and validating document sets that embody not only content, but also program logic. This approach exploits the inheritable information architecture paradigm described in the Architectural Form Definition Requirements (AFDR) section of the SGML Extended Facilities, as standardized in ISO/IEC 10744:1997 Annex A.3. |
Representing Substantive Behaviors |
|
As an example of a document with substantive behavior, let's imagine an IETM for a steam engine aboard a naval vessel. |
The problem of specifying behaviors is the central problem of electronic documents, regardless of whether they are IETMs or telephone directories. There are several approaches. |
Another approach is to enhance an existing delivery system. This is a powerful answer to the problem of providing an electronic document with complex behaviors. It's a much less expensive approach than creating a custom delivery system "from scratch", and yet practically any behavior can be supported. One way to accomplish this is to provide a plug-in software module for each new complex behavior (or one module for all of them). The document can pass parameters to the software module that controls each complex behavior. Even though this approach is attractive, it is still far from ideal. If there are many complex behaviors, and some of them are used only once by the manual, it must be acknowledged that the software itself includes decision logic that is integral to the information represented and delivered by the manual. The behavior logic is therefore not really in the document; instead it has moved to the delivery system, where it now becomes difficult and expensive to maintain in harmony with the rest of the document. The substantive behavior-determining information cannot be validated by any ordinary querying process; once again, it can only be determined by running the manual while providing the system with various sensor inputs. The creation, maintenance and validation costs for such a manual are very high; the software and the manual that calls the software must be tested separately and together, and the skills and time of many people must be coordinated in order to accomplish such validation. It is especially worrisome that the software is written in a language which is not validatable in concert with its corresponding SGML content. Indeed, there may be no querying methodology that can validate all the possible interactions between the two classes of information. |
A similar, but slightly better approach is to provide special software with the document that the document "calls" via the delivery system. For example, many internet documents are supplied with Java programs that provide those documents with specialized behaviors when they are rendered. Because this approach acknowledges that the software is really part of a particular document, rather than pretending that the software is really part of the delivery system, this approach is conceptually cleaner. However, for our high-value steam engine manual, this approach still suffers from the same disadvantages as any other method of adding a class of behaviors to the delivery system: the behavioral information in the software is disjunct from the subject matter of the manual, and to whatever extent the added class of behaviors represents knowledge about how a steam plant should be operated, that knowledge cannot be maintained or validated by the same kinds of tools and procedures that are used to maintain and validate the rest of the manual. One way to find out the information that would only be revealed by the behavior of the applet is to run the applet with various inputs, and see what happens. Another way is to have a skilled programmer read the code. Both of these validation methods are human labor intensive. |
Since we have mentioned Java, it is pertinent to observe that Java is, of itself, no panacea for the problems faced by owners of documents with substantive behaviors. Java will change with the whims and competitive strategies of its vendors (viz. Microsoft vs. Sun); as a proprietary technology, Java is far from certain to be supported over a long-lived document's lifecycle; the tools we use to validate information in the SGML content world do not work on Java sources, and vice versa; the validity of Java applets for their associated text and graphic information is easily weakened and accidentally broken by the ongoing maintenance of such documents; and it is expensive to verify adequately that the integration of Java applets and their associated static information objects (such as SGML text) remains valid after each maintenance operation on the document. In the end, as far as content management is concerned, Java is just another programming language, albeit one with some special marketing advantages. (A methodology for handling Java code is proposed below.) |
A somewhat better approach is to use a delivery system which supports a certain fixed set of complex behaviors. The documents simply select and pass parameters into the routines that execute the behaviors. This was the idea behind the US military's MIL-D-87269 standard for IETMs. Because the number of complex behaviors is strictly limited, and because each complex behavior is simple to call and control, this approach works well. All of the substantive information in the manual is created and maintained in SGML. Powerful validation tools based on SGML and the SGML Extended Facilities can be used to validate the entire content of a manual, including the behavioral content. HyTime addressing, linking and transclusion features can be used to cut the cost of maintenance substantially, allowing new versions of IETMs to be recreated automatically, directly from databases of frequently-updated logistic information. Unfortunately, because the number of behaviors is limited, there can be no complex behaviors that are specialized for particular maintenance or operational situations, such as our steam plant. |
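To make the fixed-set idea concrete, here is a minimal sketch of how behavior calls in such a manual might be checked by an ordinary tool. Everything in it is an assumption made for illustration: the behavior names, the `call` element type and its attributes are hypothetical and are not taken from MIL-D-87269, and well-formed XML stands in for SGML so that a standard-library parser can be used.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed catalogue offered by the delivery system:
# behavior name -> names of the parameters that a call must supply.
BEHAVIORS = {
    "show-step": {"target"},
    "prompt-reading": {"sensor", "units"},
}

# Hypothetical manual fragment; XML stands in for SGML here.
MANUAL = """
<manual>
  <call behavior="show-step" target="step-1"/>
  <call behavior="prompt-reading" sensor="boiler-pressure" units="kPa"/>
  <call behavior="simulate-boiler" model="plant-a"/>
</manual>
"""


def validate_calls(source):
    """Return a list of problems found in the behavior calls of a document."""
    problems = []
    for call in ET.fromstring(source).iter("call"):
        name = call.get("behavior")
        if name not in BEHAVIORS:
            problems.append(f"unknown behavior: {name}")
            continue
        supplied = set(call.attrib) - {"behavior"}
        for param in sorted(BEHAVIORS[name] - supplied):
            problems.append(f"{name}: missing parameter {param}")
    return problems


problems = validate_calls(MANUAL)
```

The third call asks for a specialized simulation that the catalogue does not offer, so the check rejects it. That is both the strength of the approach (every call is checkable) and its limitation (nothing outside the catalogue can be expressed).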
MID - A Radical Approach |
|
A radical approach is to use a "reduced instruction set" document delivery system whose behaviors are specified in SGML by the document. In this model, the delivery system acts as an interpreter for the document. Effectively, the document is, among other things, a computer program, with programming constructs such as variables, conditional expressions, branching, subroutines, etc. The substantive behavior logic is entirely expressed in SGML, so it, along with the other content of the document, is subject to all of the maintenance and validation tools and other advantages of SGML. (Tools that support the SGML Extended Facilities can be especially useful when exploiting this approach.) All behaviors that bear on the subject matter of the document are described by the document in SGML. There is no need for "helper applets." The delivery system does just a few basic things, so it can be as useful for delivering cooking lessons as it is for delivering maintenance information for shipboard steam plants. Although some languages designed for computer-based instruction have offered an ability to mix content with behavior for over thirty years, this radical approach has been demonstrated only once using SGML, in the US Navy's "Metafile for Interactive Documents" (MID) architecture. The design of MID recognized the need to provide a single, consistent, unifying view of the contents of documents with arbitrary substantive behavior. That view was an SGML/HyTime view, since there is no other view that can bring all kinds of content, including program logic, under one standard validating syntax-umbrella. |
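The notion of a delivery system that merely interprets the document can be sketched in a few lines. The element types used below (`set`, `if`, `show`) and their attributes are hypothetical and are not MID's actual vocabulary; XML again stands in for SGML, and the "delivery system" is reduced to a single recursive function.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which the decision logic is itself markup.
DOCUMENT = """
<procedure>
  <set var="limit" value="1200"/>
  <if var="pressure" op="gt" ref="limit">
    <show>Open the relief valve.</show>
  </if>
  <if var="pressure" op="le" ref="limit">
    <show>Pressure is within limits.</show>
  </if>
</procedure>
"""


def run(element, variables, output):
    """Interpret the children of an element, collecting the text to present."""
    for child in element:
        if child.tag == "set":
            variables[child.get("var")] = float(child.get("value"))
        elif child.tag == "if":
            left = variables[child.get("var")]
            right = variables[child.get("ref")]
            holds = left > right if child.get("op") == "gt" else left <= right
            if holds:
                run(child, variables, output)
        elif child.tag == "show":
            output.append(child.text.strip())
    return output


def render(source, sensor_inputs):
    """Run a document against a set of sensor readings."""
    return run(ET.fromstring(source), dict(sensor_inputs), [])


high = render(DOCUMENT, {"pressure": 1500})
normal = render(DOCUMENT, {"pressure": 900})
```

Because the interpreter knows nothing about steam plants, the same few lines would serve equally well for a cooking lesson; all of the subject-matter logic stays in the document, where markup tools can reach it.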
From the barrage of MID-opposing disinformation and nontechnical objections that have emanated from the Navy's weapons system vendors since MID was successfully demonstrated, we might reasonably conclude that, at least from the Navy's perspective, the reasons why MID has been shunned are not technical, but rather stem from fears that the widespread adoption of MID would bring unwelcome changes to their business models. That conclusion wouldn't be entirely fair, however, because:
|
Grand Unification of Content Notations: The Perfect Approach |
Once we are managing our "documents with substantive behavior" in this universalized and consistent environment, it becomes possible for the first time to run queries that can reveal all of the factors that will influence any particular behavior at any particular time. For example, it becomes possible to determine which behaviors will influence the presentation of which aspects of content. We can even determine which variables in the substantive behavior-defining software logic of our documents influence which aspects of the rendition of certain content, by means of complex queries that can treat everything found in all kinds of information assets as the contents of a single consistent object database. |
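As a small illustration of the kind of query meant here, the sketch below reports, for each presentable piece of content, the variables that govern whether it is presented. It assumes a hypothetical markup in which conditions are `if` elements naming the variables they compare, and in which presentable content is a `show` element with an `id`; none of these names comes from a standard, and XML stands in for SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: conditions are nested around the content they govern.
DOCUMENT = """
<procedure>
  <if var="pressure" op="gt" ref="limit">
    <show id="relief">Open the relief valve.</show>
    <if var="temperature" op="gt" ref="max-temp">
      <show id="shutdown">Shut down the boiler.</show>
    </if>
  </if>
  <show id="log">Record the readings.</show>
</procedure>
"""


def influences(element, governing=()):
    """Map each content id to the sorted variables that govern its display."""
    result = {}
    for child in element:
        if child.tag == "if":
            inner = governing + (child.get("var"), child.get("ref"))
            result.update(influences(child, inner))
        elif child.tag == "show":
            result[child.get("id")] = sorted(set(governing))
    return result


report = influences(ET.fromstring(DOCUMENT))
```

No part of the document has to be "run" to obtain this report; it is an ordinary query over the parsed structure, which is the point of keeping behavior and content in one consistent environment.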
The technical basis for this "perfect approach" is the "grove" paradigm recently standardized in ISO/IEC 10744:1997, some of which was already in use in the DSSSL standard (ISO/IEC 10179:1996). Although few realize it yet, groves aren't just for SGML; they can be used to represent the output of parsers for any notation, and to provide a consistent API to information expressed in very different notations. (Those who are interested in this idea may wish to follow the ongoing efforts to bridge the remaining disjunctions between SGML and STEP data. They may also wish to contact TechnoTeacher, Inc., and ask about the forthcoming "GroveMinder" technology.) |
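The following sketch is only meant to convey the flavour of the idea, and it is a drastic simplification: the `Node` class is a hypothetical stand-in for a grove node and does not follow the property sets defined in the standards. Two unrelated parsers, one for XML (standing in for SGML) and one for Python source (standing in for any programming notation), are mapped onto the same node structure, after which a single query runs over both.

```python
import ast
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, much-simplified grove node: a kind, properties, children."""
    kind: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def from_xml(element):
    """Build nodes from the output of an XML parser."""
    props = {"name": element.tag, "attributes": dict(element.attrib)}
    return Node("element", props, [from_xml(child) for child in element])


def from_python(tree):
    """Build nodes from the output of Python's own source parser."""
    props = {"name": getattr(tree, "id", None) or getattr(tree, "name", None)}
    children = [from_python(child) for child in ast.iter_child_nodes(tree)]
    return Node(type(tree).__name__, props, children)


def find(node, predicate):
    """Yield every node, in document order, for which the predicate holds."""
    if predicate(node):
        yield node
    for child in node.children:
        yield from find(child, predicate)


def mentions_pressure(node):
    """True for nodes that refer to the variable named 'pressure'."""
    return (node.properties.get("name") == "pressure"
            or node.properties.get("attributes", {}).get("var") == "pressure")


content = from_xml(ET.fromstring('<step><reading var="pressure"/></step>'))
logic = from_python(ast.parse("alarm = pressure > limit"))
hits = [node.kind
        for root in (content, logic)
        for node in find(root, mentions_pressure)]
```

The query finds one hit in the content and one in the program logic without knowing which notation either came from.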
An Approach For Today |
|
Given that the "perfect approach" is not yet technically supportable, what can be done with today's technology to better integrate content and behavior in documents, so as to minimize the cost of their upkeep? Are there any practical steps that will allow quality assurance processes to examine interactions between content and substantive behaviors? We can assume that program source code will be checked for internal consistency by compilers and other software-language-specific tools. We can also assume that content will be checked for internal consistency by SGML parsers, perhaps augmented with tools that exploit additional validation possibilities based on the SGML Extended Facilities. But the interactions between content and behavior are likely to be numerous and subtle, and such interactions will still resist testing by affordable automatic processes. |
We can hope to improve this situation by physically interleaving the behavior-defining code, in its own notation (Java or whatever) with the content that drives the behavior and/or whose presentation is defined by the behavior. Such interleaving offers two significant advantages:
|
But simply interleaving the code with the content is probably not, of itself, sufficiently beneficial to make it worth the trouble. It becomes worthwhile to develop all the necessary wrapper element types, and to develop the software that will assemble the portions of code in such a way that they can be executed, when we can do it in such a way that all that work can be re-used in (inherited by) whatever information architecture needs to incorporate that same programming language. This will make it practical and profitable to integrate high-reliability systems for creating and maintaining structurally specialized documents whose content includes substantive behaviors. |
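A minimal sketch of such interleaving and assembly follows. The `code` wrapper element type and its `notation` attribute are hypothetical, XML stands in for SGML, and Python fragments stand in for the Java the paper has in mind, simply so that the assembled result can be executed directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: each code fragment sits beside the content it serves.
DOCUMENT = """
<procedure>
  <step id="check-pressure">
    <para>Compare the boiler pressure with the stated limit.</para>
    <code notation="python">def over_limit(pressure, limit):
    return pressure > limit</code>
  </step>
  <step id="relief">
    <para>Open the relief valve if the limit is exceeded.</para>
    <code notation="python">def advice(pressure, limit):
    return "open relief valve" if over_limit(pressure, limit) else "no action"</code>
  </step>
</procedure>
"""


def assemble(source, notation):
    """Collect the fragments in one notation, in document order, as one unit."""
    fragments = [code.text
                 for code in ET.fromstring(source).iter("code")
                 if code.get("notation") == notation]
    return "\n\n".join(fragments)


# Execute the assembled unit in its own namespace, then call into it.
namespace = {}
exec(assemble(DOCUMENT, "python"), namespace)
result = namespace["advice"](1500, 1200)
```

The assembler is indifferent to the subject matter of the document, which is what would allow it, and the wrapper element types it relies on, to be reused by any architecture that incorporates the same programming language.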
The SGML Architectures paradigm set forth in Annex A.3 of ISO/IEC 10744:1997 can be used to combine various distinct information architectures by inheriting them in a single architecture, such as an architecture for documents with substantive behaviors. Since each inherited architecture can be supported by a modular, reusable engine, this technique can reduce the cost of developing software for particular complex integrated applications, such as MID browsers and authoring systems. |
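The sketch below suggests how modular engines might each pick out the elements that conform to "their" architecture. It is a loose simplification of the architectural form mechanism, not an implementation of Annex A.3: the architecture names (`linkarch`, `logicarch`), the form names, and the engine functions are all hypothetical, and XML stands in for SGML.

```python
import xml.etree.ElementTree as ET


# Hypothetical engines, one per inherited architecture.
def linking_engine(element, form):
    return f"link:{form}:{element.tag}"


def logic_engine(element, form):
    return f"logic:{form}:{element.tag}"


# Architecture name (also used as the attribute name) -> supporting engine.
ENGINES = {"linkarch": linking_engine, "logicarch": logic_engine}

# Hypothetical document: an attribute named for an architecture declares
# which of that architecture's forms the element conforms to.
DOCUMENT = """
<ietm>
  <xref linkarch="simple-link" target="valve-7">See valve 7.</xref>
  <guard logicarch="condition" linkarch="simple-link" target="step-9"/>
  <para>Plain content with no architectural role.</para>
</ietm>
"""


def process(source, engines):
    """Hand each element to every engine whose architecture it declares."""
    results = []
    for element in ET.fromstring(source).iter():
        for architecture, engine in engines.items():
            form = element.get(architecture)
            if form is not None:
                results.append(engine(element, form))
    return results


both = process(DOCUMENT, ENGINES)
linking_only = process(DOCUMENT, {"linkarch": linking_engine})
```

An element may conform to forms in several architectures at once, as `guard` does here, and an engine can be added, removed or replaced without touching the others.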
For example, consider an IETM DTD that provides:
|
Nothing in this world is immune to change. However, because it is modular rather than monolithic, our IETM DTD and the software that supports it can cope with change gracefully and at minimal cost. When Microsoft changes its Windows metaphor or goes bankrupt, only the relevant aspects of our DTD need to change, and the expense of upgrading the software to the next popular presentation metaphor is limited to the cost of bolting a new engine onto our authoring and delivery systems. The same is true for the other aspects: Java, the diagnostic harness, and XLL. |
Similarly, the productivity of sophisticated systems integrators will be greatly enhanced by the methodology of inheritable architectures. Essentially the same DTD and the same presentation and authoring systems can be created with minimal effort for documents designed to teach technical skills. All that normally needs to be replaced is the specialized content architecture, if any, and the software modules that support it. The rest can stay the same, unless perhaps a programming language other than Java, a presentation metaphor other than Microsoft Windows, or a hyperlinking architecture other than XLL is required. Even if different modules are required, it is possible that architectures and engines are already available to support them, too. And even if such architectures and engines are not already available, there is the possibility that once they are developed, they can be used profitably in other projects. |
Acknowledgments |
Bibliography
|