
 

STEP and SGML/XML: what it means, how it works

 Nigel   Shaw
  Managing Director
  EuroSTEP Limited.  Castell, Bodfari,
Denbigh   United Kingdom  LL16 4HT
Phone: +44-(0)1745-710677
Fax: +44-(0)1745-710688
Email: nigel.shaw@eurostep.com Web: www.eurostep.se
 
Biographical notice:
 
Nigel Shaw is a consultant with the EuroSTEP Group and is the Managing Director for EuroSTEP Limited, the EuroSTEP Group company in the UK. The EuroSTEP Group provides consultancy services on STEP, CALS, SGML and data modelling. Other EuroSTEP companies are based in Sweden, Germany, Finland and USA.
 
Nigel has been at the centre of the development of STEP (ISO 10303) since 1986 and was the chairman of the STEP Editing Committee for ISO TC184/SC4 from 1988 to 1996. Nigel previously worked for Leeds University and British Aerospace and has an international reputation for his technical leadership on STEP. He represented British Aerospace in the PDES, Inc. consortium, working with companies such as Boeing, GM and Ford. Since early 1995, Nigel has moderated the vendor round table on STEP for the ProSTEP Association. With EuroSTEP, Nigel has been involved in various projects including the NATO CALS Data Model dealing with logistics, continuing with the moderation of ProSTEP vendor Round Table and work for Boeing on the integration of engineering analysis and related documentation.
 
Nigel is joint leader of the SGML and Industrial Data Preliminary Work Item under ISO TC184/SC4.

Nigel is married with three children and lives halfway up a mountain in Wales.
 W. Eliot   Kimber
  Senior Consulting SGML Engineer
  ISOGEN International  2200 N. Lamar St., Suite 230
Dallas   Texas  U.S.A.  75202
Phone: +1 214-953-0004
Fax: +1 214-953-3152
Email: eliot@isogen.com Web: www.isogen.com
 
Biographical notice:
 
W. Eliot Kimber has been involved with generalized markup, SGML, electronic publishing, and hypertext for all of his career, mostly at IBM, more recently for Passage Systems and ISOGEN International. Eliot is co-editor (with Charles F. Goldfarb, Steve Newcomb, and Peter Newcomb) of the HyTime standard and a member of ISO/IEC JTC1/SC34, the ISO committee responsible for SGML and its related standards. Eliot was a founding member of the XML Working Group. Eliot is the author of the soon-to-be published book Practical Hypermedia: An Introduction to HyTime, part of the C.F. Goldfarb Series on Open Information Management. When he is not working on, writing about, or teaching about standards, Eliot works as a systems integrator, helping clients use SGML, HyTime, DSSSL, and related standards to their best effect. In his spare time, Eliot is a devoted husband and dog owner.
 
ABSTRACT:
 
The STEP standard, ISO 10303, is the primary standard for data representation and interchange in the manufacturing and process world. Originally designed to enable the interchange of 3-D CAD models between different systems, it has, like SGML, evolved into a more general mechanism for representing and managing complex data of any type. There has long been a requirement for STEP-based data (parts and processes) to interact and interoperate with SGML-based data (technical manuals, requirements documents, etc.), but this requirement was never satisfied, for various reasons. With the advent of groves, SGML's formalism for describing data abstractions, it became possible to harmonize the STEP and SGML worlds by defining the mapping between their respective data abstractions and by defining complementary formalisms: EXPRESS in the STEP world, property sets in the SGML world. If such a mapping can be defined and used productively, it will allow the two worlds to communicate with each other and each to see the other in familiar terms. In particular, it allows STEP-based technology to treat documents (and anything else represented as a grove) as though they were STEP data, and it allows grove-based tools to treat STEP-based data as though it were represented as groves. This allows the technologies and techniques of the two worlds, which are entirely complementary, to be used together to best effect while avoiding data redundancy. Both standards will become stronger as a result. This paper presents the basic technical issues, the design approach being used to realize this harmonization, and the potential benefits the harmonization will provide.
 

Introduction

 
The STEP standard, ISO 10303, is the primary standard for data representation and interchange in the product design and manufacturing world. Originally designed to enable the interchange of 3-D CAD models between different systems, it has, like SGML, defined and now uses a general mechanism for representing and managing complex data of any type. Increasingly, products are defined as solid models that are stored in product databases. These databases are not limited to shape but contain a wealth of other information, such as materials, failure modes, task descriptions, product-related metadata such as approvals, and much more.
 
The product world is of course also replete with documents, from requirements through specifications to user manuals. These documents both act as input to the product development processes and are output as well. Indeed in some cases documents form part of the product and are given part numbers. It is therefore not surprising to find that there are many companies where there are very real requirements to interact and interoperate between the product data and documents, specifically in the form of SGML-based data.
 
This paper reports on work in progress to bring the two worlds together. This is primarily being done through the SGML and Industrial Data Preliminary Work Item under ISO TC184/SC4. The need for common capabilities for using STEP and SGML together has been obvious for a long time as can be seen from the inclusion of product data and SGML-based data within initiatives such as CALS. However, until recently, this requirement was never satisfied, for various reasons.
 
For the last year or more, a small group has been actively pursuing this area and gaining the necessary understandings across the different standards. It is this work that is reported here. The basic thrust of the work is to answer the questions: Can STEP and SGML be used together and, if so, how?
 

Basic Technical Issues

 
There are some major similarities between STEP and SGML, particularly in terms of their basic computer science. Regrettably, that fact probably just gives rise to increased subtlety and complexity in terms of the details. This has meant there has been some effort consumed simply in reaching a sufficiently high level of understanding in order to define the issues and start building the necessary bridges between the two worlds.
 
The basic issues, expressed in non-technical terms, are as follows:
 
  1.  How to enable SGML to be stored (and managed) within product databases? Such storage has to be in a form that makes sense from both perspectives and also is general enough to cope with all kinds of encoded information, not just that wrapped in SGML.
  2.  How to enable addressing from documents into product data? This needs to deal with the fact that the STEP development did not properly understand the need for robust addressing of individual data items when it established the basic approach used. The initial emphasis was on the ability to exchange a coherent and cohesive set of product data from one system to another. This ignored the desire to point into such a set of data from outside (where the outside could be another product data set or an SGML encoded data set).
  3.  How to hold links from product data to documents?
    •  Links to whole documents (as storage objects)
    •  Links to elements within a document
  4.  How to use SGML/XML as a mechanism to exchange product data? It is becoming clear that XML is going to be the de facto standard for transfer of information on the web. STEP has considerable investment in an approach that aims to separate the definition of the data from how it is stored and/or presented. It is therefore natural to look at how to use an XML encoding for EXPRESS-driven data.
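 
The two link directions in issues 2 and 3 can be pictured with a minimal sketch. It assumes that links are held in a table outside both data sets and that each link end is a pair of data set name and item identifier. The class names, data set names and identifiers below are hypothetical illustrations; they are not taken from STEP, HyTime or the work item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    """One end of a link: a data set and an item inside it (hypothetical form)."""
    dataset: str  # a document or a product data set
    item: str     # an element ID or an entity instance identifier


@dataclass(frozen=True)
class Link:
    """A link held outside both data sets, so neither side has to change."""
    source: Anchor
    target: Anchor


def links_from(links, anchor):
    """Return the targets of every link that starts at the given anchor."""
    return [link.target for link in links if link.source == anchor]


def links_to(links, anchor):
    """Return the sources of every link that ends at the given anchor."""
    return [link.source for link in links if link.target == anchor]


# Hypothetical anchors: a step in a manual, a part, and a requirement.
task_step = Anchor("maintenance-manual", "task-12-step-3")
spare_part = Anchor("product-db", "part-4711")
requirement = Anchor("requirements-spec", "req-2.4")

links = [
    Link(task_step, spare_part),    # issue 2: document into product data
    Link(spare_part, requirement),  # issue 3: product data to a document element
]
```

Because the table is independent of both data sets, the same links can be followed in either direction, for example `links_to(links, requirement)` to find the product data that refers to a requirement.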
 
The design challenge has been to see if these issues and requirements can be met simply through proper use of the available set of standards. The tools at our disposal include SGML, XML, HyTime and the STEP definition language, EXPRESS.
 
While there are some interesting problems, such as STEP's ill-defined (but highly flexible from a business perspective) concept of a storage object, the process of simply applying the standards, rather than inventing new mechanisms, has so far proved adequate.
 

Design Approach

 
Simply stated, the main design approach has been to use pre-existing facilities as far as possible and in a consistent manner across all the different objectives. This has led to the requirements given earlier being addressed through the following approaches:
  •  The use of HyTime's grove abstraction and related addressing facilities to provide the same basic mechanisms between product data and documents (and even from product data to product data).
  •  The use of EXPRESS to capture the form of the information such that multiple information repository technologies can be applied (not just SGML's single view).
  •  The use of XML to provide a transfer syntax.
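 
The idea behind the first approach, one addressing mechanism for both documents and product data, can be sketched as follows. This is only an illustration under stated assumptions: the `Node` class and `resolve` function are hypothetical, and the property names are invented for the example. It is not the HyTime property set nor the EXPRESS model described in the accompanying papers.

```python
class Node:
    """A node with named properties, loosely in the spirit of a grove node."""

    def __init__(self, kind, **props):
        self.kind = kind
        self.props = props


def resolve(start, path):
    """Follow a path of property names (for nodes) and indexes (for lists)."""
    current = start
    for step in path:
        if isinstance(current, Node):
            current = current.props[step]  # named property of a node
        else:
            current = current[step]        # position within a list of values
    return current


# A fragment of a document, seen as nodes (hypothetical property names).
document = Node(
    "element",
    gi="task",
    content=[Node("element", gi="step", content=["Remove the pump."])],
)

# A fragment of product data, seen through the same kind of node.
product = Node(
    "entity",
    type="product",
    id="P-100",
    components=[Node("entity", type="product", id="P-101", components=[])],
)
```

The same `resolve` call then addresses either kind of data, for example `resolve(document, ["content", 0, "gi"])` or `resolve(product, ["components", 0, "id"])`, which is the property that lets one addressing facility serve both worlds.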
 
Essentially this can be seen as following the old adage of using horses for courses. We are adopting the strengths of each standard and applying them accordingly. Regrettably, of course life is not that simple. When you consider that, in pursuing this design approach, we have created an EXPRESS model of the HyTime property set for EXPRESS driven data and will store instances of links defined using HyTime against the property set in EXPRESS-driven databases with file transfer between them using XML, it becomes clear that there is at least some potential for confusion. (In that light, there are two accompanying papers with this one that go into more of the specifics of the models and XML capabilities.)
 
As can be inferred from the previous paragraph, there is the potential here for what can be seen as basic research and a very academic view of the problem. However we are confident that these facilities can be applied to give very substantial gain. Therefore we next look at the potential business benefits.
 

Potential Benefits

 
It will undoubtedly take a while before the necessary learning is in place as to how best to use STEP and SGML together. There will be many ways in which the capabilities can be applied. The scenarios given below aim to show some of the benefits.
  1.  Seamless navigation across SGML and EXPRESS-driven data
     It is clear that for many kinds of information, the user does not care how it is stored, providing it is well presented and can be searched and navigated effectively. The fact that information is held, for example, as a product structure with both geometry and descriptions of tasks attached in a database, rather than as an SGML encoded manual should not be apparent to the user.
     However, from a business perspective it is essential to consider the need to maintain configuration control over the information as first the product design changes and then actual products are maintained. This naturally leads to using a database, potentially enabling tracking of configurations through time. But it may still make sense to have task descriptions encoded in SGML. The first aspect that is a potential benefit from the STEP and SGML work is to enable their storage together in a standard, robust and archivable fashion.
     The user would like to be able to jump from a text description of a part called out as a spare or as supporting equipment in a task description directly to the details of the part. This is a typical example of a link from document to product data. The second benefit of the STEP and SGML work is to enable that link to be defined in a standard way that will be portable across enterprises and across systems. (The porting of the data set will also be enabled by the XML encoding of EXPRESS-driven data.)
  2.  Requirements traceability
     Most product designs are driven by some kind of specification document. A good example of this kind of document would be the use studies passed from government to industry when procuring weapon systems. Clearly the content of this document is reflected in the design. However there is usually very limited ability to ask such questions as: if I change this aspect of the design, which requirements will be affected? What is needed is better requirements traceability.
     By enabling links back into the requirements documentation from within the detailed design data, it will be possible to maintain such traceability. Given that it will be effectively a HyTime link, it should also be possible to move from the requirements to the corresponding design aspects.
     This kind of functionality is not new and does require considerable discipline to fully realise. However the STEP and SGML work should enable it to be fully based on standards, resulting in greater portability and safer long-term archive. It should greatly facilitate those areas of business, such as defence, where the customer and product provider wish to work closely together to define the product.
  3.  Annotation of product data
     The file format defined by STEP represents a reasonably efficient mapping from a data set to an ASCII encoding, where the mapping is a consequence of the related EXPRESS schema. (There are strong parallels in this area between an EXPRESS schema and a DTD.) However, the file format exhibits several weaknesses which are obvious from an SGML perspective:
    •  There is no standard mechanism for sending the EXPRESS schema with the data that corresponds to it.
    •  The structure is not well suited to the extension that becomes appropriate when the EXPRESS schema changes.
    •  There is only a simple commenting facility without any structuring capability.
    •  Meta data is restricted to a single section that applies to the whole information set.
     To pick just one of these, it is obvious that the ability to annotate at the level of individual data items in a flexible manner would have advantages. This is one of several aspects where an XML encoding for EXPRESS-driven data will bring additional benefits. It should also, of course, make such data far more accessible through web tools and technologies.
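 
     As a rough illustration of annotation at the level of individual data items, the sketch below writes a few entity instances as XML and attaches a note to one of them. The element and attribute names (`data`, `schema`, `id`, `annotation`), the schema name and the instance values are all assumptions made for the example; they are not the XML encoding defined by the work item, which is described in the accompanying papers.

```python
import xml.etree.ElementTree as ET


def encode(schema_name, instances, annotations=None):
    """Encode entity instances as XML, one element per instance.

    `annotations` maps an instance id to a free-text note (hypothetical form).
    """
    annotations = annotations or {}
    # Naming the governing schema on the root travels with the data.
    root = ET.Element("data", {"schema": schema_name})
    for instance in instances:
        element = ET.SubElement(root, instance["type"], {"id": instance["id"]})
        for name, value in instance["attributes"].items():
            ET.SubElement(element, name).text = str(value)
        note = annotations.get(instance["id"])
        if note is not None:
            # The note sits on the individual item, not in a global header.
            ET.SubElement(element, "annotation").text = note
    return root


# Hypothetical instances of a hypothetical schema.
instances = [
    {"type": "product", "id": "i1", "attributes": {"name": "Pump"}},
    {"type": "product", "id": "i2", "attributes": {"name": "Pump Mk2"}},
]
root = encode("example_schema", instances, {"i1": "Superseded by i2"})
xml_text = ET.tostring(root, encoding="unicode")
```

     Here the note is carried by the item it describes, and the root element names the schema the data corresponds to, which touches on two of the weaknesses listed above.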
 

Conclusions

 
This paper skims the surface of the work that has been done, and is still ongoing, in bringing together STEP and SGML. It deliberately leaves the real details to the two accompanying papers. In that sense the paper sets the scene and provides a non-technical view of the progress made.
 
It is clear that the potential benefits of enabling the worlds of STEP and SGML to interact and co-exist are high. The progress made so far has been, in effect, basic research, and it has concluded that there are no major barriers to realising the benefits. We are well on the way to the time when the key decisions on using product data and documentation together will be made purely on business grounds and not because the standards cannot be used together. The STEP and SGML work is showing that, potentially, the whole is greater than the sum of its parts.
 
Bibliography
ISO 10303-11:1995 Product data representation and exchange - EXPRESS language reference manual
ISO 10303-21:1995 Product data representation and exchange - Clear text encoding of the exchange structure
