Common Business Library (CBL) |
| Terry Allen |
| Commerce One
Email: tallen@sonic.net |
Biographical notice: |
Terry Allen is a specialist in technical standards that support complex electronic publishing applications, including information discovery and retrieval, metadata, and internationalization. He is a codesigner of the DocBook DTD, the SGML application most commonly used for computer documentation (and good for other things, too!). He has participated in IETF and W3C working and special interest groups on HTML, URLs, URNs, MIMESGML, WEBDAV, XML, and the OCLC Metadata group. He designed and edited the first Web portal site (Global Network Navigator's Whole Internet Catalogue). Since 1997 he has been working on document design and architecture for electronic commerce systems. He is currently chairman of the OASIS Registry and Repository Technical Committee. For further details see http://www.sonic.net/~tallen/ |
ABSTRACT: |
Copyright 1999 by Commerce One, Inc. |
Copyright notice is not to be removed! |
Why CBL? |
Design Goals |
Starting Points |
My first step was to scour the Web for e-commerce standards and specifications. There is no lack of them, but a year ago August there was very little in XML or in a format lending itself to XML-ification. I found a number of obvious standards to serve as the basis of an e-commerce specification, such as ISO 8601 for date and time and ISO 4217 for currency codes. I also found the BSR (Basic Semantic Repository), which contains a partial union set of semantics from X12 and EDIFACT. From the BSR I extracted primitives for such things as addresses. (You can find the BSR online at http://www.iso.ch/BSR/.) |
I then examined specifications for sets of e-commerce documents, such as OBI (Open Buying on the Internet), and in concert with my colleagues at Veo, devised a set of document types that would support both the construction of online trading communities and the scenarios of such specifications as OBI. I aligned relevant document types with semantic primitives defined by RosettaNet (for catalog content) and the IOTP (Internet Open Trading Protocol) (for payment). Then I worked through as many business semantics models as I could to see if my document types were sufficiently robust to do real work, and tested them by constructing sample documents to support various specifications such as OBI. |
CBL Architecture |
From the standpoint of DTD design, CBL's DTD-syntax representation is traditional: it has an information pool composed of modules, some of which rely upon each other, and a set of modules that define the contents of document types; the document types themselves are meant to be containers that can be discarded when building larger document types. |
As part of CBL I |
|
<market.participant.info.pointer> <urn.reference urn.string="urn:x-commerceone:identity:henry.morgan"> </urn.reference> </market.participant.info.pointer> |
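The pointer element above carries its target as a URN in the urn.string attribute. As a minimal sketch of how a consumer might read such a pointer, the following pulls the URN out and splits it into its namespace identifier and namespace-specific string. The function names are illustrative assumptions; nothing here describes how CBL software actually resolved these pointers.

```python
import xml.etree.ElementTree as ET

# The sample pointer from the text, with the empty element written in short form.
POINTER = """
<market.participant.info.pointer>
  <urn.reference urn.string="urn:x-commerceone:identity:henry.morgan"/>
</market.participant.info.pointer>
"""


def urn_strings(xml_text):
    """Return the urn.string attribute of every urn.reference element."""
    root = ET.fromstring(xml_text)
    return [ref.get("urn.string") for ref in root.iter("urn.reference")]


def split_urn(urn):
    """Split a URN into (namespace identifier, namespace-specific string)."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    return nid, nss
```

Calling split_urn on the sample value separates the x-commerceone namespace from the identity string that a resolver (hypothetical here) would look up.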
|
E-commerce requirements not dealt with in CBL or dealt with only minimally include: |
|
Semantic Domains |
XML markup, in association with its documentation, gives meaning to a document's content. (Markup is often falsely called “self-describing metadata,” but of course the description is in the markup documentation, not the XML document.) For e-commerce, this meaning covers three basic domains: |
Common to all these domains are at least some from among the general datatypes describing such things as time, space, number, and physical properties. Aside from that commonality, these domains are largely disjunct, but software that processes e-commerce information must deal with them jointly: |
In CBL I have dealt with these domains in different degrees: |
|
Transition to SOX |
About the time I began to have things worked out, Murray Maloney, Alex Milowski, and Matthew Fuchs began to develop what became SOX, and I spent many hours converting my CBL DTDs into SOX schemas, initially in a mechanistic fashion, later taking into account SOX's mechanisms of inheritance and extension. By version 1.2 of CBL I was writing SOX schemas first and thinking in SOX; the DTDs became secondary products. This development led to considerable discussion within the company about how to use SOX effectively, and to my education in the alternate universe of object-oriented programmers. But it is from SOX source (more properly, source conforming to a rearranged subset of SOX) that our programmers have been working for the past year. |
Going forward, we intend to use the XML schema language that the W3C will specify. |
CBL Lessons Learned |
The Obvious |
We can't hand-craft everything. It will be necessary to generate huge wads of SOX from existing specifications. Optimization probably must be done by algorithm or not at all, rather than in Terry's wetware. |
Naming is contentious. As you all know. Our Java programmers complained unendingly about my naming syntax (lower case with periods as separators). I have since discovered a multitude of naming styles in existing e-commerce specifications that are similarly unlike Java conventions. I'm afraid programmers will have to live with syntaxes they don't like; models of multiple names associated with multiple contexts, such as ISO 11179, may be useful for relieving this tension. For example, in IEEE P1489, the Standard for Data Dictionaries for Intelligent Transportation Systems, one finds a name of the form CONSTRUCTION.ROAD_TargetCompletion_date. In an ISO 11179 schema one might represent this as the name of the data element and add |
<synonymous.name context="Java">constructionRoadTargetCompletionDate </synonymous.name> |
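Synonymous names of this kind need not be written by hand. The sketch below derives a Java-style name and a CBL-style (lower case, period-separated) name from the same source name by splitting it into words first. The tokenizing rules are an assumption made for illustration; neither ISO 11179 nor CBL prescribes them.

```python
import re


def name_words(name):
    """Split a name on '.', '_', '-', whitespace, and case boundaries."""
    words = []
    for part in re.split(r"[._\-\s]+", name):
        # Runs of capitals (acronyms), capitalized or lower-case words, digits.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [w.lower() for w in words]


def to_java(name):
    """Render a name in lowerCamelCase, the usual Java field convention."""
    words = name_words(name)
    if not words:
        return ""
    return words[0] + "".join(w.capitalize() for w in words[1:])


def to_cbl(name):
    """Render a name in lower case with periods as separators."""
    return ".".join(name_words(name))
```

Applied to the IEEE P1489 name quoted above, to_java yields the constructionRoadTargetCompletionDate form shown in the synonymous.name element.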
I did not try to avoid qualified names for element types and attributes, but I did try to avoid the flattened names of EDI. For example, where CBL has an address element, EDI has compounds such as Goods.DeliveryLocation.Address, which collapse containing context with the names of elements that ought to be reusable. These collapsed constructs are not unlike queries, but they have no place in an XML schema. |
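One way to see the difference is to treat a flattened name as a path and expand it into containment, so that the context lives in the ancestors and the last step is the reusable element. This is only a sketch of that reading; the helper names are hypothetical and the element names are taken directly from the flattened example rather than from CBL.

```python
import xml.etree.ElementTree as ET


def unflatten(flat_name):
    """Expand 'A.B.C' into nested elements <A><B><C/></B></A>."""
    steps = flat_name.split(".")
    root = ET.Element(steps[0])
    node = root
    for step in steps[1:]:
        node = ET.SubElement(node, step)
    return root


def reusable_name(flat_name):
    """The final step is the element that ought to be reusable on its own."""
    return flat_name.rsplit(".", 1)[-1]
```

The same Address element could then appear under any other container without a new compound name being minted for each context.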
Heritage |
EDI transaction sets are useful. At least some of them, anyway. They represent information sets people actually want to exchange, and can be transformed into document types by deleting unneeded information (such as trailer fields) and separating out the semantics of individual documents from those describing batches of documents. RosettaNet is developing information interchange models that include stubs for what could be document types (RosettaNet is currently filling in these stubs with EDI transaction sets), and document types developed by the Open Applications Group look like a rationalized version of EDI's. There is little point in reinventing these information sets, although new ones are required for describing markets, their workings, and their participants. |
Certain EDI basic semantics are useful, but BSR is not the solution because of the collapsed context problem I mentioned earlier. X12 and EDIFACT badly need reengineering and rationalization, so as to sort out reusable primitives (Address) from contextual semantics (CustomerAddress). And above all their archaic syntax has to be junked. |
OO XML |
Multiple inheritance is essential. Corky (who exists only for the purpose of this example) is a swine, a sow, a mother, a Sus scrofa domesticus, a Gloucester Old Spot, a pet, a thing licensed by Sonoma County, a patient of Dr. Clive N. Huff, the object of an insurance policy, and an input into the manufacturing process that will produce this fall's supply of pork. There is no way to arrange the matrix of Corky's attributes in a single tree. The best we can do is construct a SOX representation of a swine schema that uses inheritance along the axis most useful to our immediate needs, and represent information along the other axes by XML containment and pointers to taxonomies (taxonomies are very important). |
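The workaround described for Corky can be sketched outside SOX. The Python below is an analogy, not SOX syntax: one axis (swine is an animal) uses inheritance, while the licensing axis is handled by containment and the remaining axes by pointers into taxonomies. All class names, field names, and taxonomy identifiers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class TaxonomyPointer:
    """Reference to a term in an external taxonomy (identifiers are made up)."""
    taxonomy: str
    term: str


@dataclass
class License:
    issuer: str


@dataclass
class Animal:
    name: str


@dataclass
class Swine(Animal):
    """The single inheritance axis chosen here: a Swine is an Animal."""
    breed: str = ""
    # Other axes are carried by containment and pointers, not by subclassing.
    license: Optional[License] = None
    classifications: List[TaxonomyPointer] = field(default_factory=list)

    def is_classified_as(self, term: str) -> bool:
        return any(p.term == term for p in self.classifications)


corky = Swine(
    name="Corky",
    breed="Gloucester Old Spot",
    license=License(issuer="Sonoma County"),
    classifications=[
        TaxonomyPointer("urn:x-example:taxonomy:household-role", "pet"),
        TaxonomyPointer("urn:x-example:taxonomy:manufacturing", "input"),
    ],
)
```

Adding another axis, such as the veterinary patient record, means adding another contained structure or pointer rather than another superclass.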
SOX-based e-commerce schemas require constructs not present in EDI. For example, in CBL 1.2 I have both a simple.line.item and an unpriced.line.item (for those cases such as a request for bid, in which the price is unknown and should not appear). For SOX I've created quite a few prototypes (38 at last count) to support inheritance, which EDI knows nothing of. |
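The role of a prototype can be illustrated with ordinary class inheritance. In this sketch a shared prototype holds what both line items have in common, and only the priced variant adds a price, so an unpriced item cannot carry one. The field names, and the assumption that simple.line.item is the priced variant, are guesses for illustration and are not taken from the CBL 1.2 schemas.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItemPrototype:
    """Content shared by every line item (hypothetical fields)."""
    part_number: str
    quantity: int


@dataclass
class UnpricedLineItem(LineItemPrototype):
    """For a request for bid: the price is unknown and has no slot at all."""


@dataclass
class SimpleLineItem(LineItemPrototype):
    """Assumed here to be the priced variant."""
    unit_price: Decimal

    def extended_price(self) -> Decimal:
        return self.unit_price * self.quantity
```

Because the price is absent from the prototype rather than optional in it, a document that must not show a price is structurally unable to.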
A lot can be done with datatypes. Most enumerations can be reduced to datatypes. The ability to specialize datatypes and provide constraints on acceptable values is very powerful. |
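As a rough illustration of specializing a datatype instead of enumerating values, the sketch below derives a constrained string type and then narrows it to the lexical form of a currency code. The class names are assumptions, and the check covers only the three-letter shape, not whether the code is actually assigned in ISO 4217.

```python
import re


class ConstrainedString(str):
    """A string datatype whose values must match a pattern."""
    pattern = r"(?s).*"  # the base type accepts any string

    def __new__(cls, value):
        if not re.fullmatch(cls.pattern, value):
            raise ValueError(f"{value!r} is not a valid {cls.__name__}")
        return super().__new__(cls, value)


class CurrencyCode(ConstrainedString):
    """Specialization: exactly three upper-case letters."""
    pattern = r"[A-Z]{3}"
```

CurrencyCode("USD") behaves like an ordinary string, while CurrencyCode("dollars") raises ValueError at construction time.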
The Not Obvious |
You can be too abstract. At one time I modelled a general transaction description document type. It helped me a lot in realizing what a transaction is (the exchange of value between entities, perhaps many such exchanges) but it was too abstract to be useful in the real world. So I broke it down into purchase order, invoice, request for bid, response to request for bid, which is the level of abstraction found in EDI and the level that people find comfortable. |
Semantic mapping is essential. To use semantics already defined elsewhere, it is necessary to be able to point to them, a facility added to SOX in its latest revision. It is worth noting that ISO 11179, which deals with the specification of data elements and the organization and operation of a data element registry, has facilities for doing semantic mapping (and has considerable intellectual overlap with SOX). Whether this mapping should be done within the XML schema or from outside, in an independent document, depends, I think, on whether one is trying to reuse well known definitions of semantics in a new schema or construct mappings among existing schemas (which may be read-only). |
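The "from outside" option can be pictured as a mapping table held apart from both schemas, which matters when neither schema can be edited. The sketch below assumes a simple one-to-one equivalence between qualified names; the schema identifier "x-other" and its element names are invented for the example.

```python
from typing import Dict, Optional, Tuple

QualifiedName = Tuple[str, str]  # (schema identifier, element name)

# Hypothetical mapping entries, maintained independently of either schema.
EQUIVALENTS: Dict[QualifiedName, QualifiedName] = {
    ("cbl", "address"): ("x-other", "PostalAddress"),
    ("cbl", "unpriced.line.item"): ("x-other", "RfqLine"),
}


def map_name(source: QualifiedName,
             table: Dict[QualifiedName, QualifiedName] = EQUIVALENTS
             ) -> Optional[QualifiedName]:
    """Look up the equivalent name in the other schema, if one is recorded."""
    return table.get(source)


def reverse(table: Dict[QualifiedName, QualifiedName]
            ) -> Dict[QualifiedName, QualifiedName]:
    """Build the inverse table so the mapping can be used in both directions."""
    return {target: source for source, target in table.items()}
```

A real mapping would also need to record relationships weaker than equivalence, which this sketch leaves out.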
Business logic must be expressed. To construct efficient schemas for business documents it is necessary to know how the information they encode is to be processed, or at least what sets of information will be processed together. I've considered it out of scope for CBL, but I'm beginning to wonder if I'm right about that. |
Process logic must be expressed. I may send you a purchase order for you to fulfill it, for you to notarize it, or for you to archive it. I can't express that intent within the document, which I may want to digitally sign and use unchanged for all three purposes. I found I couldn't reasonably express processes in CBL; something along the lines of UML (Unified Modeling Language) is needed and I now just point to a hypothetical process description. I'm pretty sure this is out of scope for CBL, but it's needed for a complete e-commerce system. |
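The constraint described here, one signed document reused unchanged under different intents, suggests keeping the intent in a wrapper around the document. The sketch below is one possible arrangement, not something CBL defines: the Envelope type and intent strings are assumptions, and a SHA-256 digest stands in for a real digital signature.

```python
import hashlib
from dataclasses import dataclass

# Placeholder document bytes; the element name is illustrative only.
PURCHASE_ORDER = b"<purchase.order>...</purchase.order>"


@dataclass(frozen=True)
class Envelope:
    """Carries the sender's intent beside the document, never inside it."""
    intent: str      # e.g. "fulfill", "notarize", "archive"
    document: bytes  # the signed document, byte-for-byte unchanged

    def document_digest(self) -> str:
        # Stand-in for signature verification: depends on the document only.
        return hashlib.sha256(self.document).hexdigest()


envelopes = [Envelope(intent, PURCHASE_ORDER)
             for intent in ("fulfill", "notarize", "archive")]
```

All three envelopes report the same digest, so a signature computed over the document alone would remain valid whichever intent accompanies it.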
Both registries and repositories are essential. ISO 11179 defines a registry, which can be seen as an interface to a repository. There is some variability in what these entities are called by different people, but the distinction is between metadata (the information held in the registry) and the data (the DTDs or XML schemas themselves). For XML to succeed on the Web we need a means of serving DTDs on demand. To enable sane development of XML schemas, we need to enable reuse of XML schema source. In both cases a repository is required: for XML on the Web, it is what an XML client would access (perhaps directly, without going through a registry) to resolve a DOCTYPE declaration. For an XML development environment, a registry is essential to permit an overview of what exists already and can be reused, and the repository behind the registry is essential for managing authoritative source (I managed the registry part of this problem in wetware during CBL development, but even inside my head that approach doesn't scale). |
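The division of labor can be sketched in a few lines: the repository stores schema source, the registry stores descriptions that point at it, and a client may go through the registry or straight to the repository. This is a toy model under assumed names and identifiers; it does not follow the ISO 11179 metamodel or any OASIS interface.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RegistryEntry:
    """Metadata about a schema; the source itself lives in the repository."""
    identifier: str
    description: str
    location: str  # key under which the repository holds the source


class Repository:
    """Holds authoritative DTD or schema source."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}

    def store(self, location: str, source: str) -> None:
        self._sources[location] = source

    def fetch(self, location: str) -> str:
        return self._sources[location]


class Registry:
    """An interface to the repository: search metadata, then resolve."""

    def __init__(self, repository: Repository) -> None:
        self._repository = repository
        self._entries: Dict[str, RegistryEntry] = {}

    def register(self, entry: RegistryEntry, source: str) -> None:
        self._repository.store(entry.location, source)
        self._entries[entry.identifier] = entry

    def search(self, keyword: str) -> List[RegistryEntry]:
        needle = keyword.lower()
        return [e for e in self._entries.values()
                if needle in e.description.lower()]

    def resolve(self, identifier: str) -> str:
        return self._repository.fetch(self._entries[identifier].location)
```

A developer looking for something to reuse would call search, whereas an XML client resolving a DOCTYPE declaration could call Repository.fetch directly with a known location.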
Beyond CBL |
CBL was an interesting exercise as a prototype, and allowed me to develop solutions to many XML architecture problems. But from a practical point of view, its semantics are insufficiently integrated with those already used (in various and confusing ways) in commerce. |
We intend to respecify CBL in the ECO Framework Project forum (described at http://www.commerce.net/projects/currentprojects/eco/wg/), embracing as much of the semantics of EDI as possible. To do this we must determine what pieces of EDI semantics are actually used and used consistently, we must create uniform XML representations of EDI's various code sets (and the code sets it relies on, such as the list of all the world's airport codes), and we must rebuild EDI's innards. As I remarked earlier, EDI's document types are useful, so they are a starting point for top-down design. And the atomic data elements are useful, so they are a starting point for bottom-up design. In the middle there's the problem of what comes in the middle. Some of EDI's “segments” are probably useful; larger structures common across document types need to be defined. And, as I discovered in the course of conversion to SOX, one needs prototypes to provide a common basis for structures that share some contents. A respecified CBL will be much richer in both prototypes and large structures than EDI, representing XML's facilities for reusing information. |