Cost-Effective EDI Using XML? |
|
Philippe Vijghen |
| Project Manager |
| SGML Technologies Group ACSE sa/nv 29 Boulevard General Wahis B-1030 Brussels Belgium Phone: +32 2 705 70 21 Fax: +32 2 705 81 01 Email: phv@sgmltech.com Web: http://www.sgmltech.com |
Biographical notice: |
Philippe Vijghen |
He obtained a degree, specializing in electromechanical engineering, at the Free University of Brussels (ULB). He may be contacted at phv@sgmltech.com. |
ABSTRACT: |
This paper proposes the use of a pivot format when developing EDI applications, based on the experience of three operational projects. |
The role of SGML/XML as a pivot is presented in a broader context, with regard to other relevant candidates for structuring data. |
Introduction |
|
More and more companies, together with their partners, are moving towards EDI (Electronic Data Interchange). Traditional EDI, however, has the reputation of being inflexible and expensive. |
The first section gives an overview of the various syntaxes that have been defined in the past for exchanging data. |
A pivot-oriented approach, and its benefits, are explained in the following section. |
In the final section, three EDI projects are presented, where XML has been used as a pivot. |
Various Syntaxes for Exchanging Structured Data |
|
A brief look at these syntaxes |
|
Although they are all aimed at defining a structure below the level of file granularity, it is clear that all of these different specifications and syntaxes target different needs in terms of application environments. |
If the purely technical aspects of the various syntaxes are examined, however, it must be concluded that SGML/XML allows for more complex modelling. |
|
This very straightforward comparison takes into account only the syntax and the modelling flexibility, of course. It is of little relevance when one considers that each syntax targets a different application area, but it is invaluable when choosing the best candidate for representing pivot models. |
Indeed, this paper addresses the need for consistent use of a pivot format when developing EDI applications. For a pivot format, using the most flexible and scalable syntax is fundamental; the choice is purely technical and is independent of the external representation format that users may require, depending on the application field. |
A Pivot-Oriented Approach |
|
When developing EDI applications, one of the key tasks consists in implementing filters for processing the messages to be exchanged. Such filters aim either at converting the messages from one representation to another, or at using the information contained in the message in databases or other external applications. |
Our experience has shown that a pivot-oriented approach to developing such filters is extremely cost-effective. The approach consists of using a single internal representation of the information for the implementation of all the filters applied to EDI messages. |
Note that the word ‘pivot’ here has a different meaning from that in the traditional EDI terminology, where it often designates more restrictively the representation used for loading and exporting messages to and from databases. |
The cost effectiveness of such an approach is justified by the following facts: |
|
We, the SGML Technologies Group, have chosen to use SGML/XML and an integrated development environment based on this technology as the cornerstone for many projects based on a pivot-oriented approach, including those in the field of EDI. |
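The pivot-oriented approach described in this section can be sketched in a few lines. The example below is a minimal illustration, not the implementation used in the projects: the two toy formats (`csv`, `kv`), the field names, and all function names are assumptions made for this sketch. It shows that each external format needs only one reader and one writer against the single internal representation, so that N formats require 2N filters rather than the N×(N−1) needed for direct pairwise conversion.

```python
from typing import Callable, Dict

# The single internal (pivot) representation. A plain dict stands in here for
# what would be an SGML/XML document in a real system.
Pivot = Dict[str, str]


def read_csv_line(text: str) -> Pivot:
    """Hypothetical format 'csv': 'sender,amount'."""
    sender, amount = text.split(",")
    return {"sender": sender, "amount": amount}


def read_key_value(text: str) -> Pivot:
    """Hypothetical format 'kv': 'key=value;key=value'."""
    return dict(pair.split("=", 1) for pair in text.split(";"))


def write_csv_line(doc: Pivot) -> str:
    return f"{doc['sender']},{doc['amount']}"


def write_key_value(doc: Pivot) -> str:
    return ";".join(f"{key}={value}" for key, value in sorted(doc.items()))


# One reader and one writer per external format.
READERS: Dict[str, Callable[[str], Pivot]] = {
    "csv": read_csv_line,
    "kv": read_key_value,
}
WRITERS: Dict[str, Callable[[Pivot], str]] = {
    "csv": write_csv_line,
    "kv": write_key_value,
}


def convert(text: str, source: str, target: str) -> str:
    """Convert between any two registered formats by going through the pivot."""
    return WRITERS[target](READERS[source](text))
```

Adding a third format to this sketch means registering one new reader and one new writer; the existing filters are not touched.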
XML-based EDI applications |
|
Experience gained during the development of various EDI applications includes: |
|
G-EDI |
|
The G-EDI project, aiming at processing the telecommunication bills of a major banking institution in Belgium, initiated the development of a generic EDIFACT parser. The key point is that the parser was based on the notion of a pivot format. In practice, the implementation is based on SGML technology. |
Although the SGML tags and syntax were of no direct help in this implementation, the generic coding mechanisms that are part of SGML helped to keep the application independent of the actual syntax of the message. Indeed, XML offers all the possibilities that are required for modelling the information contained in EDIFACT, as it allows for the encoding of documents with regard to arbitrarily complex tree models and, if hyperlinks are considered, even graph models. |
Although the actual EDIFACT syntax of the messages transmitted by the telecommunications company has changed four times since the system was put into production, only the mapping to the generic underlying model had to be reviewed. Owing to this approach, it has been possible to reduce the application maintenance costs by a factor of five. |
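To make the idea of mapping EDIFACT onto a generic tree model concrete, here is a minimal sketch. It is not the G-EDI parser: it assumes the default EDIFACT service characters (`'` segment terminator, `+` data element separator, `:` component separator, `?` release character), ignores the UNA header, and uses hypothetical element names (`message`, `segment`, `element`, `component`) for the pivot tree.

```python
import xml.etree.ElementTree as ET


def split_keep_escapes(text: str, sep: str, release: str = "?") -> list:
    """Split on sep, leaving release-character escape pairs untouched."""
    parts, current, i = [], "", 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            current += text[i:i + 2]  # copy the escape pair as-is
            i += 2
        elif text[i] == sep:
            parts.append(current)
            current = ""
            i += 1
        else:
            current += text[i]
            i += 1
    parts.append(current)
    return parts


def unescape(text: str, release: str = "?") -> str:
    """Drop each release character and keep the character it protects."""
    out, i = "", 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out += text[i + 1]
            i += 2
        else:
            out += text[i]
            i += 1
    return out


def edifact_to_pivot(message: str) -> ET.Element:
    """Map a simplified EDIFACT message onto a generic XML tree."""
    root = ET.Element("message")
    for raw in split_keep_escapes(message, "'"):
        raw = raw.strip("\r\n")
        if not raw:
            continue
        elements = split_keep_escapes(raw, "+")
        segment = ET.SubElement(root, "segment", {"name": unescape(elements[0])})
        for element in elements[1:]:
            node = ET.SubElement(segment, "element")
            for component in split_keep_escapes(element, ":"):
                ET.SubElement(node, "component").text = unescape(component)
    return root


# Invented sample message, for illustration only.
SAMPLE = "UNH+1+INVOIC:D:96A:UN'BGM+380+INV001'FTX+AAI+++Total due?: 100?+VAT'"

if __name__ == "__main__":
    print(ET.tostring(edifact_to_pivot(SAMPLE), encoding="unicode"))
```

In a design of this kind, downstream filters work only on the tree, so a change in the incoming message layout is absorbed by the mapping step rather than by the application code.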
CLASET |
|
The goal of the CLASET (Classification Information Set Message) project, developed in the context of the European Programme for the Interchange of Data between Administrations (IDA), is rather ambitious. Take the example of the definition of ‘secondary school’ in the various European countries. The reader will understand that there is no consistency at all. But the European Institutions still want to produce accurate statistics on such matters, across the internal borders. In order to achieve this, complex nomenclatures for statistics must be defined. CLASET includes the definition of messages for exchanging such nomenclatures. |
The CLASET message allows for the exchange of any hierarchical structure, such as nomenclatures or classifications, and has conceptually been defined as a result of a Merise model. |
Different representations are used, each based on a distinct syntax: |
The EDIFACT representation is the most official representation of the information transported by CLASET messages, because of the standardization process. |
The SGML representation is recommended to people who are dealing with highly structured text, because the SGML representation offers more flexibility than EDIFACT for operations such as the attachment of footnotes or presentation styles to words that are part of the free text. Such structures can easily be encoded using an SGML mixed content group model, which has no equivalent in the EDIFACT layering of messages, groups, segments, and data. |
The HTML representation is read-only: it enables people to view the messages using a tool such as a Web browser (used as a local viewer application in this case). |
The actual implementation of the CLASET project is based on SGML, used as an internal pivot. This approach has led to application code that is independent of the actual details of the representation syntaxes and has been a major argument for the cost-effectiveness of the project. The benefit of such an approach is not due to the SGML syntax itself; rather, the tools associated with SGML offer the required flexibility and the relevant processing features. |
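The mixed content mentioned above for the SGML representation can be illustrated with a small XML fragment in which a footnote and an emphasis element are attached to words inside free text. The element names (`label`, `fn`, `em`) and the sample text are assumptions made for this sketch and are not taken from the CLASET definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical classification label: character data interleaved with
# inline elements (mixed content).
LABEL = (
    "<label>Secondary school<fn>Definition varies by country.</fn>"
    " (general <em>and</em> vocational)</label>"
)


def plain_text(elem: ET.Element, skip: tuple = ("fn",)) -> str:
    """Concatenate the character data, leaving out the content of skipped elements."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag not in skip:
            parts.append(plain_text(child, skip))
        parts.append(child.tail or "")  # text that follows the inline element
    return "".join(parts)


def footnotes(elem: ET.Element) -> list:
    """Collect the text of every footnote attached to the label."""
    return [fn.text for fn in elem.iter("fn")]
```

A flat field-based representation would have to carry the footnote in a separate field and record its attachment point by some other means; here the position of the inline element itself records it.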
EDIDOC |
|
The EDIDOC (Electronic Data Interchange for Documents) project covers the design, implementation, and deployment of a flexible framework for document-oriented EDI at the European Space Agency (ESA). |
The system is in charge of document exchanges occurring in several distinct applications of the agency: |
At the heart of the EDIDOC system, a central server acts as a clearing house, giving a potential legal value to the documents exchanged by logging them into a robust relational database. |
This server integrates, in a very generic and flexible way, the key concepts needed for electronic document exchanges: |
At each of these levels the server makes sure that the documents are delivered in accordance with the preferences of the recipients: in the right format, with the right security package, and the right communication protocols. It really plays the role of a gateway. |
The EDIDOC generic envelopes have been defined in XML. They include the details of the exchange: originator, list of recipients, unique reference, subject, time stamps, document types and formats, security mechanisms, delivery options, groupware context, remote management options, error messages, and so on. |
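As an illustration of what such an envelope might look like, the sketch below builds a small XML envelope carrying a few of the details listed above. The element and attribute names, the function name, and the sample values are all hypothetical; the actual EDIDOC envelope definition is not reproduced in this paper.

```python
import xml.etree.ElementTree as ET


def build_envelope(reference: str, originator: str, recipients: list,
                   subject: str, timestamp: str, doc_format: str) -> ET.Element:
    """Build a hypothetical exchange envelope (names invented for this sketch)."""
    envelope = ET.Element("envelope", {"reference": reference})
    ET.SubElement(envelope, "originator").text = originator
    recipient_list = ET.SubElement(envelope, "recipients")
    for address in recipients:
        ET.SubElement(recipient_list, "recipient").text = address
    ET.SubElement(envelope, "subject").text = subject
    ET.SubElement(envelope, "timestamp").text = timestamp
    ET.SubElement(envelope, "document", {"format": doc_format})
    return envelope


if __name__ == "__main__":
    example = build_envelope(
        reference="REF-0001",
        originator="sender@example.org",
        recipients=["first@example.org", "second@example.org"],
        subject="Test exchange",
        timestamp="1998-05-18T10:00:00Z",
        doc_format="SGML",
    )
    print(ET.tostring(example, encoding="unicode"))
```

Keeping the exchange details in an envelope separate from the document itself lets a server route, log, and convert a document without having to interpret its content.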
The filters that are plugged into the EDIDOC ‘document standards’ components are based on the notion of a pivot format for conversions. Although the use of SGML is not enforced, it is the best candidate for defining custom pivot formats for structured documents. Indeed, SGML includes a very consistent and generic way to model the information. Moreover, some of the SGML features, such as OMITTAG, SHORTREF, LINK, and CONCUR, can be used for the actual implementation of the converters themselves. |
EDIDOC has demonstrated how important and cost-effective it can be to have a system that uses a pivot format at the heart of an implementation, even when the format is not exposed to users or external systems. In the context of EDIDOC, this ‘pivot’ paradigm was applied not only to the messages themselves but also to all the surrounding services (communications, security, and workflow), owing to an object-oriented approach. This has provided for reusability, scalability, and customization. |
Conclusions |
|
This paper demonstrates that SGML/XML, when used as a common pivot format, is invaluable for the development and integration of EDI applications. This was illustrated by operational experience gained in the G-EDI, CLASET, and EDIDOC projects. |