
 
 

Regulations Worldwide Online at the Siemens Public Communication Networks Group


SGML editorial system for providing company-internal regulations on the intranet
 
Christian   Märtin
  Industrial Engineer Manager
  Siemens AG, Public Communication Networks
Organization, Information, Logistic Hofmannstr. 51
 D-81359 Munich   Germany
Phone: +49 89 722-48197
Fax: +49 89 722-34851
Email: christian.maertin@oen.siemens.de
 
Biographical notice:
 
Christian Märtin
 

Christian Märtin studied mechanical engineering and industrial engineering. Until 1972 he was System Engineer and Project Manager in development projects at Motoren- und Triebwerks-Union (MTU), Munich. Since 1972 Mr. Märtin has been in charge of projects in the information systems area at Siemens AG, Munich. In collaboration with the Siemens Local Companies he set up a number of documentation centers in Latin America, and since 1985 he has been doing foundation work in the area of structured information processing. Mr. Märtin is head of the Documentation and Information Systems department in the Public Communication Networks Group.
 
Jürgen   Krüger
  Business Consultant
  Siemens Nixdorf Business Services
Structured Document Processing Carl-Wery-Str. 22
 D-81739 Munich   Germany
Phone: +49 89 9221-3251
Fax: +49 89 9221-3290
Email: juergen.krueger@mch.sni.de
 
Biographical notice:
 
Jürgen Krüger
 
Jürgen Krüger is a graduate in mathematics. From 1981 to 1992 he was Project Manager and Main Designer with responsibility for developing WYSIWYG (What You See Is What You Get) editors at Berthold AG, Berlin. In 1992 Mr. Krüger moved to Siemens AG as a Technical Assessor. Today Mr. Krüger is a Business Consultant responsible for Document Management Systems and archives in the area of Structured Document Processing.
 

Franz   Hack
  Business Manager
  Siemens Nixdorf Business Services
Structured Document Processing Carl-Wery-Str. 22
 D-81739 Munich   Germany
Phone: +49 89 9221-3239
Fax: +49 89 9221-3290
Email: franz.hack@mch.sni.de
 
Biographical notice:
 
Dr. Franz Hack
Siemens Nixdorf Business Services
 

Dr. Franz Hack is a mathematics graduate. After a number of years spent in research and teaching at the Universities of Duisburg and Bochum, he has been working since 1984 in the area of Information Engineering within Siemens AG, Munich. Dr. Hack is in charge of the business field SGML/XML at Siemens Nixdorf Business Services, which deals with consultancy, development and training in this area.
 
ABSTRACT:
 
Adherence to standard interfaces, the use of standard products and automated sequences determine the SGML editorial techniques used in the Siemens Public Communication Networks Group. This approach ensures a large measure of product independence and a high level of scalability in technology and methodology. The editorial outlay for implementing additional features in the electronic documents is fully compensated for by the availability of support tools, automated processes for media-specific data editing and uniformity in document management. As a result, both productivity in document creation and the quality of the documents are increased. Last but not least, standardization safeguards simple entry into future online technologies.
 
This article describes the task definition, the methodical concept and the technical implementation in the project RWO (Regulations Worldwide Online).
 
 

Task definition

 
Global task definition

 
 
A large number of decentralized editorial offices (>20) create or revise around 30,000 pages per year in around 3,000 documents for what are known as the “Official Announcements”. This typically includes documents such as circulars, guidelines, work instructions and organization plans. After the documents are released they must be made available to all employees in the Public Communication Networks Group. This involves around 20,000 people all over the world.
 
In the past the need to publish these documents locally led to unsatisfactory results, despite administrative requirements covering structure and appearance: a non-uniform corporate identity, conflicting content, no linkage between items of information, restricted options for further processing and a lack of granularity in the documents, particularly for identifying modified information.
 
 

Premises underlying the solution

 
At the end of 1995 the Organization, Information and Logistics area of the Public Communication Networks Group created a standardization concept for uniform provision of “Official Announcements”. The following premises had to be taken into account:
  1. Decentralized and platform-independent processing options
  2. Use of standard interfaces and formats, such as those of DIN (Deutsches Institut für Normung) and ISO (International Organization for Standardization)
  3. Use of standard products
  4. Large measure of independence from the tools used
  5. Ensuring a uniform structure and layout
  6. Media-neutral provision of information (one source of information for a number of publishing media, e.g. paper and online)
  7. Uniform document administration
  8. Retention of tried-and-tested editorial procedures
 
 
 

Methodical concept

 

Principle of media-neutral document provision

 
 

Restriction to a simple basic alphabet as defined by the ISO and the use of SGML (Standard Generalized Markup Language) allow platform independence, decentralized processing, uniformity in content and structure, and algorithmic translation of the data into media-specific versions. To minimize the change in working methods in the editorial offices, in a first stage of implementation the documents are created as before in print format layout. These documents are administered in a uniform document administration using their SGML document instance sets. Editor-specific export of these SGML document instance sets for further processing in the editorial offices ensures that the original layout is restored. Workflows support the editorial offices in generating media-specific versions.
 
 

Technical implementation

 
Overall technical system (product-oriented representation of the core system)

 
 

The basis for the technical implementation is a DTD (Document Type Definition) oriented to the requirements of the application. Adherence to this document structure is the only requirement placed on the editorial offices. The choice of a suitable SGML editor is at the discretion of the individual editorial offices. The SGML document instance sets are stored uniformly in the Astoria object-oriented database from Chrystal Software. The printed edition is produced using FrameMaker+SGML from Adobe. A bijective mapping between the DTD and the FrameMaker+SGML structures, which are described in an EDD (Element Definition Document) and supplied with suitable layouts, guarantees that the laid-out data can be reproduced from the SGML document instance sets. The working environment under FrameMaker+SGML is supported centrally by a technology group and provides the editorial offices with a convenient SGML editorial system.
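
The role of the bijective mapping can be illustrated with a minimal Python sketch. The element names below are hypothetical, since the article shows neither the project DTD nor the EDD; the sketch only demonstrates why a one-to-one mapping allows a loss-free round trip between the two structure descriptions.

```python
# Hypothetical element names; the real DTD and EDD are not shown in the article.
DTD_TO_FRAME = {
    "circular": "Circular",
    "title": "Title",
    "para": "Paragraph",
    "xref": "CrossReference",
}


def invert_bijection(mapping):
    """Return the inverse mapping, or raise ValueError if two keys share a value."""
    inverse = {}
    for source, target in mapping.items():
        if target in inverse:
            raise ValueError(
                f"{source!r} and {inverse[target]!r} both map to {target!r}"
            )
        inverse[target] = source
    return inverse


FRAME_TO_DTD = invert_bijection(DTD_TO_FRAME)


def round_trip(element_names):
    """Map DTD element names to FrameMaker names and back again."""
    frame_names = [DTD_TO_FRAME[name] for name in element_names]
    return [FRAME_TO_DTD[name] for name in frame_names]
```

If two DTD elements were mapped to the same FrameMaker element, the inverse would be ambiguous and the SGML export could no longer restore the original structure; the check in `invert_bijection` rejects such a mapping.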
 

The interconnection between FrameMaker+SGML and the Astoria database is made with FrameBridge from Adobe and via a DocGroupBridge provided using the standard program interface. This gives editors a uniform interface for their data administration as well.
 
 

FrameMaker+SGML working environment

 
The FrameMaker+SGML working environment includes the SGML application for SGML export and SGML import provided by the standard functionality, as well as an API (Application Program Interface) extension that provides convenient support for document processing. This extension, ProdClient, delivers the following functionality:
  1. Central and local document storage (Private/Public directories)
  2. Menu-driven input of organization data on template pages
  3. Provision of document-dependent templates
  4. Automatic import of structured indexes
  5. Book-in-book functionality
  6. Generation of online versions by importing suitable layouts
  7. Generation of complete versions in PDF  (Portable Document Format)
 
 
Organizationally ProdClient is based on three levels of processing:
  1. Document, consisting of files
  2. Document group, consisting of files and documents
  3. Edition, consisting of files and document groups
 
 
This organizational regulation allows logical division and administration of editions in the editorial environment. In addition it offers automated navigation for online provision of the documents.
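
The three processing levels can be pictured as a small containment hierarchy. The following Python sketch is an illustration only: the class and field names are assumptions and do not reflect ProdClient's internal data model. It shows how navigation paths for the online version could be derived from the hierarchy alone.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    name: str
    files: list = field(default_factory=list)


@dataclass
class DocumentGroup:
    name: str
    files: list = field(default_factory=list)
    documents: list = field(default_factory=list)


@dataclass
class Edition:
    name: str
    files: list = field(default_factory=list)
    groups: list = field(default_factory=list)


def navigation_paths(edition):
    """Return one (edition, group, document) name triple per document."""
    return [
        (edition.name, group.name, document.name)
        for group in edition.groups
        for document in group.documents
    ]
```

Each triple gives the upward links a document needs (to its document group and to its edition), which is the kind of information an automated navigation can be built from.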
 
Subsequent sections of this document describe the system-supported editorial processes of ProdClient :
  1. Creation of documents and print output
  2. Generation of the online versions in PDF format
 
 
 

Creating the documents and printed version

 
Typical scenario in the FrameMaker+SGML working environment

 
 

Corporate publishing necessitates security mechanisms in document processing. ProdClient delivers these through Private/Public processing:
 
In the corporate network all document directories (first processing level) are available in a Public directory as FrameMaker+SGML documents. The editors borrow document directories for processing in their Private directory. Borrowed directories are locked for other users. Only the editor processing it is authorized to write this document directory back. The documents are processed using FrameMaker+SGML as structured documents in the WYSIWYG mode of the paper version.
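
The borrowing rule described above amounts to an exclusive lock per document directory. A minimal Python sketch, assuming an in-memory lock table (the class and method names are hypothetical; the article does not describe how ProdClient stores its locks):

```python
class PublicDirectory:
    """Tracks which editor, if any, has borrowed each document directory."""

    def __init__(self, document_directories):
        # None means the directory is available in the Public directory.
        self._borrowed_by = {name: None for name in document_directories}

    def borrow(self, name, editor):
        """Lock a document directory for one editor's Private directory."""
        holder = self._borrowed_by[name]
        if holder is not None:
            raise PermissionError(f"{name} is already borrowed by {holder}")
        self._borrowed_by[name] = editor

    def write_back(self, name, editor):
        """Release the lock; only the borrowing editor may write back."""
        if self._borrowed_by[name] != editor:
            raise PermissionError(f"{editor} has not borrowed {name}")
        self._borrowed_by[name] = None

    def holder(self, name):
        return self._borrowed_by[name]
```

A second editor calling `borrow` on a locked directory, or anyone other than the borrower calling `write_back`, receives a `PermissionError`.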
 
 

Online output in PDF format

 
Online output in PDF format

 
 
PDF provides a fast, cost effective and uncomplicated way of providing documents in the network. All document links created in FrameMaker+SGML, such as cross-references, hypertext links and active indexes are retained when the PDF viewer Acrobat Reader from Adobe is used. Plug-Ins allow the Acrobat Reader to be activated from generally-used Web browsers.
 
The explicit use of layouts brought about by the document structures allows the display to be adapted automatically to particular screens (unlike the print format layout). Depending on the document type, ProdClient controls the import of suitable templates, scales tables to the paragraph layout width and supplies link navigation buttons using the interrelation between document, document group and edition.
 
Provision of PDF documents on UNIX-based Web servers requires the hypertext links to be checked because of the case sensitivity of the file names. The PDF Link Tool corrects the notation of the links and file names and checks that the target documents exist. An error-free run ensures that the edition is valid as regards hypertext links.
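
The kind of check performed by the PDF Link Tool can be sketched in a few lines of Python. This is not the tool's actual implementation: the function name is hypothetical, links are reduced to plain file names, and the sketch assumes that no two files in an edition differ only in case.

```python
def check_links(link_targets, existing_files):
    """Match link targets to real file names, ignoring case.

    Returns (corrected, missing): a dict mapping each resolvable link target
    to the exact file name on disk, and a list of targets with no file.
    """
    by_lower = {name.lower(): name for name in existing_files}
    corrected = {}
    missing = []
    for target in link_targets:
        actual = by_lower.get(target.lower())
        if actual is None:
            missing.append(target)
        else:
            corrected[target] = actual
    return corrected, missing
```

An edition would count as valid with regard to hypertext links when `missing` is empty; the `corrected` dictionary tells the caller which link notations must be rewritten for a case-sensitive file system.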
 
PDF editions are produced from the Public directory.
 
 

Astoria SGML database

 
The Astoria object-oriented database administers the SGML document instance sets created using FrameMaker+SGML. All structure constructs are retained as objects. This natural granularity of the documents allows functions such as the localization of modified information units. The standard functionality of Astoria includes:
  • Version checking
  • Reproduction of time based validity statuses
  • Re-use mechanisms
  • Search algorithms

FrameBridge provides a standard method of interconnecting FrameMaker+SGML and Astoria. Structure constructs, e.g. individual elements, can be checked out and processed.

DocGroupBridge supports the editor’s specific method of working through check-in/check-out of documents, document groups and editions between the Public directory and the database. In addition, this API automates add-on procedures:
  • Support of editorial processes (Private, Public)
  • Link validation
  • Version administration by establishing defined, valid editions
  • Generation of output in HTML  (Hypertext Markup Language)
  • Configuration of new documents
     

    DocGroupBridge

     
    DocGroupBridge regulates the interchange between documents, document groups and editions in FrameMaker+SGML format from the Public directory and the corresponding SGML document instance sets in the Astoria database. To this end the processing levels document, document group and edition are stored in a meta structure in the Astoria database. DocGroupBridge places the SGML document instance sets on the corresponding meta structure nodes. The editor navigates under the FrameMaker+SGML user interface using the familiar document structure.
     
    In addition DocGroupBridge offers version administration of SGML editions and creates HTML data in an automated workflow from versioned SGML editions.
     
     

    Exchange of documents, document groups and editions

     
    SGML export and import

     
     
    With SGML export from FrameMaker+SGML, a selected directory of a processing level (document, document group or edition) in the Public directory is exported into a temporary SGML directory. During this export process FrameMaker+SGML transforms the FrameMaker+SGML documents into SGML document instance sets. Existing graphics are converted according to the rules of the SGML application and provided as separate files. DocGroupBridge logs any export problems that occur and displays information about them. If the export proceeds without problems, DocGroupBridge imports the SGML document instance sets into the Astoria database and then deletes both the temporary SGML directory and the FrameMaker+SGML documents in the Public directory.
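
The export sequence can be summarized as an all-or-nothing workflow. The Python sketch below is a simplified illustration, not DocGroupBridge itself: `convert` and `store` are hypothetical stand-ins for the FrameMaker+SGML export and the Astoria import, and removing the temporary directory after a failed export is an assumption the article does not confirm.

```python
import shutil
import tempfile
from pathlib import Path


def export_level(public_dir, convert, store):
    """Export one processing-level directory; return the logged problems.

    convert(source_file, sgml_dir) -> list of problem strings
        (stand-in for the FrameMaker+SGML export of one file)
    store(sgml_dir) -> None
        (stand-in for the import into the database)
    """
    public_dir = Path(public_dir)
    sgml_dir = Path(tempfile.mkdtemp(prefix="sgml_export_"))
    try:
        problems = []
        for source_file in sorted(public_dir.iterdir()):
            problems.extend(convert(source_file, sgml_dir))
        if problems:
            # Keep the source documents so the editor can correct them.
            return problems
        store(sgml_dir)
        # Delete the source documents only after a clean import.
        shutil.rmtree(public_dir)
        return []
    finally:
        # The temporary SGML directory is removed in every case (assumption).
        shutil.rmtree(sgml_dir, ignore_errors=True)
```

The ordering matters: the source documents in the Public directory are deleted only after the database import has succeeded, so a failed export never loses data.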
     
    With SGML import the editor creates a document, a document group or an edition from the Astoria meta structure. DocGroupBridge checks out all associated SGML document instance sets from the database into a temporary SGML directory and then imports these into the Public directory. During the import process the SGML document instance sets are converted into FrameMaker+SGML documents. Graphics are integrated into the FrameMaker+SGML document according to the rules of the SGML application. All actions are logged and displayed.
     
     

    Version administration of SGML editions and generation of HTML outputs

     

    Versioning and HTML generation

     
     
    With the release of an edition the editor initiates the process of version administration for editions. This involves first checking that all destinations for the hypertext links exist. If destinations are not available the invalid links are listed in a log file and the process is ended. The edition is not stored as a new version until its integrity has been verified.
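
The release rule, under which a new version is stored only once every link destination exists, can be expressed compactly. In this Python sketch an edition is reduced to a dictionary from document name to its link targets; the function name and data layout are assumptions made for illustration.

```python
def release_edition(documents, versions):
    """Store the edition as a new version only if every link target exists.

    documents: dict mapping document name -> list of linked document names
    versions:  list of previously stored editions (modified in place)
    Returns the list of invalid links as (source, target) pairs.
    """
    invalid = [
        (source, target)
        for source, targets in documents.items()
        for target in targets
        if target not in documents
    ]
    if invalid:
        # Corresponds to writing the log file and ending the process.
        return invalid
    versions.append(dict(documents))
    return []
```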
     
    Versioned SGML editions can be transformed into HTML outputs. This is done by creating a copy in an SGML buffer. Balise, a product from AIS Berger-Levrault, then uses predefined, DTD-specific scripts to transform all the SGML document instance sets into HTML.
     
    In addition the database compares the documents of the last two versions with regard to changes in the date of output. If the dates differ, the document is entered in a news list. This is done by exporting these files into a temporary directory and transferring them to Balise. Balise extracts the change information from the SGML document instance sets and creates an HTML news list from it.
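
The selection of documents for the news list can be sketched as a comparison of two versions. This Python fragment is an illustration under stated assumptions: each version is reduced to a dictionary from document name to its date of output, and documents that are new in the current version are also treated as news, which the article does not state explicitly.

```python
def news_candidates(previous, current):
    """Return the names of documents whose date of output has changed.

    previous, current: dicts mapping document name -> date of output
    (any comparable value, e.g. an ISO date string).
    """
    return sorted(
        name
        for name, output_date in current.items()
        if previous.get(name) != output_date
    )
```

In the workflow described above, the documents selected this way would then be exported and handed to Balise, which builds the HTML news list.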
     
     

    Mapping principles for HTML generation

     
    A special Balise script has been developed for the types of document under consideration. The script is called up via DocGroupBridge.
     
    The result of the transformation is an information system of HTML documents. The transformation takes place without any information being lost. The “structural poverty” of HTML is compensated for by using frame sets: Different HTML files on one subject are visualized in parallel. This interpretation of HTML data makes it necessary to generate and to integrate logical units from independent HTML files in each case.
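
How several independent HTML files are combined into one logical unit can be shown with a small generator for a frame set page. The sketch is written in Python rather than in a Balise script, and the file and frame names are invented for the example; it only illustrates the principle of showing related files side by side.

```python
from html import escape


def frameset_page(title, frames):
    """Build an HTML frame set that shows several files in parallel columns.

    frames: list of (frame_name, html_file) pairs, one column per pair.
    """
    columns = ",".join("*" for _ in frames)
    lines = [
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        f'<frameset cols="{columns}">',
    ]
    for name, src in frames:
        lines.append(f'  <frame name="{escape(name)}" src="{escape(src)}">')
    lines += ["</frameset>", "</html>"]
    return "\n".join(lines)
```

A call such as `frameset_page("Circular 12", [("toc", "c12_toc.html"), ("body", "c12_body.html")])` yields one page that displays a contents file and a body file next to each other, with named frames that links in either file can target.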

     
    Figure: Visualization of the information units in frame sets and representation of the interactions between the information units.

     
     
     

    Project sequence

     
    Working packages and milestones

     
