Stylesheet Driven SGML Transformation   Table of contents   Indexes   Microdocs, Birthrights, and Pottage Messes

 
 

Pragmatic SGML-solutions in a telecommunications organization


Versioning of translations and integration of several graphic formats in an SGML-system used for technical documentation
 
Frank   Peetoom
  Interim Manager
  GEA Interim Management bv
Stationsplein 28
Weesp   The Netherlands  1382 AD
Phone: 00 31 294 415 280
Fax: 00 31 294 415 071
Email: f.peetoom@gea.ml
 
Biographical notice:
 
Dr. Frank J Peetoom
 
Training
 
Doctoral Physics (specialization Informatics)
 
Propaedeutics Psychology
 
Experience
 
1978-1979 Software developer at Rijksuniversiteit Leiden.
 
1979-1980 University lecturer Physics.
 
1980-1982 Independent entrepreneur with educative robot as product.
 
1982-1984 Software developer of media planning package at market investigation bureau InterView.
 
1984-1988 Co-founder of market investigation bureau Motivaction (30 employees): project leader and technical manager.
 
1988-1989 Course manager at hardware/software supplier Publishing Technologies: responsible for courses in the field of DeskTop Publishing.
 
1989-1990 Electronic Publishing consultant at software company Informaat in the field of multimedia.
 
1991-1992 Organization adviser at GEA adviesgroep bv: specialisation Prepress, Electronic Publishing and printing offices.
 
1993-now Interim manager at GEA Interim management bv.
 
Employment experience
 
  • Management and change management at Prepress department of Stork (1993-1994)
  • Management of Prepress department HD 1874 (1995)
  • Manager department Prepress/Integration preparing departments - implementation of SGML system at printing/publishing company Koninklijke Vermande (1995-1996)
  • Change management introduction of SGML system at Philips PBC (1997)
 
Publications
 
Books in the field of automation and Desktop Publishing (10 titles to date).
 
Cees   van der Does
  Manager Training, Documentation and User Interface Design
  Philips Business Communication Systems
PO Box 32
Hilversum   The Netherlands  1200 JD
Phone: 00 31 35 689 1535
Fax: 00 31 35 689 1055
Email: c.vanderdoes@pbc.be.phileps.com
 
Biographical notice:
 
Cees van der Does
 
Manager Training, Documentation and User Interface Design at Philips Business Communications.
 
After studying Electronics, Cees van der Does started in 1984 as a hardware development engineer at Philips. After four years, he switched to become a business engineer. As a business engineer he gave advice on cost prices and on cash flow and gross margin analyses. In addition, he was strongly involved in process improvement projects and in change management advice.
 
For the last four years Cees van der Does has been working as a department manager. After setting up new ways of working, designing new processes and giving employees new and more suitable roles, the department no longer works only on technical training and (paper) service manuals. There is now also training for end-users, sales supporters and sales personnel. Product information is distributed on paper (printing on demand), CD-ROM, Intranet and on-line help. The department is now also responsible for designing and maintaining the PBC intranet sites and designing the user interfaces of PBC products.
 
ABSTRACT:
 
At the end of 1996, Philips Business Communications started a project to implement an SGML-system for technical documentation. GEA Interim Management participated in the project as consultant. Philips Business Communications has been using the SGML-based system since the end of 1997. With this system, technical manuals are generated for different media: folio, CD-ROM and the Philips Intranet. At present, 100,000 pages are stored in the system. These pages are edited by 22 authors/instructors.
 
When implementing the SGML-system, Philips faced two problems for which there appeared to be no ready-made solutions:
 
  • Integrating pictures in various formats, made with different drawing tools (CorelDraw! and AutoCAD), in the SGML-system.
  • Managing the different versions of the manuals in 12 languages.
 
This contribution describes the process through which Philips has arrived at the present implementation and the eventual design of the system. All decisions during this process have been made by finding a balance between the technical feasibility, the process practicability and the cost efficiency of the solution.
 
 

Survey of the SGML-system

 
The system was built in one year, based on Information Manager from Texcel. Information Manager roughly consists of two parts: a database with the SGML-material and a Workflow Management System.
 
Philips decided to use standard, mainstream software as much as possible. The authors work with ArborText and CorelDraw! Complex drawings are made in AutoCAD by the draughtsman, who also uses CorelDraw! for simple drawings.
 
On the output side, FrameMaker+SGML functions as a batch composer. The managers and the task co-ordinators of the department can run various reports to monitor the progress of projects.

 
Overview Philips System

 
 
The image material in the Philips manuals varies from complex exploded views made in AutoCAD to simple block diagrams made with CorelDraw! In addition, there is the material from existing manuals, which is scanned in as TIFF files. The manuals are published in 12 languages. This means that each picture exists in up to 12 different language versions.
 
 

Wishful thinking about graphics

 
When you embark on a large scale SGML project like this you have an image in your mind of how parts of the new system should work. You try to have an open mind. You would like authors who work with SGML-texts and in-text graphics to be able to work intuitively with graphics. Two major requirements are then:
 
  • A seamless integration of text and graphics. The authors must be able to view the graphics at full resolution in the text, and must be able to edit the graphics. Ideally, authors would be able to edit the graphics without the need to fire up a separate graphics package.
  • For technical documentation you would like to use a library function for graphics and the reusable elements inside graphics (for Philips: telephones, computers, switches). In the ideal case you would have the SGML-editor in one window and thumbnails of graphics (or elements of graphics) in another window. You could then drag and drop graphics into your SGML-document and the system would take care of all the underlying stuff with entities, metadata etc.
 
Let's look at how these ideals survive in the harsh world of SGML.
 
 

Seamless integration of text and graphics

 
When we confronted our shortlist of SGML-suppliers with the specifications for integration of text and graphics, they scratched their heads and mumbled: it can of course be done. Sound familiar? It can be done, but you can't buy it; you must let a supplier build it for you. And this despite the fact that a lot of organizations have the same wishes concerning graphics! It is an amazing world.
 

The graphics are handled by a number of custom scripts. I will not explain the exact working of all 27 scripts, with 200 KB of code in three different languages: RMS, TCL and CMD. A substantial part of these scripts deals with graphics and the versioning of translations. This illustrates that the functionality you want from an SGML system is not built in; you always have to add it. I hope this changes in the future.
 

These scripts rely heavily on metadata. For example, the format of the drawing (.CDR, .DWG, .EPS or .TIFF) is one metadata field. Another metadata field describes the status of the drawing: "final" or "draft".
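As an illustration of how a script might lean on these two metadata fields, here is a minimal sketch in Python. It is not the Philips code, which was written in RMS, TCL and CMD. The record type `GraphicMetadata`, the table `EDITOR_FOR_FORMAT` and the function `editor_for` are hypothetical names, and the mapping from format to editing application is an assumption based on the tools named in this paper.

```python
from dataclasses import dataclass
from typing import Optional

# Formats named in the text. Which application edits which format is an
# assumption for illustration; None means "not edited directly".
EDITOR_FOR_FORMAT = {
    "CDR": "CorelDraw!",
    "DWG": "AutoCAD",
    "EPS": None,   # assumed to be generated output only
    "TIFF": None,  # legacy scans cannot be edited
}

VALID_STATUSES = ("draft", "final")


@dataclass
class GraphicMetadata:
    """Hypothetical metadata record for one drawing."""
    name: str
    format: str
    status: str = "draft"

    def __post_init__(self) -> None:
        # Reject values outside the two fields' known vocabularies.
        if self.format not in EDITOR_FOR_FORMAT:
            raise ValueError(f"unknown format: {self.format}")
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status}")


def editor_for(meta: GraphicMetadata) -> Optional[str]:
    """Return the application a script would launch, or None."""
    return EDITOR_FOR_FORMAT[meta.format]
```

A script could call `editor_for` when the author double clicks a graphic tag and refuse to launch anything when the result is `None`.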
 
What emerges after a considerable effort is a complex interwoven set of scripts and metadata to do the things you want the system to do. You can do it all, though we had to skip various parts of the functional description, notably:
 
  • The drawings can't be seen in ArborText (not even in low resolution). You have to double click the corresponding tag (which lists the name of the corresponding file) to launch the graphics application (CorelDraw! or AutoCAD) and view the drawing. Legacy data (TIFF scans) are visible in ArborText.
  • It is not possible for the language versions of an image to share the same image data while pointing to different texts. So if there is a change in a drawing, we have to edit that same drawing for all languages concerned (12 maximum).
 
 

Library function for graphics

 
I can be very brief regarding this functionality. It is very complex to build a graphic library function, not to mention the possibility to drag and drop graphics. We dropped all of this functionality at the very start of the project. Of course, in CorelDraw! and AutoCAD, we keep libraries of images and icons.
 
 

A pragmatic implementation

 
Sometimes you win, sometimes you lose. With graphics you lose more than you win, but we believe we built a workable system within the limitations of the present state of SGML. I will not dig into the messy details, but will just give an overview of the different processes.
 
 

Legacy graphics (TIFF scans)

 
Legacy data (6000 drawings) were scanned in TIFF at 600 dpi (the resolution of the output via DocuTech) by a conversion service. All TIFF scans are stored in a separate part of the database (not surprisingly called legacy).
 
In the converted manuals, all necessary tagging to refer to these drawings was also done by the conversion service. The scanned graphics are visible in ArborText, but cannot be edited. To change a scanned graphic, a new CorelDraw! or AutoCAD drawing must be made. The author or task coordinator makes a request to the draughtsman. The draughtsman creates the required drawing in CorelDraw! or AutoCAD. The authors link the resulting drawing into their document and remove the link to the TIFF-scan.
 
When a manual is published, FrameMaker imports the necessary TIFF-scans. We had a minor problem because FrameMaker inserted a border of white space around the scan in the resulting anchored frame, and this caused parts of the scan to be invisible. We forced the white space to zero in the rules file of FrameMaker.
 
 

CorelDraw! graphics

 
Philips had first selected CorelFlow! for simple in-text graphics. But FrameMaker is not able to import CorelFlow! graphics. Furthermore, it proved to be very complicated to launch CorelFlow! from within ArborText. So we decided to use CorelDraw!, which is more expensive and more difficult for the users than CorelFlow!
 
The authors make CorelDraw! graphics, which are later checked by the draughtsman, following Philips guidelines. The creator of the drawing is coded in the filename for the drawing.
 
The lifecycle of a CorelDraw! graphic is as follows:
 
  • The author creates an empty drawing from within ArborText. He/she inserts a graphic tag and supplies a name for the graphic. The name contains the initials of the author and a sequence number for this author. Then the system launches CorelDraw! and the new drawing can be made.
  • When the drawing is saved, CorelDraw! is closed and the drawing is stored in a separate part of the database. The metadata fields are set: status = "draft", etc.
  • On a regular basis, the draughtsman runs a query over all CorelDraw! drawings to filter out the "draft" drawings. These drawings are checked according to Philips guidelines and, after approval, get the status "final".
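The lifecycle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `DrawingStore` stands in for the separate part of the database, and the exact name format (initials followed by a four-digit sequence number) is hypothetical, since the text says only that the name contains the initials and a sequence number.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Drawing:
    name: str
    status: str = "draft"  # every new drawing starts as a draft


class DrawingStore:
    """Hypothetical stand-in for the database section holding the drawings."""

    def __init__(self) -> None:
        self._drawings = {}
        self._last_seq = defaultdict(int)  # one counter per author

    def create(self, initials: str) -> Drawing:
        # Name = author initials + per-author sequence number (assumed format).
        self._last_seq[initials] += 1
        name = f"{initials}{self._last_seq[initials]:04d}"
        drawing = Drawing(name)
        self._drawings[name] = drawing
        return drawing

    def drafts(self) -> list:
        # The draughtsman's regular query: all drawings still in "draft".
        return [d for d in self._drawings.values() if d.status == "draft"]

    def approve(self, name: str) -> None:
        # After checking against the guidelines, the status becomes "final".
        self._drawings[name].status = "final"
```

For example, after `store.create("FP")` twice and `store.approve("FP0001")`, the `drafts()` query returns only the second drawing.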
 
 

AutoCAD graphics

 
Because the authors do not work with AutoCAD, the process for AutoCAD is rather straightforward. The draughtsman creates AutoCAD drawings, which are linked into the SGML-document by the authors. All AutoCAD drawings reside in a separate part of the database.
 

A complication with the image material was the use of FrameMaker+SGML on the output side. FrameMaker cannot directly process AutoCAD pictures, so besides the AutoCAD (.DWG) version of a picture there must always be an EPS version available, especially generated for FrameMaker.
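A simple consistency check follows from this requirement: before publishing, every .DWG file should have an .EPS companion. The Python sketch below assumes, hypothetically, that the two versions share the same base file name; the paper does not say how the real system pairs them. The function name `missing_eps` is also an assumption.

```python
from pathlib import PurePath


def missing_eps(filenames):
    """Return the .DWG files that have no .EPS file with the same base name.

    Pairing by base name is an assumption made for this sketch.
    """
    eps_stems = {
        PurePath(f).stem.lower()
        for f in filenames
        if PurePath(f).suffix.lower() == ".eps"
    }
    return sorted(
        f
        for f in filenames
        if PurePath(f).suffix.lower() == ".dwg"
        and PurePath(f).stem.lower() not in eps_stems
    )
```

Run over the list of drawings linked into a manual, a non-empty result would tell the task coordinator which EPS versions still have to be generated before FrameMaker can compose the manual.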
 
With all these file formats, a complex environment has emerged in which a scheme of metadata, scripts and FrameMaker files (EDD, rules) controls the processing of image material within the SGML system.
 
 

Integrated version control for translations

 
Everyone who has to deal with large amounts of documents knows that version control is hard to manage. Version control in one language is difficult, but if you have to deal with 12 languages it becomes quite complex. Controlling versions by hand requires a lot of discipline, a lot of administration and a lot of good faith in the people who try to control this process.
 
All SGML-based databases have more or less extensive facilities for version control. Controlling all the translated versions of an original set of data, however, is not a standard feature.
 
At Philips PBC we designed the following solution, in which version control is managed entirely by the SGML database management system. We chose the manual as the basic set of data to be controlled. Our original authoring versions are the English versions. The system records the version number in the metadata of each manual. After every change stored in the database, the version number of the English manual is raised by one.

 
Version Control Translations

 
 
To translate a manual for the first time, we export it from the database and send it in SGML format to the translator. The translator replaces the English text with text in the required language, in our example German. The SGML structure of the translated version is then compared with the structure of the original English version. The German version can now be stored in the database and gets the version number 1. It also gets a label stating on which original English version the translated manual was based.
 
Meanwhile the English version has been updated with all kinds of changed, added and removed texts.
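The structure comparison can be pictured as comparing the sequence of tags in both versions while ignoring the text between them. The Python sketch below is a simplification and not the check used at Philips: it assumes fully tagged SGML without tag minimization, uses a regular expression rather than a real SGML parser, and would be confused by a ">" inside an attribute value. The function names are hypothetical.

```python
import re

# Matches start and end tags; comments and declarations ("<!...") are skipped
# because the character after "<" (or "</") must be a letter.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)[^>]*>")


def tag_sequence(sgml: str) -> list:
    """Return the tags in document order, e.g. ['para', 'emph', '/emph', '/para']."""
    return [slash + name.lower() for slash, name in TAG.findall(sgml)]


def same_structure(original: str, translation: str) -> bool:
    """True if both documents contain the same tags in the same order."""
    return tag_sequence(original) == tag_sequence(translation)
```

A translation that drops or adds an element fails the check, while one that changes only the text between the tags passes.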
 
At a certain moment in time a decision is made to update the German version to the current level of the English version. The system searches in the meta data of the German manual to see what the original English version was. The system also searches for the most current English version and creates a "Diff-file". The Diff-file contains all the differences between the two English versions. We send this Diff-file to the translator. The translator brings all changes from the Diff-file into the original German version, creating an updated German version. The translator sends this updated German manual to us and the SGML structure check takes place. The system then raises the German version number by one and also updates the label to show on which English version the translation is based now.
 
For all subsequent updates, the way of working is the same. The system takes care of getting the right versions, based on the meta data declarations in the translated manual.
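The bookkeeping behind this procedure is small enough to sketch in Python. The sketch is an illustration only: the real logic lives in the workflows and scripts of the database system. The names `Translation`, `make_diff` and `store_update` are hypothetical, manuals are represented as plain strings, and the standard `difflib` module stands in for whatever tool produces the Diff-file.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Translation:
    language: str
    version: int   # version number of the translated manual
    based_on: int  # label: English version the translation is based on
    text: str


def make_diff(english_versions: dict, translation: Translation):
    """Diff the English version the translation is based on against the latest one.

    english_versions maps version number -> manual text (assumed layout).
    """
    latest = max(english_versions)
    old = english_versions[translation.based_on].splitlines()
    new = english_versions[latest].splitlines()
    diff = list(difflib.unified_diff(
        old, new,
        fromfile=f"en-v{translation.based_on}",
        tofile=f"en-v{latest}",
        lineterm="",
    ))
    return latest, diff


def store_update(translation: Translation, new_text: str, english_version: int) -> Translation:
    # Raise the translation's own version by one and update the label.
    return Translation(translation.language, translation.version + 1,
                       english_version, new_text)
```

Because the label travels with the translated manual, the same two functions serve every language and every later update: an empty diff simply means the translation is already at the current English level.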
 
In this way it is possible to start translations of every manual into every language at any required moment. Version control is totally managed by the system, so the personnel involved can focus on quality of the translation content.
 
 

Conclusion

 
Starting from the first functional specifications a system for processing the image material and versioning of translations has been designed. Implementing this, we've followed a pragmatic approach. The limitations imposed by both SGML and the selected system (Texcel Information Manager) were taken into account. In the final system, we rely heavily on a complex and vulnerable combination of metadata, scripts and workflow to reach the desired result, a workable system for the users.
 
In my opinion, a lot of things in SGML-systems that should be handled by programs are handled by scripts. The advantage is that you can build (almost) everything you want. The disadvantages are: performance (scripts are relatively slow), maintenance (the supplier must maintain the scripts made for just one customer), and expense (everybody is re-inventing the wheel).
 
For example, version control of translations is not a standard feature, but must be built into workflows with a lot of scripting. It takes a lot of effort to specify, design and test it. Close co-operation with your supplier is needed.
 
The same is true for the handling of graphics. Why do suppliers concentrate on multimedia and XML while simple graphics are still not handled well? SGML is not poor on graphics; it is the standard (mainstream) SGML software that is poor on graphics.
