
 
 

Introducing the SGML Technology at the Publishing Houses of Wolters Kluwer Hungary


 
Henk Ursinus
IT Manager
Wolters Kluwer Hungary Kft.
Prielle Kornelia u. 19-35
Budapest, Hungary, H-1117
Phone: 00 36 20 575 011
Email: hju@wkcee.hu
 
Biographical notice:
 
Henk Ursinus
 
After two years of system development in a PC-AS400 environment in a local government office in the Netherlands, Henk Ursinus became manager of the Hungarian branch of a Dutch software company engaged in EP development for large databases. Since November 1996 he has been IT manager of the Hungarian branch of Wolters Kluwer. One of his tasks is to advise on EP, and he has co-ordinated the installation of an SGML document management system. Today Henk works within Invenció Kft., the support company for Wolters Kluwer Hungary.
 
ABSTRACT:
 
Wolters Kluwer Hungary, part of the Wolters Kluwer International Group, includes a Hungarian Scientific Publishing House, a Textbook Publishing House, a book wholesaler and a service company giving IT support. Since 1 December 1997, the company has been using an SGML document management system to manage and prepare the production of a literature encyclopedia and three large bilingual dictionaries, creating both electronic and paper products. This presentation provides a snapshot of the progress to date.
 
 

The start

 
In the mid-1990s Wolters Kluwer entered the Hungarian market with the purchase of existing publishing houses. In keeping with Wolters Kluwer tradition, these were a Legal folio publisher, a Legal electronic publisher, an Academic publisher and a Textbook publisher. Based on the experience gained in the management of other companies, it was decided at a very early stage that the Hungarian group should establish a fully automated Information Management System. With the exception of the legal electronic publisher Invenció, the publishing houses had minimal know-how with respect to automation. For this reason, publishing tasks previously performed by Invenció were transferred to the legal folio publisher, and Invenció became the general facility company for the whole Wolters Kluwer group.
 
The Academic publisher had learnt a hard lesson: it had been completely at the mercy of a small local typesetting and electronic publishing company, which kept hold of all existing digital information. As a result, WK Hungary chose to be independent of hardware and software manufacturers and of final product developers. This meant that an open-ended database structure had to be built up, and SGML was the obvious choice.
 
 

The first steps

 
The first obstacle we had to overcome was a fear of automation among staff used to a traditional style of publishing. Printed manuscripts were corrected on paper, sent back to the author, returned to the publisher, and finally sent on paper to the typesetter, who typed the information in again and then made films. The files were then deleted so that the disks could be reused. The use of computers in editorial work was minimal.
 

In spite of its long history, we did not find SGML to be a generally adopted format, and one and a half years ago complete information systems were not really available. Nevertheless, we got in contact with the German company STEP (Sturz Electronic Publishing GmbH), which had longer experience in building systems based on SGML and in creating complicated DTDs for several lexicons. Fortunately, STEP had a daughter company in Hungary, which proved to be a huge advantage in terms of language. We asked them to do some basic research within our companies to learn which products would be suitable for SGML use. STEP also gave a presentation about SGML, and the idea of automation was slowly accepted among the personnel.
 

In the middle of last year we decided to start a pilot project in SGML. The Literature Lexicon of the Academic Publisher needed to be renewed. This is a typical product for an SGML database: it involves several authors, it requires regular updating, and it had also been decided to publish it electronically. We found an editor who was willing and able to renew the lexicon. After training in SGML, he and STEP built a DTD. STEP converted the original lexicon content from MS Word to SGML, and the editor then started to update the database. During the conversion process, it became clear that much of the information in the original text, which had been prepared for book printing, was not usable for EP. The text contained several different abbreviations for the same object, dates with different purposes, redundant references, etc. While this was acceptable in folio publication, it was useless in terms of EP. Many manual corrections had to be made before the data was structured enough for electronic search. The editor became so enthusiastic about SGML that we asked him to give a presentation to all personnel concerned with product development. This event brought us even closer to the general acceptance of SGML.
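The clean-up of variant abbreviations described above can be pictured with a small sketch. It is not the procedure used in the project, which relied largely on manual correction; it only illustrates the idea of mapping every variant spelling to one canonical form so that electronic search finds all occurrences. The variant table, the sample abbreviations and the function name `normalise_abbreviations` are assumptions made for illustration.

```python
import re

# Hypothetical variant table (an assumption, not taken from the lexicon):
# each key is a spelling found in the legacy text, each value the single
# canonical form chosen for the database.
CANONICAL = {
    "szerk.": "szerk.",
    "Szerk.": "szerk.",
    "szerk": "szerk.",
    "ford.": "ford.",
    "ford": "ford.",
}


def normalise_abbreviations(text, table=CANONICAL):
    """Replace every known variant with its canonical abbreviation."""
    # Try longer variants first so that "szerk." wins over "szerk".
    variants = sorted(table, key=len, reverse=True)
    alternatives = "|".join(re.escape(v) for v in variants)
    # The lookarounds keep us from rewriting the middle of a longer word.
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: table[match.group(1)], text)


if __name__ == "__main__":
    print(normalise_abbreviations("Szerk. Kiss; ford Nagy; szerk Tóth"))
```

A table of this kind only covers variants that have already been identified; anything else still needs an editor's eye.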
 

In the meantime, STEP launched its standard information management system, SigmaLink, and after comparing its features with other existing products we opted to install this system. SigmaLink integrates several existing products such as editors, viewers, browsers, database managers, etc. The main reasons behind the choice were the open structure of the system, STEP's long experience with SGML from its publishing background, and the availability of Hungarian support. We decided on a UNIX-based version. Although this was an absolutely new technology for us, we preferred it to the unstable Windows NT environment. As the Facility Company, Invenció was given the task of system management and supervision.
 
 

The use

 
After SigmaLink was installed, the lexicon was imported. The database is ready for EP development at any time, and final updates can be made right up until the production of the CD.
 

The first large project we have started in SigmaLink is the renewal of the three big bilingual dictionaries. We first had to digitise the dictionaries: the last versions were produced more than 25 years ago and existed only on film. Not only were the dictionaries being developed with a totally new technology, but the time limit was also extremely short. Collection of the new content on paper started in the middle of last year. A 16-user SigmaLink configuration was installed at the beginning of last December, and a totally new department was created for this system. Before any original content was imported into the system, the freshly recruited editorial staff and our own employees were trained in SGML technology and SigmaLink use by personnel from STEP. A planning desk was created to divide workstation time between the various editors. Many editors decided to work at home. For this purpose, they were given the use of a computer with the WordPerfect word processor, because this had the only SGML editor available at a reasonable price.
 
The original dictionaries were digitised in India according to typing instructions that STEP designed as a pre-DTD. To convert the digitised material to SGML, a converter was developed in the C programming language. During the conversion we came up against the first problem: the content of the dictionaries differed from language to language and was not consistent within a dictionary. A lot of manual work therefore had to be done, which caused the first delay.
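To make the conversion step more concrete, here is a minimal sketch of the general idea: typed lines that follow a fixed convention are turned into tagged entries, and lines that break the convention are set aside for manual work. It is written in Python for brevity and is not the C converter used in the project. The line format (`headword | part of speech | translations separated by ;`) and the element names `entry`, `hw`, `pos` and `trans` are assumptions; the actual typing instructions and DTD are not described here.

```python
from xml.sax.saxutils import escape  # escapes &, < and > in text content


def convert_line(line):
    """Convert one typed line to a tagged entry, or return None if malformed.

    Assumed (hypothetical) typing convention:
        headword | part of speech | translation; translation; ...
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 3 or not all(fields):
        return None
    headword, pos, translations = fields
    senses = "".join(
        "<trans>%s</trans>" % escape(t.strip())
        for t in translations.split(";")
        if t.strip()
    )
    return "<entry><hw>%s</hw><pos>%s</pos>%s</entry>" % (
        escape(headword),
        escape(pos),
        senses,
    )


def convert(lines):
    """Return (converted entries, rejected lines with their line numbers)."""
    converted, rejected = [], []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines silently
        entry = convert_line(line)
        if entry is None:
            rejected.append((number, line))  # left for manual correction
        else:
            converted.append(entry)
    return converted, rejected


if __name__ == "__main__":
    entries, problems = convert(["apple | n | alma", "bad line", "run|v|fut; szalad"])
    print(entries)
    print(problems)
```

Collecting the rejected lines with their line numbers, rather than stopping at the first error, reflects the situation described above: inconsistent source material is expected, and the aim is to hand the editors a list of what needs fixing.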

After the converted material was imported into SigmaLink, the editors were able to start their part of the project. Home-based workers were given exported files to correct, while others worked on site. As the content was received into the database, it became clear that the editors were not working in exactly the same way. A regular help desk had to be set up to offer permanent assistance.
 
Another problem was caused by the system itself. SigmaLink is a very new system, and because we were not experienced either, the system did not always work at its most efficient. Furthermore, it had not yet been used for dictionary input, with the particular demands that this entails. Every headword appears in the database as one element. In SigmaLink, however, we handled each headword as a file in order to be able to filter different queries. Checking files in and out greatly slowed the system, and we had problems with the ordering of the files as well. The editors were not able to find the words they had typed in or imported, and we were not able to print word lists in alphabetical order for correction. STEP nevertheless gave us full and dedicated support in solving these problems.
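The paper does not say why the ordering failed, so the following is only an illustration of one thing an alphabetical word list has to get right: a plain code-point sort puts capitalised and accented headwords in the wrong place. The sketch sorts by a key that ignores case and accents. The function names are hypothetical, and the key is a deliberate simplification: it is not full Hungarian dictionary order, which treats letters such as ö and ü and digraphs such as cs or sz as separate letters.

```python
import unicodedata


def collation_key(headword):
    """Simplified sort key: ignore case and accents, then break ties
    with the original spelling so that the order is deterministic."""
    decomposed = unicodedata.normalize("NFD", headword)
    base_letters = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return (base_letters.casefold(), headword)


def word_list(headwords):
    """Return the headwords in simplified alphabetical order."""
    return sorted(headwords, key=collation_key)


if __name__ == "__main__":
    sample = ["zebra", "ábra", "Alma", "alma"]
    print(sorted(sample))     # code-point order: accented word sorts last
    print(word_list(sample))  # accent- and case-insensitive order
```

For production use, a locale-aware collation library would be a better basis than a hand-written key.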
 

For desktop publishing, we decided to purchase 3B2, which supports SGML. At present our DTP staff are training themselves in this software.
 
 

Conclusion

 
This is not yet the end of this particular project. At the time of writing, STEP is installing the newest version of SigmaLink. They have assured me that most problems will be solved. At the time of the SGML conference the material for the English-Hungarian dictionary will be at the printer. Apart from some test listings we do not yet have many results on which to judge the effectiveness of SGML. One thing is sure, however. Without this system, we would not be able to work with 90 editors in one database with uniform results. At the conference, I will be happy to show you some concrete results.
