| Business benefits of an SGML and STEP integration | Table of contents | Indexes | The Role of Industry Standard DTDs | |||
| Biezunski Michel |
A Topic Map for SGML 97 Proceedings, A new SGML animal |
Abstract: |
| This paper explains what a Topic Map is and describes how we built one for the current CD-ROM. |
Introduction |
| It is possible to retrieve information within individual printed documents because a number of devices have been invented, such as tables of contents, indexes, glossaries, and cross-references. Catalogs, thesauri, and bibliographies are the tools used for browsing among collections of documents. Topic Maps are the standardized electronic solution ... and it is based on SGML. |
Topic Maps: a new SGML animal |
| Topic Maps are a standard representation of navigational information, intended for interchanging devices such as indexes, thesauri, and glossaries over sets of heterogeneous documents (structured or unstructured). A Topic Map can be thought of as the equivalent of a neutral database schema that should allow its users to preserve the value they have added to their information repositories through semantic navigation. |
| Typical users of Topic Maps include SGML users who need to maintain links across living documents while avoiding the overhead of maintaining huge amounts of data as systems evolve. Note that Topic Maps can also be used if the source documents are not in SGML. |
| The Topic Map architecture rests on the possibility, standardized by HyTime, of separating the semantic information of a link from the addresses of its (possibly multiple) anchors. The architecture that has been designed will be updated to take into account the new standard formalisms being defined for links. An XML representation is planned as well. |
| The Topic Navigation Maps Project is carried out under the auspices of ISO WG8, the group responsible for SGML and related standards (Convenor: James Mason). The co-editors of this project are Martin Bryan (UK) and Michel Biezunski (France). |
| This work continues a project that was initiated in 1992, first within the Davenport Group, where it was then known as SOFABED (Standard Open Formal Architecture for Browsable Electronic Documents), and since 1993 within CApH (Conventions for the Application of HyTime), an activity sponsored by the GCA and chaired by Steven R. Newcomb (TechnoTeacher, Inc., USA). |
An experimental Topic Map for the Proceedings of the SGML Europe 97 Conference |
| Conference Proceedings are a good candidate for demonstrating the value of Topic Maps. They are made up of a series of papers written by a number of different people. Readers may wish to get quick access to subjects of interest without having to go through the whole content. It is also possible to derive interesting navigational strategies from the very content of the DTD, such as a "geographic kind of navigation". By navigating the enclosed Topic Map, it is possible, for example, to find immediately where the companies to which the authors belong are located. |
| This Topic Map has been built to show some navigational possibilities that can be applied to a set of SGML documents by exploiting the existing DTD. The potential of Topic Maps is greater, as they aim to organize navigation within sets of information objects, some of which might not be structured, because Topic Maps are superimposed on a set of existing documents. Furthermore, multilingual navigation is now under study as part of the Topic Navigation Maps Standard, to enable navigation not only by choosing a given language, but also by expressing in different languages the constructs themselves, such as the SGML generic identifiers or attribute values used for creating and navigating Topic Maps. |
| For the current project, we have decided to create topics that are our best guess at the terms one may use to navigate through each paper. Furthermore, we have created a "network" of interconnected information linking people, companies, and geographic locations. |
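| The "network" of people, companies, and geographic locations can be pictured as a chain of lookups. The following is a minimal sketch, not the EnLIGHTeN implementation: the sample names, the two dictionaries, and the function `authors_by_location` are all hypothetical and serve only to illustrate the "geographic kind of navigation". |

```python
from collections import defaultdict

# Hypothetical sample data; these names are placeholders, not taken
# from the proceedings.
AFFILIATIONS = {"Author A": "Company X", "Author B": "Company Y"}
LOCATIONS = {"Company X": "Paris", "Company Y": "London"}


def authors_by_location(affiliations, locations):
    """Follow author -> company -> location and group authors by place."""
    grouped = defaultdict(list)
    for author, company in sorted(affiliations.items()):
        place = locations.get(company)
        if place is not None:  # skip companies with no known location
            grouped[place].append(author)
    return dict(grouped)


by_place = authors_by_location(AFFILIATIONS, LOCATIONS)
```

| Each dictionary stands for one kind of relation, so a question such as "which authors work in a given city" becomes a traversal of two relations rather than a search through the papers. |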
Topic Map Tool under construction |
| This Topic Map has been created using the EnLIGHTeN application, currently under development at High Text. The EnLIGHTeN project started in 1995 and was first designed as an illustration of what Topic Map navigation may look like. After a while it became clear that it was in fact becoming a tool that could be useful in a variety of situations, and we decided to focus on a user interface to a link database that now enables the creation and maintenance of Topic Maps without requiring any previous knowledge of SGML and HyTime. EnLIGHTeN shows that users can focus on the semantics of the information while leaving the tedious addressing tasks to machine processing. It is most useful in situations where there are a number of cross-references to maintain in an environment where documents are constantly evolving. |
| EnLIGHTeN is used today for modeling topic map applications. It has been designed to be modular and extensible, and will become part of a variety of existing applications or applications under development. |
How we have created this topic map |
| To create a Topic Map, it is necessary to identify topic types, topic titles, and the role that each topic plays at a given occurrence. Topics can then be related to other topics by means of various relations that can be created at will. |
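| The ingredients listed above (topic types, topic titles, occurrence roles, and relations) can be sketched as a small data model. This is an illustration under stated assumptions, not the Topic Map architecture itself: the class names, field names, address strings, and the relation name `is-basis-of` are all hypothetical. |

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    address: str  # where the topic appears (hypothetical address syntax)
    role: str     # role the topic plays at this occurrence


@dataclass
class Topic:
    topic_type: str
    title: str
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Relation:
    name: str
    source: str   # title of one topic
    target: str   # title of the related topic


# One topic with two occurrences, each with its own role.
sgml = Topic("subject", "SGML")
sgml.occurrences.append(Occurrence("paper-01#sec2", "definition"))
sgml.occurrences.append(Occurrence("paper-07#sec1", "mention"))

# A relation refers to topics only, never to document addresses.
related = Relation("is-basis-of", "SGML", "Topic Maps")
```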
| In technical terms, a Topic is defined as an SGML element conforming to the HyTime architectural form for links. It is itself an SGML architectural form (as defined in ISO/IEC 10744). |
| EnLIGHTeN is a Topic Map creation tool that does not require a Topic Map DTD to be present before one starts defining the topics and the relationships. On the contrary, the DTD is a document that is produced automatically along with the instance, by collecting the topic annotations created by the authors of the Topic Map. |
| The basis of creating a Topic Map is identifying topics throughout the documents. Once topics have been identified, they are automatically grouped together, and each instance is linked to the others. This approach ultimately fulfills the same task as cross-references, but instead of saying "see also somewhere else", we say: "here this is about a given topic". The software finds the addresses, not the user. If an address changes, the whole topic map is recalculated automatically. |
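| The grouping step can be sketched as follows. This is a minimal illustration, not EnLIGHTeN's code: the functions `build_topic_index` and `see_also`, the annotation tuples, and the address strings are hypothetical. |

```python
from collections import defaultdict


def build_topic_index(annotations):
    """Group (topic, address) annotations so each topic lists its occurrences."""
    index = defaultdict(list)
    for topic, address in annotations:
        index[topic].append(address)
    return dict(index)


def see_also(index, topic, current_address):
    """Other occurrences of the same topic, i.e. the implied cross-references."""
    return [a for a in index.get(topic, []) if a != current_address]


# Hypothetical annotations: (topic title, address of the occurrence).
annotations = [
    ("HyTime", "paper-03#p12"),
    ("DSSSL", "paper-05#p4"),
    ("HyTime", "paper-09#p2"),
]
index = build_topic_index(annotations)
links = see_also(index, "HyTime", "paper-03#p12")

# When an address changes, only the annotation is updated; the index
# (and therefore every cross-reference) is rebuilt from the annotations.
annotations[2] = ("HyTime", "paper-09#p7")
rebuilt = build_topic_index(annotations)
```

| The author only states "this passage is about HyTime"; the list of related passages is derived, so it never has to be edited by hand. |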
| The second task is to create a set of relations that are independent of the documents and that are applied to a given document set. For example, the fact that Spain is a country belonging to Europe is independent of the occurrences where it applies. Therefore, creating relationships between topics is like creating an independent knowledge base that is maintained and updated separately. The ability to reuse, without additional effort, the work a company has already done to add value to its information set is a major reason to commit to Topic Maps. |
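| The independence of relations from documents can be illustrated with a short sketch. Only the fact "Spain belongs to Europe" comes from the text; the relation name `belongs-to`, the function name, and both document indexes are assumptions made for the example. |

```python
# Document-independent knowledge base: (topic, relation, topic).
# "Spain belongs to Europe" comes from the text; the relation name
# "belongs-to" is an assumption.
RELATIONS = [("Spain", "belongs-to", "Europe")]


def occurrences_including_related(topic, relation, relations, index):
    """Occurrences of a topic plus those of topics related to it."""
    members = [s for (s, r, t) in relations if r == relation and t == topic]
    found = list(index.get(topic, []))
    for member in members:
        found.extend(index.get(member, []))
    return found


# Two hypothetical document sets reuse the same relations unchanged.
index_a = {"Spain": ["doc-a#p3"], "Europe": ["doc-a#p1"]}
index_b = {"Spain": ["doc-b#p8"]}

hits_a = occurrences_including_related("Europe", "belongs-to", RELATIONS, index_a)
hits_b = occurrences_including_related("Europe", "belongs-to", RELATIONS, index_b)
```

| The relation list is written once and applied to both document sets, which is the sense in which the knowledge base is maintained separately from the occurrences. |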
| Here are the steps we have followed: |
| Deriving information from the DTD |
| keyword, has been provided in the DTD for this purpose. |
| fname and surname, and the topic "author" used in the Topic Map results from the concatenation of the two elements "surname" and "fname". |
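| Deriving the "author" topic from the two elements can be sketched as below. The element names `fname` and `surname` come from the text; the wrapper element `author`, the well-formed sample fragment, the use of an XML parser instead of an SGML parser, and the single space used as separator are assumptions made for the example. |

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment. Only the element names fname and
# surname come from the text; the wrapper element is an assumption.
FRAGMENT = "<author><fname>Michel</fname><surname>Biezunski</surname></author>"


def author_topic(fragment):
    """Concatenate surname and fname (in that order) into one topic title."""
    element = ET.fromstring(fragment)
    surname = element.findtext("surname", default="").strip()
    fname = element.findtext("fname", default="").strip()
    return f"{surname} {fname}".strip()


topic_title = author_topic(FRAGMENT)
```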
Output Options |
| As EnLIGHTeN is an SGML application, it requires documents to be in SGML at some stage. If, as in this application, the source documents are provided in SGML, they must be fully compliant, i.e. parsable SGML. A planned extension of EnLIGHTeN will also work with XML documents. |
| HTML has been chosen as the output format for displaying documents in this application. EnLIGHTeN has built-in features for creating a set of HTML documents that group the information in the Topic Map: tables of contents, indexes, lists of relations, dictionaries, etc. |
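| Generating an HTML index page from a topic index can be sketched as follows. This is only an illustration of the idea, not the output EnLIGHTeN actually produces: the function `render_index`, the page layout, and the sample file name are hypothetical. |

```python
from html import escape


def render_index(index):
    """Render a topic index as an HTML list with a link per occurrence."""
    lines = ["<ul>"]
    for topic in sorted(index):
        links = ", ".join(
            f'<a href="{escape(url, quote=True)}">{n}</a>'
            for n, url in enumerate(index[topic], start=1)
        )
        # Topic titles are escaped so that characters such as & stay valid HTML.
        lines.append(f"  <li>{escape(topic)}: {links}</li>")
    lines.append("</ul>")
    return "\n".join(lines)


page = render_index({"SGML & XML": ["paper-01.html#p2"]})
```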
| The transformation of the source SGML documents to HTML is performed using James Clark's DSSSL engine (Jade). A DSSSL specification has been written to format the source SGML documents into HTML. The links necessary for navigating from the documents to the topic map screens and vice versa are added by a processing step performed by EnLIGHTeN. |
| Other output formats are available or under development. Any future changes to the HTML specification will be taken into account by changing the DSSSL spec that is used for outputting the documents. |
When several standards work together |
| EnLIGHTeN illustrates what can be done with the Topic Navigation Maps standard and shows how SGML, DSSSL, HyTime, HTML (and soon XML) can be made to work together, by showing the respective advantages of each of the standards, in the following way: |
Current opportunities |
| "ISO 13250 -- Topic Navigation Maps" is planned to be published as a standard at the end of 1998. This is an open initiative, under the auspices of ISO. Any contribution, addendum, or user requirement is welcome. The architecture will be extended to support multilingual information objects. |
| Specific Topic Maps have already started to be designed to enhance the maintainability and long-term evolution of document management. A methodology exists for the creation of Topic Maps. They have already been successfully tested for the energy industry, legal publishing, and the financial industry. New applications are planned. Other sectors where heavy document management is an issue are welcome: libraries, reference documents, and navigation in archives, among others. |
| Industrial-strength software based on Topic Maps will open a new generation of applications, in which databases, relational or object, with complex queries will be integrated with document creation software. In some implementations, SGML will become a resource for machine processing, while in others a fully developed user interface based on SGML tools will give users full control over their structured documents. |
| Prototype applications are worth building now, because they may provide new requirements for the generic standard architecture, as well as a list of user requirements for the applications to be built. If you are interested in this approach, this is the best moment to join. |