Michel Biezunski - Invisible HyTime   Table of contents   Indexes   Peter J. Newcomb - The HyTime Property Set

Michel Biezunski and Catherine Hamon - A Topic Map of This Conference's Proceedings
 Michel Biezunski, High Text
 Catherine Hamon, High Text
 

A Topic Map of This Conference's Proceedings

 

Why does a Topic Map fit Conference Proceedings?

 The purpose of a Topic Map-based hyperdocument is to interconnect semantically heterogeneous information. Conference proceedings seemed to us a good example of the kind of hyperdocument that suits a Topic Map.
 
A Topic Map allows readers to navigate by following topics that can appear in multiple documents. Rather than being a simple term, a topic is a link that has a title and points to the places in the documents where the topic occurs. These places, otherwise called anchors, can be grouped according to the roles they play, and the anchor roles orient the navigation (e.g., definition, mention, example).
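
As a rough illustration of the structure just described (not EnLIGHTeN's actual data model), a topic can be sketched as a title plus anchors grouped by role. The class names, the document identifiers, and the use of a character offset to locate an anchor are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Anchor:
    """A place in a document where a topic occurs (fields are illustrative)."""
    document: str   # hypothetical document identifier, e.g. a file name
    position: int   # hypothetical character offset within that document


@dataclass
class Topic:
    """A topic is a link: a title plus anchors grouped by the role they play."""
    title: str
    anchors_by_role: dict = field(default_factory=lambda: defaultdict(list))

    def add_occurrence(self, role: str, anchor: Anchor) -> None:
        self.anchors_by_role[role].append(anchor)

    def occurrences(self, role: str) -> list:
        # Use .get so that asking about an unused role does not create it.
        return list(self.anchors_by_role.get(role, []))


# Hypothetical occurrences of one topic spread over several documents.
topic = Topic("ilink")
topic.add_occurrence("definition", Anchor("paper-a.html", 120))
topic.add_occurrence("mention", Anchor("paper-b.html", 48))
topic.add_occurrence("mention", Anchor("paper-c.html", 910))
```

Grouping by role is what lets a reader ask for the definition of a topic separately from its mentions.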
 
A Topic Map is functionally equivalent to multi-document indexes, glossaries, and thesauri. Topics are organized into types, each instance of a topic type has a title, and each occurrence of a given topic in a document is described, including the semantics of its anchor role.
 

A HyTime-based HTML hyperdocument

 
The current hyperdocument is encoded in HTML, so it can be browsed with one's favorite Web browser. Even the table of contents and the different indexes are sets of HTML links that let the reader navigate to any appropriate location in the hyperdocument.
 
The architecture on which the hyperdocument is based conforms to the HyTime-based Topic Navigation Map specification. This specification, as it stands today, is being proposed as a committee draft for an ISO standard (ISO/IEC 13250). The architecture has been designed by the CApH Committee, chaired by Steven R. Newcomb. We have played an active role in the design of the architecture. Extensions and fixes are being proposed and will be discussed before it is completed as an ISO standard.
 
The idea of using semantics to help users find their way through the maze of links originates from HyTime, and more precisely from the ilink architectural form, because ilinks have a type (element type) and force semantics to be defined for each anchor or anchor aggregate (anchrole attribute).
 

A Topic Map created by computer

 Since the beginning of 1995, High Text, SARL, has been developing an application called EnLIGHTeN, which provides an environment to help create, maintain and navigate topic maps.
 EnLIGHTeN is an application-specific way to deal with Topic Maps. It allows its users to index documents (by hand, automatically, or any combination of both) by declaring a list of topics. Each time a hit is found, a mark is inserted in the source documents. The current syntax we use is made of specific SGML comments (which really should be considered as processing instructions). The reason for this choice is to allow SGML users to insert topic marks in any document without having to alter their existing DTDs and applications.
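
To make the idea of automatic marking concrete, here is a minimal sketch that inserts an SGML comment before each hit of a declared topic. The comment syntax (`<!--TOPIC type="..." title="..." role="..."-->`) and the function name `mark_hits` are invented for this example; the paper does not give EnLIGHTeN's real mark syntax. The sketch only searches text between tags, so attribute values are left untouched.

```python
import re


def mark_hits(source: str, term: str, topic_type: str, role: str) -> str:
    """Insert a hypothetical topic mark (an SGML comment) before each hit."""
    mark = f'<!--TOPIC type="{topic_type}" title="{term}" role="{role}"-->'
    pattern = re.compile(rf"\b{re.escape(term)}\b")
    # Split on tags and comments so that only text between them is searched.
    pieces = re.split(r"(<[^>]*>)", source)
    for i, piece in enumerate(pieces):
        if not piece.startswith("<"):
            pieces[i] = pattern.sub(lambda m: mark + m.group(0), piece)
    return "".join(pieces)


source = '<p>HyTime defines the ilink form. See <a href="ilink.html">here</a>.</p>'
marked = mark_hits(source, "ilink", "concept", "mention")
```

Because the mark is a comment, a parser that ignores comments sees the same document as before, which is the property the paragraph above relies on.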
 EnLIGHTeN accepts three kinds of source documents, which can be mixed in the same hyperdocument:
 
  1. Plain vanilla ASCII
  2. HTML (any version up to 3.2). The current version of EnLIGHTeN is limited to SGML-like documents (no DOCTYPE is required, but tags must not overlap).
  3. SGML (any DTD)
 EnLIGHTeN allows users to declare any type of relationship between topics, using a specific syntax. These relationships can have any semantics, but in the current version they are limited to one-to-one relationships. One-to-many relationships are decomposed into a set of one-to-one relationships.
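
The decomposition of a one-to-many relationship can be sketched as follows. The tuple layout and the function name `decompose` are assumptions chosen for illustration, not EnLIGHTeN's declaration syntax.

```python
def decompose(relation: str, source: str, targets: list) -> list:
    """Turn one one-to-many relationship into a list of one-to-one triples."""
    return [(relation, source, target) for target in targets]


title = "A Topic Map of This Conference's Proceedings"
pairs = decompose("written by", title, ["Michel Biezunski", "Catherine Hamon"])
```

Each resulting triple can then be stored and traversed like any other one-to-one relationship.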
 
Topic types and anchor roles are freely defined by users as parts of each index mark. EnLIGHTeN is a batch process that creates the HyTime Topic Map and the various HTML output documents from what it collects in the source documents. This is possible because the HyTime DTD and instance for the Topic Map are built on the fly. In the current version, the Topic Map DTD and instance are destroyed and re-created each time the hyperdocument is processed. We plan to use them as "control centers" for the connections within various integrated database-document environments.
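
The batch step can be pictured as a pass over all source documents that collects every mark and rebuilds the map from scratch. The sketch below assumes a made-up mark syntax (`<!--TOPIC type="..." title="..." role="..."-->`) and an in-memory dictionary instead of a HyTime DTD and instance; both are simplifications for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical mark syntax; the real EnLIGHTeN syntax is not given in the paper.
MARK = re.compile(r'<!--TOPIC type="([^"]*)" title="([^"]*)" role="([^"]*)"-->')


def build_topic_map(documents: dict) -> dict:
    """Rebuild the whole map from the marks found in the source documents."""
    # Layout: topic type -> topic title -> anchor role -> [(document, offset)]
    topic_map = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for name in sorted(documents):
        for match in MARK.finditer(documents[name]):
            topic_type, title, role = match.groups()
            topic_map[topic_type][title][role].append((name, match.start()))
    return topic_map


docs = {
    "a.html": '<p><!--TOPIC type="concept" title="ilink" role="definition"-->An ilink is a link.</p>',
    "b.html": '<p>We use <!--TOPIC type="concept" title="ilink" role="mention"-->ilink here.</p>',
}
topic_map = build_topic_map(docs)
```

Since nothing is carried over between runs, changing a topic type or an anchor role in a mark is picked up the next time the documents are processed.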
 

Navigating an EnLIGHTeN-created hyperdocument

 
The current version of EnLIGHTeN outputs HTML, so that users can publish their documents directly on the Web and take advantage of existing Web browsers. Any HTML-based browser can be used. No CGI, Java, or specific Web browsers are involved. The HTML output is parsable using the HTML 3.2 DTD (dated May, 1996).
 If the source documents were in HTML, and if they included graphics, sounds, video, special effects, ..., all those features are preserved and will still work in the output Topic Mapped-HTML documents.
 

EnLIGHTeN-created screens

 The table of contents (Contents) is simply the list of the documents present in the hyperdocument. Clicking on any of them opens the document.
 The indexes screen (Indexes) contains a list of topic types (e.g., persons, concepts, organizations). Clicking on one of them opens the corresponding index, which is an alphabetical list of topics.
 The topic screens are built by EnLIGHTeN by grouping all relevant links or anchors pointing from a given topic. These screens are divided into two parts
 
The documents are displayed with anchors that let users know which topic instances are anchored at a given location. Clicking on a topic instance displays the corresponding topic screen.
 When there is more than one occurrence of a given topic with a given anchor role, a pair of arrows, [<-] and [->], is displayed around the topic instance. They allow users to traverse directly to the previous or next occurrence of the same topic, whether it is located in the same document or in another one.
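
The arrow behaviour can be sketched as a lookup in the ordered list of occurrences of one topic under one anchor role. The `(document, offset)` pairs and the function name are illustrative, and stopping at the first and last occurrence (rather than wrapping around) is an assumption; the text does not say which EnLIGHTeN does.

```python
def neighbours(occurrences: list, current: int) -> tuple:
    """Return the (previous, next) occurrences around index `current`.

    Either element is None when there is nothing to traverse to.
    """
    if len(occurrences) < 2:
        return None, None  # a single occurrence gets no arrows
    previous = occurrences[current - 1] if current > 0 else None
    following = occurrences[current + 1] if current < len(occurrences) - 1 else None
    return previous, following


# Hypothetical occurrences of one topic with the role "mention".
mentions = [("paper-a.html", 120), ("paper-a.html", 640), ("paper-b.html", 48)]
before, after = neighbours(mentions, 1)
```

The list spans documents, so following [->] from the last mention in one paper leads to the first mention in the next.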
 The "info" option displays a report on the current state of the topic map. It gives information about:
 Trick: while you are reading a document, whenever you want to see either the table of contents or an index, click on any topic, and the menu [Contents|Indexes|Info] is always displayed at the bottom of the screen for each topic.
 

How was this hyperdocument created?

 We asked each of the contributors to the HyTime Conference to give us a copy of their paper in electronic form. Most of the papers were in ASCII, some were in HTML, and one was in Microsoft Word and was converted into HTML. The graphics were converted to GIF or JPEG format to ensure compatibility with the popular Web browsers.
 Each author was able to insert his or her own topics as marks (SGML comments) into their documents, or alternatively to provide us with a list of topics that were considered relevant to the domain. We added our own list.
 We used EnLIGHTeN's ASCII-to-HTML feature to transform the ASCII documents into HTML. Some formatting was rebuilt "by hand".
 We used EnLIGHTeN's automatic indexing feature to insert into all documents the topics that had been declared in some of them.
 The first version of this topic map was presented in Seattle, at the end of the conference. We received about half of the documents a few days before the conference started, and the rest on the first day of the conference, i.e. just one day before the presentation.
 It takes us about 5 minutes to build the current HTML set of Topic Mapped documents on a 486-75 computer (Linux), and approximately the same time on a Sun SPARC 20 Unix workstation. In Seattle, it took about 45 minutes. The code of EnLIGHTeN has been substantially rewritten since.
 In order to minimize the intellectual work required for indexing and choosing meaningful anchor roles, we decided that the standard basic anchrole would be the first name of the author, in the scope of his or her own document. Thus, for example, the topics present in David's paper appeared on the screen as a list under the name of David. This allowed us to see that several topics were mentioned by different authors. But this design was far from ideal, because it confuses the document source with the anchor role.
 The current version of this topic map has been redone, without any change to the documents themselves. We worked in several steps.
 

Notes about our design choices

 We had to make a lot of arbitrary decisions here, and we take full responsibility for the good (and bad) choices that have been made.
 Note 1: We do not consider the documents presented here to be in a completely finalized form. We have discovered by doing this work that the Topic Map architecture is so open that several very different choices are possible. We are almost sure that every one of the authors (or of the readers) will have a different point of view on how to describe the information. Also, we have probably forgotten very important topics and misused others. The nice thing is that we are able to make any change to the whole structure very quickly: EnLIGHTeN will rebuild the hyperdocument after any change that someone wants to introduce. If you want to have something changed in this hyperdocument sample, by all means, don't hesitate to tell us. We plan to update this document every week (topic map part only) if necessary, to improve it.
 It is also possible to declare multiple topic maps within the same set of hyperdocuments.
 A topic mark can be considered a giant bookmark. Therefore a set of documents can also be marked by different users, each of whom defines his or her own topic map from the same set. A future version of EnLIGHTeN will include this feature, making it an SGML-based equivalent to annotation managers.
 Note: In the currently designed HyTime Topic Map, each instance of an author element must have a fixed number of anchor roles, and therefore of anchor addresses. EnLIGHTeN has been designed in such a way that if an anchor role is not present, it will not be displayed, even if it is formally there. This remark has general validity: EnLIGHTeN has been designed to be useful for an end user, and it doesn't simply reflect the HyTime structure. None of these choices is defined in either the HyTime standard or the Topic Map specification. They have been designed by High Text, and High Text bears full responsibility for this interface. Another tool using Topic Maps could look and feel very different. This is why we feel it's interesting to have standards: future users of topic maps will (we hope) have the choice between different kinds of applications.
 

High Text's plans regarding EnLIGHTeN

 
The Seattle Conference was a "real world" live test for us. We want to thank the organizers of the conference, namely Steven R. Newcomb and the GCA, for allowing us to present this application, and the contributors to the conference, who all agreed to provide us with an electronic version of their paper. (The papers that are not included in this Topic Map were not written in a deliverable form.)
 We are now proposing a Topic Map Express Service, which is the equivalent of a printing service. Instead of giving the customer business cards, we deliver HTML topic maps. This enables us to keep our software at home, because we are not equipped today to offer the necessary facilities, including customer support, multiple versions, upgrades, documentation in different languages, etc.
 This service is based on an initial consulting service (meetings, training, definition of an architecture specific to a given environment, ...), followed by a series of electronic document exchanges: once you know how to encode your documents according to EnLIGHTeN's specification, you send us your documents, we process them on our machines, and we send them back to you. If the documents can be sent by email, this can be a very quick service. Each time you need an update, we redo the processing and send the result back. If the documents should not be sent by email, we use fast-delivery international mail services.
 High Text is now working to set up the conditions that will make it possible to finalize EnLIGHTeN as an end-user product.
 As a product, EnLIGHTeN will provide a bridge between databases and Internet/Intranet interfaces. EnLIGHTeN will be fed by documents stored in the database to automatically produce HTML pages according to a Topic Map.
 Specific developments derived from the core technology are also possible, such as integration with different databases or alternative input and output formats.
 Need more information? Send us email: enlighten@hightext.com
