Topic Map cartography | Table of contents | Indexes | Registries & repositories |
The "GPS of the information universe" |
| Topic Maps in an encyclopedic online information platform |
| Wittenbrink, Heinz |
| Heinz Wittenbrink |
| Product Development |
www.wissen.de, Leuchtenbergring 20, D-81677 Munich, Germany. Phone: +49 89 748515-31. Fax: +49 89 748515 89. Email: heinz.wittenbrink@bertelsmann.de. Web site: www.wissen.de |
| Biography |
| Abstract |
A topic map architecture for an encyclopedic website |
| This presentation shows how topic maps are used for the information architecture of an online encyclopedia. The German encyclopedic website www.wissen.de serves as an example of a large-scale implementation of the standard. In this paper I describe a model for the use of topic maps in an encyclopedic application; this model is not yet completely implemented in our existing application and will certainly undergo some revisions during the remaining implementation work. I will not discuss desiderata for the standard but simply show that it works and that its basic concepts are highly useful for organizing a very complex information network. Topic maps in www.wissen.de allow or will allow: |
Knowledge versus information - topic maps and the functionality of online-encyclopedias |
| encyclopedias |
From the beginning, the topic map standard seems to have been developed with the special requirements of encyclopedia publishing in mind. Encyclopedias are commonly cited as one of the major application fields of topic maps. Encyclopedic publishing requires organizing an extremely large number of information resources comprehensively (articles, statistics, tables and illustrations), keeping track of tens of thousands of cross references, and updating multivolume works on a regular basis. Many encyclopedia publishers work on several different A-Z encyclopedias in parallel, e.g. a one-volume, a ten-volume and a twenty-volume version. Experience shows that it is very difficult to synchronize the updating process and that a lot of work is usually done twice or more as long as there is no "abstraction layer" of the kind provided by topic maps.
|
| In general, the structural information conveyed by topic maps includes: groupings of addressable information objects around topics (occurrences), and relationships between topics (associations). |
The implementation of topic maps in wissen.de |
| topic-ID | The starting point of the repository of topics in www.wissen.de was an A-Z encyclopedic dictionary with about 120,000 entries. Roughly speaking, each of these entries corresponds to a topic. This topic serves as an anchor around which information can be organized: "vertically" by adding information resources about the topic, and "horizontally" by connecting it to related topics via "associations". The topic map standard requires a unique identifier attribute for every topic. We took these IDs from the keys of our legacy editorial system. We will probably have to replace these keys in the future with strings based on the international basename of each topic. |
| basename displayname sortname | To allow the internationalization of www.wissen.de, the basenames of the topics will be replaced by international, that is English, names. The sortnames and displaynames are so far available only in German. The scope attribute of the topic names contains a reference to the language. Other language versions can therefore be integrated as additional layers by adding displaynames and sortnames of the topics and occurrences in the respective language. In the future, foreign partners can also extend the www.wissen.de topic database with their national content. An Italian encyclopedic site may add many Italian subjects, which will then also be accessible to the German user. |
| topic-type |
To assign topic types to the subjects covered by the A-Z encyclopedia, we used a list of about 40 different entry types. Examples of topic types are "person", "ruler", "state" and "animal". So far we have not defined "superclass/subclass" relations between these types.
|
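The topic properties described above (ID, basename, scoped display and sort names, topic type) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the language codes used as scopes and the sample ID are hypothetical and are not taken from the www.wissen.de implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TopicName:
    """One name of a topic, valid within a language scope."""
    scope: str        # language reference, e.g. "de" (assumed convention)
    displayname: str  # form shown to the user
    sortname: str     # form used for alphabetical ordering


@dataclass
class Topic:
    topic_id: str     # unique identifier, e.g. a key from a legacy system
    basename: str     # international (English) name
    topic_type: str   # one of the entry types, e.g. "person" or "state"
    names: list[TopicName] = field(default_factory=list)

    def display_name(self, language: str) -> str:
        """Return the display name for a language, else the basename."""
        for name in self.names:
            if name.scope == language:
                return name.displayname
        return self.basename


# Hypothetical sample topic; the ID is invented for illustration.
italy = Topic("t-0001", "Italy", "state", [TopicName("de", "Italien", "italien")])
# A partner adds an Italian layer without touching the German one.
italy.names.append(TopicName("it", "Italia", "italia"))

print(italy.display_name("de"))
print(italy.display_name("it"))
print(italy.display_name("fr"))  # no French layer yet: falls back to basename
```

Keeping the language in the scope of each name means that a new language version only appends names and leaves the topic and its ID unchanged.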
topic-associations | The real power of topic maps resides in the associations between topics: typed links that show which relations exist between the subjects of an information resource. In a first phase we had to rely on information already existing in our databases in order to establish a network of associations between the topics of www.wissen.de. Such information is contained in the category index, in a geographical and a chronological index and in the existing cross references. The relation between a topic and a category is an association of the type "belongs_to_category". The discipline topics are connected by "is_part_of" associations; "particle physics", for instance, is a part of "physics". From the existing data it was also possible to take the cross references between articles and to use them for an "is_referenced_by" association. This association will have to be specified more precisely in the future because it is not really typed. The geographical and chronological indexes allow some additional basic associations such as "is_situated_in", "belongs_to_epoch" and "happened_at_date". We are now adding further associations editorially. In the end, most of the information contained in the A-Z encyclopedia will be transformed into topic associations. We use regular expressions to extract the information, e.g. to transform typical recurrent verbs in the articles, such as "wrote" or "consists of", into typed associations. Combined with a relational database containing all the "hard facts" (statistical information), this topic map database will be an electronic equivalent of a traditional encyclopedia. It will allow the content to be presented with graphics and tables that are easier to understand than traditional textual descriptions of usually rather boring facts. |
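The regular-expression step mentioned above might look roughly like the following sketch. The patterns, the `Association` tuple and the sample sentences are assumptions made for illustration; the expressions actually used for www.wissen.de are not given in this paper, and real articles would need far more careful patterns and editorial review.

```python
import re
from typing import NamedTuple


class Association(NamedTuple):
    source: str      # topic the article is about
    assoc_type: str  # typed association, e.g. "wrote"
    target: str      # raw text naming the associated topic


# Hypothetical patterns: each recurrent verb maps to one association type.
PATTERNS = {
    "wrote": re.compile(r"\bwrote\s+(?P<target>[^.;]+)"),
    "consists_of": re.compile(r"\bconsists of\s+(?P<target>[^.;]+)"),
}


def extract_associations(topic: str, article: str) -> list[Association]:
    """Turn recurrent verbs in an article into typed associations."""
    found = []
    for assoc_type, pattern in PATTERNS.items():
        for match in pattern.finditer(article):
            target = match.group("target").strip()
            found.append(Association(topic, assoc_type, target))
    return found


# Invented sample sentences in the style of short encyclopedia entries.
print(extract_associations("Dante Alighieri", "Italian poet; wrote the Divine Comedy."))
print(extract_associations("water", "Water consists of hydrogen and oxygen."))
```

The extracted target is still a raw string; resolving it to a topic ID would be a separate step.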
Search of a single topic |
| searchname | Topics are used to give precise answers to the user's questions. The general principle of search in www.wissen.de will be "one query - one answer". The default search runs on the search names of the topics of the encyclopedia. The full-text index is used only if this search returns no result. Normally the user gets one and only one structured result set, provided by a topic link to the different occurrences of the topic. |
| If two or more topics with the same search name are found, the user has to choose the subject he is interested in from a list. The list also shows the topic type. In the case of Mars the user would have to decide whether he wants information about the planet or the Roman god. |
| A full-text search is triggered only if no topic with the search string "Mars" is found (or if the user decides to start it himself). Even then, the results of the full-text search are structured as topic occurrences. |
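The search behaviour described in this section can be summarized in a short sketch. The in-memory lists, the topic IDs and the function name are hypothetical stand-ins for the topic database and the full-text index; only the three-way decision (one topic, several topics, full-text fallback) comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    topic_id: str
    searchname: str
    topic_type: str


# Hypothetical data standing in for the topic database.
TOPICS = [
    Topic("mars-planet", "mars", "planet"),
    Topic("mars-god", "mars", "god"),
    Topic("phobos-moon", "phobos", "moon"),
]
# Hypothetical full-text index: word -> IDs of topics whose occurrences match.
FULLTEXT_INDEX = {"canals": ["mars-planet"]}


def search(query: str) -> tuple[str, list[str]]:
    """Apply the 'one query - one answer' principle to a search string."""
    key = query.strip().lower()
    hits = [t for t in TOPICS if t.searchname == key]
    if len(hits) == 1:
        # Exactly one topic: answer with its structured result set.
        return "topic", [hits[0].topic_id]
    if len(hits) > 1:
        # Same search name several times: offer a list showing the topic type.
        return "choose", [f"{t.searchname} ({t.topic_type})" for t in hits]
    # No topic found: fall back to full text, still grouped by topic.
    return "fulltext", FULLTEXT_INDEX.get(key, [])


print(search("Mars"))
print(search("Phobos"))
print(search("canals"))
```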
Occurrence roles and Xlink |
| topic-occurrences |
All information in www.wissen.de is handled as topic occurrences. This means that |
| Topic maps constitute the model for all kinds of links used in www.wissen.de. All HTML links visible to the user are generated on the basis of topic and association links stored as Xlinks in the database. Topic maps allow the administration of the links; new content is integrated into the site by links to existing or new topics. When new topics are introduced, they are connected to the existing topics by topic associations. So it is basically via topic maps that www.wissen.de constitutes an information network. |
Wissen.de integrates different types of information. The goal is to give access to different levels of information about every topic. The user can decide on the depth of information he is interested in. Examples of occurrence types and roles are:
|
About the planet Mars, for example, www.wissen.de will contain: a short article on the planet with basic information; an explanation of the word "mars" and its etymology; several chronicle entries about the exploration of Mars; a long article about the solar system in which the planet Mars is treated on several pages; another longer article about space probes; pages from the ESA, the German Museum in Munich and a Max Planck Institute; pointers to TV coverage of the solar system; dates for possible observations of Mars; an expert chat about the possibility of life on Mars; and links to the best Internet sites with information about Mars. These are different occurrences of the topic "Mars". The display of this information is controlled by the layout the user has chosen, by the topic type and by the role of the occurrence.
|
| occurrence-role occurrence-type | The occurrence roles determine how an occurrence or a link to an occurrence is displayed on the screen. The HTML templates contain a reserved default space with a headline for every occurrence role: to the right of the main content display there is space for longer encyclopedic features, news, Internet links etc. These fields are filled with occurrences of the topic if such occurrences exist. If they do not exist, the fields are filled with occurrences belonging to the next higher node in the category tree. If the user types in "Mars", the application looks for occurrences of the topic Mars with the role "news". If it does not find any news about Mars, it displays all existing news about the solar system, because "solar system" is the category Mars belongs to. If there is no news about the solar system, news about astronomy is displayed; if there is no astronomical news, news about the natural sciences; and so on. |
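The fallback along the category tree can be sketched as a simple upward walk. The dictionaries and the sample headline below are invented for illustration; only the chain Mars, solar system, astronomy, natural sciences and the role "news" are taken from the example above.

```python
from typing import Optional

# Hypothetical category tree: each node points to its next higher node.
PARENT = {
    "mars": "solar system",
    "solar system": "astronomy",
    "astronomy": "natural sciences",
}

# Hypothetical occurrences, keyed by (topic or category, occurrence role).
OCCURRENCES = {
    ("astronomy", "news"): ["New telescope inaugurated"],
    ("mars", "article"): ["Mars (short article)"],
}


def occurrences_with_fallback(topic: str, role: str) -> tuple[Optional[str], list[str]]:
    """Walk up the category tree until a node has occurrences with this role."""
    node: Optional[str] = topic
    while node is not None:
        found = OCCURRENCES.get((node, role), [])
        if found:
            return node, found
        node = PARENT.get(node)  # None once the root has been passed
    return None, []


# No news about Mars or the solar system exists, so astronomy news is shown.
print(occurrences_with_fallback("mars", "news"))
```

Returning the node that supplied the occurrences lets the template label the field correctly, for instance as astronomy news rather than Mars news.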
Xlink | What appears to the user as a traditional HTML link from one wissen.de resource to another internal resource is, "behind the scenes" in the database, an extended Xlink connecting two occurrences of one and the same topic. Inline links are links either to topics or to topic occurrences. A link from the article about Mars to the astronomer Schiaparelli, who discovered the Martian canals, is generated by treating the name Schiaparelli in the article as an occurrence of the astronomer. The role of this occurrence is "mention". The default behaviour of links of this type is to open a small window with the short encyclopedia article about the mentioned topic. Links from a section in one document to a section in another document will be realized by treating both sections as occurrences of the same topic. The Martian canals, for instance, are a topic of their own. When this topic is also treated in a passage about earthbound astronomical observations in another text, a link between both passages can be generated. |
| Thus the topic map architecture allows a semantic treatment of all the links contained in www.wissen.de. The links are bound to topics and have a precisely defined function: they either present the relation of one topic to another or specify the function of an information resource for a topic. The latter could also be done via Xlink attributes, but without the topic structure links could only connect information resources without specifying the semantic basis of the link. The topic map structure will comprise all kinds of content on www.wissen.de (at the moment a lot of content, especially from partners, is integrated before the appropriate topic map structuring is done). Each piece of content will have a defined occurrence role. |
| The default display of content is the A-Z encyclopedia article with links to related occurrences. But in many cases users will not be primarily interested in an encyclopedia article. School students will look for learning material; people interested in cooking will look for recipes; other users will want information directly from members of the www.wissen.de community, e.g. about certain kinds of products. When travel guides are added to www.wissen.de, many users will want to get the travel information directly without first having to pass through encyclopedia articles. The user can therefore directly choose a content region, at the moment "school", "how-to" or "opinions". If he does so, content of the appropriate occurrence type is displayed in the center, in a template with a special layout. The search is restricted to topics of specified categories and/or occurrence roles. Nevertheless, the user has access to all other occurrences of the topic via links. |
Associations and specified search |
topic-associations | The advanced or specified search makes use of topic types and associations. Theoretically, every possible combination of types and associations can be used as search input. It is possible to search for all "persons born in France in the 18th century" or for all "mammals living in India". At the moment this type of search is realized via selection from lists. In a list of topic types the user selects the type, in a list of content categories he can select the discipline, etc. He can also select associations, e.g. "born in", and type in associated topics. The result is a list of all topics possessing the selected properties. One of the main targets of our software development is to allow this type of search via natural language input, because input fields and lists are not flexible enough for complex queries. The input string should simply be something like "Which philosophers were born in France in the 18th century and travelled to Britain?"
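A specified search of this kind amounts to filtering topics by type and by required associations. The sketch below assumes a simple in-memory representation; the association names "born_in" and "born_in_century" and the function name are hypothetical, chosen only to mirror the query "persons born in France in the 18th century".

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    topic_type: str
    # association type -> names of the associated topics
    associations: dict[str, set[str]] = field(default_factory=dict)


def specified_search(topics: list[Topic], topic_type: str,
                     required: list[tuple[str, str]]) -> list[str]:
    """Return the topics of one type that hold every required association."""
    return [
        t.name
        for t in topics
        if t.topic_type == topic_type
        and all(target in t.associations.get(assoc, set())
                for assoc, target in required)
    ]


# Small sample data set for illustration.
topics = [
    Topic("Denis Diderot", "person",
          {"born_in": {"France"}, "born_in_century": {"18th century"}}),
    Topic("Voltaire", "person",
          {"born_in": {"France"}, "born_in_century": {"17th century"}}),
    Topic("Immanuel Kant", "person",
          {"born_in": {"Prussia"}, "born_in_century": {"18th century"}}),
]

result = specified_search(
    topics, "person",
    [("born_in", "France"), ("born_in_century", "18th century")],
)
print(result)
```

A natural language front end would only have to produce the same pair of inputs, a topic type and a list of required associations.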
Semantic networks and visual navigation |
| association-types | Typed associations are the most visible and spectacular feature of topic maps. Topic links with multiple typed occurrences, an ID, search names and display names, a basename and types are extremely useful for building a complex index, but for the user they act in the background and do little more than facilitate his search. Associations make it possible to construct information spaces, to show which subjects are connected with each other and what the connection consists of. Topic-map-supported information applications therefore allow new ways of navigating information universes. |
| Traditional encyclopedias in electronic formats allowed only a keyword or full-text search. The full-text search can be refined by Boolean operators. The topic search as realized in www.wissen.de makes it possible to find information that is not retrievable by keywords. A search for "solar system" in a traditional encyclopedia yields the article about the solar system and all the scattered occurrences of the string "solar system" in the texts of the encyclopedia. It will not find, for instance, an article about comets if this article does not contain the words "solar system". The topic search finds the different information objects about the solar system as such as well as the different "components" of the solar system, e.g. planets, asteroids and comets. |
| Associations from one topic to other topics are represented as a list of related subjects, classified by association type, or graphically by a Java applet called "visual index" that shows the associations as labeled arrows and the topics as nodes. In our example these links would point from "mars" to the other planets (topics of the same topic type), to "planet(s)" (the topic that is the topic type of "mars"), to the moons of Mars, Phobos and Deimos, etc. |
The applet gives the user the freedom to decide which associations he wants to see. If he looks for a town, he can for instance look only for the people who were born in this town. In the topic map standard associations are themselves topics that have types. In combination with the types of the associated topics this makes it possible to switch whole groups of associations on and off.
|
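Switching groups of associations on and off reduces to filtering by association type. The following sketch is an assumption about how such a filter could be written, not a description of the applet; the association type "has_moon" is invented for the Mars example.

```python
from typing import NamedTuple


class Association(NamedTuple):
    source: str
    assoc_type: str
    target: str


# Hypothetical associations of the topic "Mars".
ASSOCIATIONS = [
    Association("Mars", "has_moon", "Phobos"),
    Association("Mars", "has_moon", "Deimos"),
    Association("Mars", "belongs_to_category", "solar system"),
]


def visible_associations(topic: str, enabled_types: set[str]) -> list[Association]:
    """Keep only the associations whose type the user has switched on."""
    return [
        a for a in ASSOCIATIONS
        if a.source == topic and a.assoc_type in enabled_types
    ]


# The user switches on only the moons of Mars.
moons = [a.target for a in visible_associations("Mars", {"has_moon"})]
print(moons)
```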
Complex topics, scope and the organization of information |
| Examples of topic maps usually use topics that have proper names and very simple associations, e.g. "born in". A very large part of a normal A-Z encyclopedia with short articles can be represented by topics and associations of this kind. In www.wissen.de they constitute the basic network of the overall topic map structure. But as soon as topic maps are used as metadata for longer encyclopedic and scientific texts, other types of topics and associations have to be introduced. The existence of life on Mars, for instance, is an important topic whose type may be "scientific problem". |
scope | The attribution of a scope to the associations of such complex types can be used for a structured display of information resources. If the topic is "Classical Greece", scopes such as "basic", "school knowledge" and "scientific archeology" can show which topics and which topic occurrences are related to the subject in which context. |
Personalization and collaborative features |
personalization | Personalization of information is one of the important aims of www.wissen.de. The user should get exactly the information he needs within a specified time. It should not normally be the user's task to search for the interesting pieces of information in an enormous amount of useless matter. Registered users of www.wissen.de can decide for themselves about important aspects of the information that is displayed. They can choose which categories of information they want to see on the start page, about which subjects they want to be informed in newsletters, etc. But even users who are not registered can receive a large amount of personalized information if the application interprets a user's actions as information about his interests and preferences. This means that the display of information is controlled by rules that take the user's interaction with the application into account. A user who selects a feature about the Sun on the start page is most likely also interested in other astronomical items. So he will get the feature about the Sun in the central panel, surrounded by links to the other occurrences of the topic "Sun", but also by links to other astronomical news, features etc. This content must be generated dynamically. In www.wissen.de the personalization of content is not yet completed. In the near future the application will look for topics that are as similar as possible to the topic selected by the user, and it will at the same time look for occurrences of the same category if there are no directly related occurrences. To a user who looks for Florence the application will also offer links to Venice and Milan. The quality of this personalized access to information depends on the density of the semantic network of the application. The associations, types and occurrence roles in the topic map model make it possible to determine very precisely where information objects are similar and where they are not. They therefore provide an ideal basis for the personalized display of information. |
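One conceivable way to rank "similar" topics is to compare the sets of types and associations they share. The sketch below uses Jaccard overlap, which is an assumption made purely for this illustration; the text does not name a similarity measure, and the feature sets are invented sample data.

```python
# Hypothetical feature sets: the topic type plus typed associations.
TOPICS = {
    "Florence": {("type", "town"), ("is_situated_in", "Italy")},
    "Venice": {("type", "town"), ("is_situated_in", "Italy")},
    "Milan": {("type", "town"), ("is_situated_in", "Italy")},
    "Mars": {("type", "planet"), ("belongs_to_category", "solar system")},
}


def similarity(a: set, b: set) -> float:
    """Jaccard overlap: shared features divided by all features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(name: str, limit: int = 2) -> list[str]:
    """Rank the other topics by similarity, ties broken alphabetically."""
    others = [n for n in TOPICS if n != name]
    others.sort(key=lambda n: (-similarity(TOPICS[name], TOPICS[n]), n))
    return others[:limit]


print(most_similar("Florence"))
```

The denser the network of associations, the more features each topic carries and the finer such a ranking can discriminate, which is the dependency on the semantic network noted above.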
| Many basic topics and topic associations will be common to many different applications. It is an open question whether it makes sense to maintain topic maps as the property of a company or whether they should be developed in an open source model comparable to Mozilla and the Open Directory Project. |
| Acknowledgements |
| Many thanks to Hans Holger Rath (STEP GmbH, Rimpar) and Holger Hvelplund (TEXTware AS, Copenhagen) for the introduction to the subject and for patiently answering numerous newbie questions. |