Infoloom
Semantic Integration Technologies
Michel Biezunski
Brooklyn, New York
mb@infoloom.com

Understanding Topic Maps

The Problem

The amount of information available online has doubled every six months for the last three years, with equally rapid growth predicted for years to come. Today there are 42 million active Internet users, and the potential is an estimated 270 million. The number of pages is rising just as quickly: a study conducted at the NEC Research Institute estimates that the Web has over 800 million searchable pages. Search engines are not keeping up with this explosion of Web pages. According to the NEC Research Institute, it takes more than six months for a new page to show up on a search engine. Even one of the best engines, Northern Light, searches only one-sixth of the Web's pages.

The lack of organization on the Web means that the average Web user can issue a multitude of searches without retrieving a meaningful result. In addition, the distractions of the open Web make profitable business use of the Web even more difficult. Information providers need a common way to further qualify their information to enable the next generation of search engines to deliver much better results.

Corporations suffer knowledge management problems as well. Here the issue involves the corporate practice of establishing large "data warehouses" to store all corporate information online. Information access problems are compounded because corporate data is usually held in a wide variety of formats. Some of our most critical data is stored in legacy systems in proprietary data formats. Even if we can access the information we want, synthesizing the data into a meaningful knowledge base is nearly impossible.

Finally, there is a growing need for knowledge management on a personal level. Anyone who has to deal with email, electronic schedulers, and other productivity aids is finding that these may actually contribute to the information overload problem, not solve it.

A new map-based technology, known as "Topic Maps," has emerged to address the exploding knowledge management issues that face both corporations and individuals. But what is a Topic Map? And how does it work?

The Theory of Topic Maps

Whether on the Web, on our Intranets, or in our personal lives, our knowledge management problems begin with a diverse information base. We have email and electronic address books. We have spreadsheets, word processing text, and presentations. We have technical manuals and order entry databases. Information within our electronic data stores is varied and diverse. How can we possibly connect the information so that we can use it in a meaningful way?

One approach to connecting the diverse legacy information sources is to transform the data into XML and use the XML hierarchy and internal link tagging to provide information access. This approach has serious limitations. First, it is extraordinarily expensive to convert legacy data into XML. Second, once this has been accomplished, we are limited to static, hierarchical navigation within the document and simple one-way links between documents. This hard-coded navigation mechanism becomes unmanageable and cannot scale as the information base grows. As a result, we suffer from many poorly designed links, indexes, and navigation devices that prevent users from finding the data they need.

In addition, the current linking and navigation mechanisms do not allow us to plan our navigation strategy. Rather, we must follow links blindly, hoping they will take us to the information we so desperately seek. Finally, navigation based upon the hierarchy, content, and links embedded in the data cannot easily be revised, updated, or altered when another navigational approach is desired.

Topic Maps provides us with a new paradigm for knowledge navigation and synthesis. Topic Maps is an emerging ISO standard that provides for the specification of a standard, interchangeable hypertext navigation layer above diverse electronic information sources. Topic maps enable us to create virtual knowledge maps for the Web, our Intranets, or even print materials.

We have long understood the idea of creating style sheets to control the formatting and layout of information. Topic Maps introduces the concept of creating style sheets to control knowledge-based information access and navigation.

Applying Topic Maps

The hypertext navigation layer that can be specified by a topic map can model complex knowledge management relationships. Topic maps can provide customizable routes to information that help the user navigate digital resources efficiently. Since topic maps are not contained in the information (like XML or HTML tags and link elements), information navigation routes can be changed dynamically as we apply a different topic map to the underlying information resources.
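To make the idea of a navigation layer that sits outside the information more concrete, here is a minimal sketch in Python. It is not the interchange syntax defined by the standard, and every class name, method name, and file path in it is a hypothetical chosen for illustration. The sketch only assumes that a topic map holds named topics, the addresses of the resources where each topic occurs, and typed associations between topics, while the resources themselves are never modified.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Occurrences are addresses of resources; the resources are never touched.
    occurrences: list[str] = field(default_factory=list)


@dataclass
class Association:
    kind: str                  # illustrative relationship type, e.g. "treated-by"
    members: tuple[str, str]   # names of the two related topics


@dataclass
class TopicMap:
    topics: dict[str, Topic] = field(default_factory=dict)
    associations: list[Association] = field(default_factory=list)

    def add_topic(self, name, *occurrences):
        self.topics[name] = Topic(name, list(occurrences))

    def relate(self, kind, a, b):
        self.associations.append(Association(kind, (a, b)))

    def occurrences_of(self, name):
        """Return the resource addresses attached to a topic."""
        return list(self.topics[name].occurrences)

    def related_to(self, name):
        """Return (association kind, other topic) pairs for a topic."""
        result = []
        for assoc in self.associations:
            if name in assoc.members:
                first, second = assoc.members
                other = first if second == name else second
                result.append((assoc.kind, other))
        return result


# Hypothetical topics and file paths, invented for this example.
tm = TopicMap()
tm.add_topic("hypertension", "reports/r-102.pdf", "journals/j-7.html")
tm.add_topic("beta blockers", "papers/p-33.doc")
tm.relate("treated-by", "hypertension", "beta blockers")

print(tm.occurrences_of("hypertension"))
print(tm.related_to("hypertension"))
```

Because the map refers to resources only by address, replacing `tm` with a different map changes the available routes without converting or re-tagging any of the underlying files.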

Suppose, for example, that I have a body of medical literature. It may be in a variety of formats and include a variety of documents such as reports, case studies, papers, and journal articles. General searches of this large medical base may be time-consuming and are likely to be disappointing. In addition, new topics may have emerged as key search elements since the data was first encoded.

Topic maps of the information can transform the information into a meaningful knowledge base. Further, topic maps can be used to deploy the same underlying information set in different environments for users with different requirements. First, look at a topic map designed by a cardiologist. Notice that the navigation patterns are designed specifically for that medical speciality.
Now let's look at a topic map designed by a pulmonary specialist. While there may be some topics in common, this view of the medical data is far different. The navigation is dynamically created by applying a new topic map to the information set. This is why we call topic maps "style sheets for knowledge".
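The two-specialist scenario can be sketched in a few lines of Python. The resource identifiers, titles, and topic names below are made up for illustration and are not taken from any real medical collection; the function names are likewise hypothetical. The sketch assumes that each topic map is simply a mapping from topic names to resource identifiers, and that "applying" a map means building a navigation index over the unchanged resources.

```python
# The shared information set: resource ids mapped to titles (invented data).
RESOURCES = {
    "case-014": "Case study: chest pain in a 54-year-old patient",
    "paper-221": "Oxygen saturation during exercise",
    "report-07": "Annual report on arrhythmia admissions",
}

# Two topic maps over the same resources, one per speciality.
CARDIOLOGY_MAP = {
    "angina": ["case-014"],
    "arrhythmia": ["report-07"],
    "exercise tolerance": ["paper-221", "case-014"],
}
PULMONARY_MAP = {
    "dyspnea": ["case-014", "paper-221"],
    "exercise tolerance": ["paper-221"],
}


def navigation_index(topic_map, resources):
    """Apply a topic map to the resources: topic name -> list of titles."""
    return {
        topic: [resources[rid] for rid in ids]
        for topic, ids in sorted(topic_map.items())
    }


def shared_topics(map_a, map_b):
    """Topic names that appear in both maps."""
    return sorted(set(map_a) & set(map_b))


cardio_view = navigation_index(CARDIOLOGY_MAP, RESOURCES)
pulmo_view = navigation_index(PULMONARY_MAP, RESOURCES)

for topic, titles in cardio_view.items():
    print(topic, "->", titles)
print("Topics in common:", shared_topics(CARDIOLOGY_MAP, PULMONARY_MAP))
```

Both views are computed from the same `RESOURCES` dictionary; only the map changes. Even the one topic the two maps share leads to different documents in each view, which is the sense in which a topic map behaves like a style sheet for navigation.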

Summary

Topic Maps provide knowledge system designers with an exciting new way to organize information into true knowledge bases. Because topic maps are defined using an internationalized and extensible interchange syntax, they allow knowledge to be visualized, captured, and interchanged. The Web, Portals, and Intranets are an ideal environment in which users can apply this expressive, effective representation of knowledge.


© 2005, Michel Biezunski