Lois M.L. Delcambre, Oregon Graduate Institute, Beaverton, Oregon 97006. Email: lmd@cse.ogi.edu
David Maier, Oregon Graduate Institute, Beaverton, Oregon 97006. Email: maier@cse.ogi.edu
Lougie Anderson, Sequent Computer Systems, Portland, Oregon 97291-1000. Email: lougie@sequent.com
Radhika Reddy, Oregon Graduate Institute, Beaverton, Oregon 97006. Email: reddy@cse.ogi.edu
Structured Maps: Modeling Explicit Semantics over a Universe of Information
Abstract
The overwhelming accessibility to data, on a global scale, does not necessarily translate to widespread utility of data. We often find that we are drowning in data, with few tools to help manage relevant data for our various activities. This paper presents Structured Maps, an additional modeling construct at a level above available information sources, to provide structured and managed access to data. Structured Maps are based on Topic Navigation Maps, defined by the SGML community to provide multi-document indices and glossaries.
A Structured Map is a modeling construct that provides a layer of typed entities and relationships where the entities can have typed references to information elements in the Information Universe. The type structure introduces semantics so that we know what sort of entities are being tracked and why various references have been made. Structured Maps can be placed over loosely structured data, e.g., document collections, with references at various levels of granularity. Structured Maps directly support new, customized, and even personalized use of the information. In this paper, we define Structured Maps and present several examples adapted from the Sequent Corporate Electronic Library (SCEL), an intranet resource currently implemented in HTML.
1. Introduction
The amount of information represented electronically has been expanding since the beginning of the computer age. But our ability to access information has recently expanded to a global scale, through the Internet, the World Wide Web (WWW), and other technologies. Recent reports estimate the doubling rate of the WWW as around 6 months [1].
One serious challenge presented by this volume of accessible information is organizing relevant information so that it is easily available for specific uses. As we encounter information, we often wish to remember it, e.g., with a bookmark on the WWW, or annotate it, e.g., with our opinion. We may wish to collect various information references together. As an example, a marketing analyst might collect references to web pages about relevant product descriptions as part of a marketing research project. We may wish to distinguish references to information according to various criteria: the references that support a particular point of view and those that argue against it.
In general, we see this problem as a need to provide explicit, semantically tagged guidance to a given universe of information. We present in this paper the concept of Structured Maps to provide various organized, customized supplements to existing Information Universes. The definition of Structured Maps is adapted from Topic Navigation Maps [2], defined by the SGML community.
1.1 An Analogy
Structured Maps share certain properties with conventional road maps. Consider the following excerpt from a road map [3], shown in Figure 1. The legend defines several types of entities, each with an iconic representation. Circles, in various sizes, represent cities and indicate their populations. An airplane silhouette represents an airport. Roads are indicated by various line styles, corresponding to the type of roads shown on the map. Other types of entities can appear on the map, as shown in the legend.
Figure 1. Excerpt From a Road Map: Map from the Road Atlas 93 1992 by Rand McNally, R.L. 96-S-158 (with permission)
Structured Maps can be viewed as an analogue of a conventional map, where the world being modeled is the space of online information, which we term the Information Universe. The counterpart of the legend is a definition, using a database-like schema, of the possible entity types and relationships between them. The main part of a Structured Map, corresponding to the graphical part of the road map, consists of instances of these entity and relationship types, where the entity instances have labels. Each entity instance is also explicitly connected to the item(s) in the Information Universe it represents. This aspect is in contrast to a conventional map, where the connection between an entity instance and its referent is implicit via similarity in geometry.
These ideas are illustrated by the Structured Map depicted in Figure 2. The definition of the map is shown as an Entity-Relationship model [4], in OMT notation [5], at the top of the figure. The middle part of the figure shows entity instances, drawn as rectangles with rounded corners as in OMT notation, and relationship instances, drawn as labeled lines between entity instances. The entity instances that appear in the Structured Map conform to the types that appear in the Structured Map Definition just as the symbols that appear on a road map are generally listed in the legend. The bottom layer of Figure 2 represents the Information Universe, consisting of one or more information sources. The lines from the middle layer of Figure 2 to the bottom layer represent the correspondence between entity instances and items in this universe.
For this example, entity types are Painter and Painting. The documents in the Information Universe are various books and articles about European artists. The only relationship type has the roles painter-of and painted-by, for the two ends of the relationship, with the obvious semantics. We see a connection between the artist, Gentile Bellini, and a paragraph in an encyclopedia that describes him. Similarly, we see a connection from "Miracles of the Relic of the Cross" to a paragraph in a guide book about art that describes the painting. Note this representation of a Structured Map is somewhat simplified, to support the analogy with road maps. A complete definition is given in Section 2. Also, Figure 2 is meant to show the various parts of a Structured Map; it is not the form in which a Structured Map is shown to users. (This example is adapted from one illustrating Topic Navigation Maps [6].) We envision that Structured Maps might be rendered in various ways. A road map, on the other hand, exists in a conventional, graphical rendering. The model behind the road map is partly explicit (in the symbols that appear in the legend) and partly implicit.
Figure 2. Simple Structured Map
As with conventional maps, the same "legend" (entity and relationship types in the Structured Map Definition) can be used for Structured Maps of different regions of the Information Universe, with the relevant entity instances and relationship instances of interest appearing in the Structured Map Instance. Likewise, it is possible to construct several Structured Maps over the same Information Universe, but with different entity and relationship types, just as there can be different kinds of conventional maps of the same region. As with conventional maps, a Structured Map can be more or less complete or accurate. There is no guarantee per se that entity instances in a Structured Map are linked to all occurrences of relevant items in the underlying documents. Also, whereas an entity instance on a conventional map typically represents a single real world occurrence, an entity instance in a Structured Map may be connected to multiple items in the Information Universe, and different kinds of connections may be distinguished. For painters, we might distinguish information items that provide biographical information from items that simply mention the painter. In contrast to conventional maps, which are typically of uniform scale, entity instances in a single Structured Map can be connected to multiple items in the underlying universe and they can be of different granularities. For example, an entity might link to an entire document, a section, a paragraph, a diagram, a caption or even a single word or phrase.
Structured Maps provide an abstraction of the universe of information and a basis for semantically rich navigation in that universe much like a road map provides an abstraction of the real world and a basis for navigation by car in the world slice that is represented by the map. On a road map, we can travel from a particular city to a particular park by selecting from among the available roads that interconnect them. In an analogous manner, we could navigate from a city to its artists (if the Structured Map included a City entity type and the Born-In relationship type) and then from the artist to his or her works of art. At each step along the way we can stop and see the available detail about the information objects that are referenced. On a car trip, we can stop in any city and see the sights, many of which are not shown on the map. For the Structured Map, we could, for example, browse the guide book document when we arrive at the "Miracles of the Relic of the Cross" instance of the Painting type during navigation. The guide book clearly has much more information than is represented in the Structured Map.
One significant difference between roads (on a map) and relationships (in a Structured Map) is that roads correspond to actual physical roads but relationship instances are represented electronically. In effect, the designer of a Structured Map can create new "road types" and build new "roads" simply by defining and populating a Structured Map. We observe that although roads provide a navigational path to non-road objects (much like relationships provide navigational paths to entities), the legend of a map is not drawn like an Entity-Relationship diagram (ERD). Rather the road-type and the non-road-type icons are simply listed in the legend of a road map. Another difference is that all entity types shown in a Structured Map Definition have a single, database-style attribute, often called Title. Titles appear on a road map (as a label beside the icon) but the legend of a road map does not generally indicate the possibility of the Title attribute. Finally, note that in order to browse a Structured Map, we require some sort of display or visualization of the underlying information. For a road map, we are accustomed to a conventional, two-dimensional graphical display with iconic representation.
This paper presents Structured Maps as a means to introduce explicit semantic structures over heterogeneous information sources. We see Structured Maps as distinct from and complementary to search engines that provide text searching and index capabilities. A full text search, for example, might identify every place where the string "Bellini" appears in a set of documents. Such a search is complete in that all places where "Bellini" is found will be indexed. A search is normally syntactic, lacking the ability to recognize places where the characters "B-e-l-l-i-n-i" appear but are not describing the painter of interest. Finally, a search is usually flat or homogeneous in the sense that all text can be searched and all references will point to the string "Bellini", in context.
Structured Maps, on the other hand, hold explicit references. They may or may not be complete. The references may be more accurate than those found by a search, assuming that only the actual descriptions of Bellini are referenced by the Structured Map. Structured Maps can also reference information items that discuss the painter but where the string "Bellini" does not actually appear. The items referenced from a Structured Map may be of any granularity and may have their own (local) semantics. A Structured Map can reference a chapter, a section, a paragraph, etc., as appropriate.
It is possible that a Structured Map could be populated with the results of a search [6]. The resulting Structured Map could then be edited, by a knowledgeable person, to drop insignificant or nonsense references or establish additional references or relationships, not found by a search.
We also see Structured Maps as distinct from links in the WWW. A Structured Map introduces entity types. In the WWW, there is no direct analog of entity types nor entity instances that are described separately from the HTML pages. Another distinction is that all references in a Structured Map, in relationship instances and in reference-valued attributes, are typed. This means that we know the reason why the link is present, e.g., to connect to works painted by the painter vs. works that inspired the painter.
Structured Maps exhibit some characteristics of a conventional database but also offer additional capabilities, based on the three-level model. The basic premise of this research is to bring the modeling and querying capabilities of a database to a three-level model where the Information Universe includes loosely-structured documents with their own internal/semantic structure.
1.2 Organization
Section 2 of this paper defines Structured Maps and presents the SGML/HyTime foundation for Structured Maps, the Topic Navigation Map Architecture. Section 3 describes our implementation of Structured Maps along with a discussion of issues that affect an implementation. Section 4 includes examples from a large-scale, corporate electronic library. Section 5 evaluates this work by comparing it with related topics in the database and digital library community. Section 6 concludes with a discussion of the contributions of this work and our current research plans.
2. Structured Maps
We present the general features of Structured Maps in this section. Figure 3 shows a more detailed version of the example in Figure 2. The primary difference between these two figures is in the references from the entity instances to the information elements in the universe. In the definition of the Structured Map (the ERD part), each entity can have zero or more named, reference-valued attributes and each such attribute can be single- or multi-valued. The simplifying assumption of Figure 2 was that each entity type had only one, single-valued reference attribute because icons on a road map generally correspond to a single object in the real world.
In Figure 3, the Painter entity type has two distinct reference-valued attributes: one to reference biographical information and the other to point to places where the artist is mentioned. The two attributes demonstrate the semantic distinctions that can be highlighted with a Structured Map. We can distinguish the role that the referenced information fills by using distinct reference-valued attributes. Figure 3 also demonstrates that individual references can be at various levels of granularity. Bellini has a "biography" reference that refers to an entire book "The Life and Art of Gentile Bellini" and a "mention" reference to a single paragraph in guide.sgm. Finally, Figure 3 demonstrates that an information element can be referenced by several reference-valued attributes.
A Structured Map has a three-level model, as shown in Figures 2 and 3: the Structured Map Definition, the Structured Map Instance (i.e., the populated instance of the Structured Map) analogous to the map, and the underlying universe of information.
Structured Map Definition - The Structured Map Definition follows the normal conventions of an ERD, except that it is limited to entity types and relationship types, e.g., there are no generalization or aggregation links. Each entity type includes an attribute definition that will hold the user-visible title or name for an entity instance. This attribute can be viewed as the label beside the icon on a road map. Each entity type also can define one or more reference-valued attributes. Such attributes hold, as values, addresses of information elements in the Information Universe, and can be single- or multi-valued. The envelope icon shown in the middle layer of Figure 3 is used to represent the address(es) that appear on the reference-valued attributes (in that envelopes have addresses on them). Relationship types can be defined among entity types.
Figure 3. Elaborated Structured Map
Structured Map Instance - The Structured Map is populated much like a conventional database instance that conforms to its schema. A Structured Map is much less concerned about the domain definitions or representations for (referenced) information compared to a database. The attribute values are either a simple title (often a character string) or references to arbitrarily represented information.
Information Universe Elements - An information element within a given information source in the Information Universe must be identifiable, addressable, and renderable. As an example, consider an SGML document [7, 8] where each structural element, such as title, abstract, section, or paragraph, is identified through the appropriate tags. Each element is addressable through the SGML ID attribute, provided that an ID attribute is permitted for that element type, as prescribed in the Document Type Definition (DTD) section for that element type. The ID attribute in SGML is reserved to hold the unique identifier (within this SGML document) for this element instance, e.g., this paragraph. Other SGML attributes in this same document may hold this identifier, e.g., to reference the paragraph. The SGML attribute holding the reference is declared to be of type IDREF. Each element is renderable through a variety of SGML-based tools that allow "style sheet" specification to describe the display [9]. SGML ID and IDREFs are not the only possible scheme for referencing document elements. One could use other schemes, such as byte ranges in files or object identifiers in a document database, including the addressing modes of HyTime [10].
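The three levels can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the representation used in our prototypes: the class names (EntityType, Address, Entity), the dereference function, and the toy UNIVERSE dictionary with its placeholder text are all hypothetical. It shows a definition that declares the reference-valued attributes of Painter, an instance that holds a title plus addresses, and addresses at two granularities (a whole document and a single element ID).

```python
from dataclasses import dataclass, field
from typing import Optional


# Level 1: Structured Map Definition (the "legend").
@dataclass(frozen=True)
class EntityType:
    name: str
    reference_attributes: tuple  # names of the reference-valued attributes


# An address names a document and, optionally, one element ID inside it.
@dataclass(frozen=True)
class Address:
    document: str
    element_id: Optional[str] = None  # None addresses the whole document


# Level 2: Structured Map Instance.
@dataclass
class Entity:
    entity_type: EntityType
    title: str
    references: dict = field(default_factory=dict)  # attribute -> [Address]

    def add_reference(self, attribute, address):
        # Only attributes declared in the definition may carry addresses.
        if attribute not in self.entity_type.reference_attributes:
            raise ValueError(
                f"{attribute!r} is not declared for {self.entity_type.name}")
        self.references.setdefault(attribute, []).append(address)


# Level 3: a toy Information Universe (placeholder text, not real content):
# document name -> {element ID -> element text}.
UNIVERSE = {
    "book": {"ch1": "(text of chapter 1)", "ch2": "(text of chapter 2)"},
    "guide": {"n39": "(paragraph mentioning Bellini)"},
}


def dereference(address, universe=UNIVERSE):
    """Return the addressed text: one element, or all elements of a document."""
    elements = universe[address.document]
    if address.element_id is None:
        return list(elements.values())
    return [elements[address.element_id]]


PAINTER = EntityType("Painter", ("biography", "mentioned"))
bellini = Entity(PAINTER, "Gentile Bellini")
bellini.add_reference("biography", Address("book"))          # whole document
bellini.add_reference("mentioned", Address("guide", "n39"))  # one paragraph
```

Keeping Address separate from the element text mirrors the indirection of the model: the instance stores only addresses, and the documents stay where they are.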
The modeling capability of a Structured Map is fairly elementary compared to most ERD models. But we are currently guided by the definition of the Topic Navigation Map, defined as part of the working group on the Conventions for the Application of HyTime (CApH). A Topic Navigation Map is represented as an SGML document [2]. The Topic Navigation Map uses the terms Topic, Topic Relation, Topic Title, and anchor role for the analogous terms of Entity, Relationship, Title, and reference-valued attribute in Structured Maps. As an example, the Topic Navigation Map SGML document that corresponds to the Topic Navigation Map of Figure 3 is shown in Figure 4. The Document Type Definition (DTD) for the Topic Navigation Map declares the desired Topic and Topic Relation types (i.e., the Entity and Relationship types, in Structured Map terminology) for the application. The content model for the document instance of the Topic Navigation Map consists of a disjunction of all Topic and Topic Relation types. This means that any number of Topic and Topic Relation instances can appear in any order in the document instance.
Topic Navigation Maps depend on the use of several HyTime constructs, particularly those for linking and addressing. HyTime, the Hypermedia Time-Based Structuring Language, is an ISO standard (ISO 10744:1992) [10] that extends the semantics of SGML but is expressed in ordinary SGML syntax. The independent link (ilink) construct of HyTime allows links to be created distinct from the referenced elements; that is, the ilink exists separately from the information elements that are being linked. Both topic instances and topic relation instances are implemented as ilinks. Our motivations for following the definition of Topic Navigation Maps are:
1. Topic Navigation Maps are currently being proposed as an ISO standard to provide multi-document indices, glossaries and table of contents [11]. This opens up the possibility that Structured Maps will have an ISO standard interchange format.
4. Topic Navigation Maps use the DTD to describe the structure of the document instance, analogous to the database schema and instance.
Topic Navigation Maps differ from databases in several important ways. First, there are few constraints on attribute values. As an example, a topic title is viewed as (SGML) content and thus can consist of anything expressible in the declared notation for content. Also, the current definition of Topic Navigation Maps (implicitly) allows a reference on a reference-valued attribute as well as a reference on a topic relation instance to refer to any SGML ID (including an ID for a topic instance, a topic relation instance, or an information item from the universe of information). For example, a Topic instance can refer to other Topic instances or even to Topic Relation instances on a reference-valued attribute. Second, a Topic Relation instance is not constrained to connect topic instances of a prescribed type; it can connect entity instances of any type. As a silly but legal example, painted-bypainter-of could connect two painters.
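The kind of constraint that a database-style view adds, and that Topic Navigation Maps leave open, can be sketched as a simple role/type check. This is a hedged illustration only: the dictionary layout, the function check_relationship, and the assignment of the role painted-by to Painting and painter-of to Painter are assumptions made for the example, and the second painter's title is invented.

```python
# Hypothetical declaration of one relationship type:
# role name -> entity type required to fill that role (assumed assignment).
PAINTED_BY_PAINTER_OF = {"painted-by": "Painting", "painter-of": "Painter"}


def check_relationship(declared_roles, instance):
    """Return a list of problems found in one relationship instance.

    `instance` maps each role name to an (entity type, title) pair.
    An empty list means the instance conforms to the declaration.
    """
    problems = []
    for role, required_type in declared_roles.items():
        if role not in instance:
            problems.append(f"missing role {role}")
            continue
        actual_type, title = instance[role]
        if actual_type != required_type:
            problems.append(
                f"{role}: expected {required_type}, got {actual_type} ({title!r})")
    return problems


well_typed = {
    "painted-by": ("Painting", "Miracles of the Relic of the Cross"),
    "painter-of": ("Painter", "Gentile Bellini"),
}
# The "silly but legal" case: both ends are painters.
two_painters = {
    "painted-by": ("Painter", "Some Other Painter"),  # invented title
    "painter-of": ("Painter", "Gentile Bellini"),
}
```

Under a Topic Navigation Map both instances would be accepted; a schema-enforcing Structured Map would reject the second one.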
The CApH Committee is currently considering extensions to the current definition of Topic Navigation Maps to provide more structured modeling [11], at least as an option. We are currently following a database-style view of the Structured Map model although we leave open the possibility that such rigid structure may not be specified in every Structured Map Definition. Finally, we note that Topic Relations may be of any degree, binary, ternary or higher. One current implementation of Topic Navigation Maps supports only binary relations because they are easy to display for the user [2]. Structured Maps support n-ary relationships, without attributes.
As a database model, Structured Maps can be viewed in several ways. One (overly) simplified view would be to consider it as an Entity-Relationship model where attribute values can be any sort of information item. This view ignores the use of addresses on the reference-valued attributes and de-emphasizes the use of information items in situ. A closely related view is to consider the Structured Map model as an Entity-Relationship model with pointers or reference-valued attributes. We see the indirection provided by addressing as a key feature of Structured Maps. We include it as an explicit part of the conceptual model, as evidenced by our description of Structured Maps as a three-level model. We are particularly interested in the semantics of a query language for Structured Maps. Structured Maps can also be viewed as a binary model where entity instances are represented (only) as an identifier (e.g., an OID) and both relationships and reference-valued attributes are binary relations. Finally, note that the title for an entity could equivalently be represented as a reference-valued attribute where the title(s) are supplied in an additional document in the Information Universe. This approach of referencing an additional document can be used to provide annotation or other documentation for entities in a Structured Map.
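The binary view can be illustrated with plain sets of pairs. In the sketch below, the identifiers e1 and e2 stand in for OIDs, the address strings such as "art#n9" use an invented notation, and the helper names image and compose are hypothetical; the point is only that titles, relationships, and reference-valued attributes all take the same shape.

```python
# Entities are bare identifiers (stand-ins for OIDs).
# Titles, relationships, and reference-valued attributes are all binary relations.
title = {("e1", "Gentile Bellini"),
         ("e2", "Miracles of the Relic of the Cross")}
painter_of = {("e1", "e2")}                            # painter -> painting
biography = {("e1", "book")}                           # entity -> address
mentioned = {("e1", "art#n9"), ("e1", "guide#n39")}    # multi-valued attribute


def image(relation, x):
    """All values related to x: follow one binary relation from x."""
    return {b for a, b in relation if a == x}


def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for a, b in r for b2, c in s if b == b2}


# Titles of the paintings painted by e1, obtained by composing two relations.
painting_titles = image(compose(painter_of, title), "e1")
```

Because every construct is a binary relation, navigation across a relationship and navigation down to an address become the same operation.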
In the Topic Navigation Map of Figure 4, each Topic Type is declared as a HyTime ilink through the %topic parameter entity; each Topic instance has a Topic title (shown in the content model for each Topic declaration). Each Topic Type declares the names of its anchor roles through the value of the anchrole (SGML) attribute. By convention, the first entry in the anchrole attribute value is the Topic type followed by the names of the anchor roles. The #AGG following an anchor role name means that the anchor role is multi-valued. As an example, the Painter topic type in Figure 4 has an anchor role for "biography" and an anchor role for "mentioned". Each Topic instance has a corresponding set of zero or more addresses for each anchor role. As an example, the Painter instance for Gentile Bellini (with SGML ID = painter-GentileBellini) provides one SGML ID for each anchor role through the values of the linkends (SGML) attribute. This one SGML element (with this ID) consists of the list of addresses. Thus, adr-GentileBellini-mention is the SGML ID for the element that contains the list of addresses, each of which references an information item in the Information Universe. In HyTime, the document identifier is given (in the docorsub SGML attribute). The referenced information element in this case is the information element with the stated ID in the referenced SGML document in the underlying universe of information where Bellini is mentioned. As an example, the Painter instance for Gentile Bellini has the address for the entire "book" SGML document on the linkend for the "biography" anchor role and the ID n9 in the "art" SGML document as well as the ID n39 in the "guide" SGML document on the linkend for the "mentioned" anchor role. The way in which information references can appear is defined as part of the various HyTime addressing modes, including address by name, address by location, etc.
----tm.sgm
...... (standard SGML declaration) ........
<painted_bypainter_of linkends="painting-MiraclesoftheRelicoftheCross painter-GentileBellini">
</document>
Figure 4. Excerpt from a Topic Map SGML Document
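The anchrole convention described above (Topic type first, then anchor role names, with #AGG marking the preceding role as multi-valued) can be sketched as a small parser. This is an illustration under assumptions: the function name parse_anchrole is hypothetical, and the attribute value used in the example is invented for the purpose, since the Painter declaration itself is not reproduced in the excerpt.

```python
def parse_anchrole(value):
    """Split an anchrole attribute value into (topic type, anchor roles).

    Assumes the convention described in the text: the first token is the
    Topic type, the remaining tokens are anchor role names, and a "#AGG"
    token marks the role name before it as multi-valued.
    Each role is returned as a (name, is_multi_valued) pair.
    """
    tokens = value.split()
    if not tokens:
        raise ValueError("empty anchrole value")
    topic_type, rest = tokens[0], tokens[1:]
    roles = []
    for token in rest:
        if token == "#AGG":
            if not roles:
                raise ValueError("#AGG must follow an anchor role name")
            name, _ = roles[-1]
            roles[-1] = (name, True)   # mark the preceding role multi-valued
        else:
            roles.append((token, False))
    return topic_type, roles


# Invented example value: "mentioned" is multi-valued, "biography" is not.
topic_type, roles = parse_anchrole("Painter biography mentioned #AGG")
```

The resulting pairs correspond directly to the single- or multi-valued reference-valued attributes of the Structured Map Definition.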
Structured Maps provide an alternative way to express Topic Navigation Maps using the language semantics and technology of database systems. Key features of Topic Navigation Maps that are preserved in Structured Maps are:
|
One of the major advantages of Structured Maps, based on the database semantics, is the query capability over the database extension. The challenges of Structured Maps, from a database point of view, are to define the semantics of the structural model as well as the query language for the three-level system and to implement the referenced information elements in a general-purpose manner.
3. Implementation of Structured Maps
Our first prototype was designed to show the feasibility of representing the Topic Navigation Map definition and instance in a relational database.
Figure 5 gives an overview of the first prototype. We have demonstrated the transfer of information from an SGML Topic Navigation Map document to a database and, in the reverse direction, the transfer of Structured Map instance information back into SGML documents.
This first prototype used a commercial product, called EnLIGHTeN [6], as the baseline for the exercise. EnLIGHTeN supports a browser for Topic Navigation Maps over SGML documents. At the heart of the EnLIGHTeN system is a proprietary HyTime engine called HyMinder [12]. With EnLIGHTeN, the Topic Navigation Map SGML document plus all of the referenced SGML documents in the underlying Information Universe are parsed by an SGML parser and then stored in HyMinder. HyMinder validates the HyTime constructs and then supports their semantics during runtime. As an example, HyMinder supports bi-directional navigation across each Topic or Topic Relation ilink and resolves all addresses that appear on the reference-valued attributes.
We issued SQL queries against the Informix database and were able to freely join across relationships. This facility is useful for query answers that consist only of entity titles. Sometimes a query answer included addresses (from the reference-valued attributes) in the form of HyTime addresses referencing the SGML documents from the underlying Information Universe. But there was no direct support in our prototype for interpreting, dereferencing, or displaying the contents of these addresses. We could do matching for equality on addresses but we were not able to navigate to or otherwise interpret the referenced information elements in the underlying SGML documents. (This limitation is in direct contrast to EnLIGHTeN, which provides a fully integrated Topic Navigation Map browser with a built-in SGML browser for the underlying documents, based on the centralized representation of the Topic Navigation Map and the documents in HyMinder.)
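The two kinds of query answers can be illustrated with a small relational sketch. The code below is not the prototype's schema: it uses Python's sqlite3 module as a stand-in for Informix, and the table names (entity, painted, ref_attr), the function names, and the address strings such as "art#n9" are all assumptions made for the example. The first query joins across a relationship and returns titles; the second returns addresses, which remain opaque strings that can be compared but not dereferenced.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical Structured Map schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entity   (id TEXT PRIMARY KEY, type TEXT, title TEXT);
        CREATE TABLE painted  (painter_id TEXT, painting_id TEXT);
        CREATE TABLE ref_attr (entity_id TEXT, attribute TEXT, address TEXT);
    """)
    conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", [
        ("e1", "Painter", "Gentile Bellini"),
        ("e2", "Painting", "Miracles of the Relic of the Cross"),
    ])
    conn.execute("INSERT INTO painted VALUES ('e1', 'e2')")
    conn.executemany("INSERT INTO ref_attr VALUES (?, ?, ?)", [
        ("e1", "biography", "book"),
        ("e1", "mentioned", "art#n9"),      # invented address notation
        ("e1", "mentioned", "guide#n39"),
    ])
    return conn


def paintings_by(conn, painter_title):
    """Join across the relationship; the answer consists only of titles."""
    rows = conn.execute("""
        SELECT p.title
        FROM entity a
        JOIN painted r ON r.painter_id = a.id
        JOIN entity p  ON p.id = r.painting_id
        WHERE a.type = 'Painter' AND a.title = ?
        ORDER BY p.title
    """, (painter_title,))
    return [row[0] for row in rows]


def addresses(conn, entity_title, attribute):
    """Return the addresses on one reference-valued attribute, as stored."""
    rows = conn.execute("""
        SELECT r.address
        FROM entity e
        JOIN ref_attr r ON r.entity_id = e.id
        WHERE e.title = ? AND r.attribute = ?
        ORDER BY r.address
    """, (entity_title, attribute))
    return [row[0] for row in rows]


conn = build_demo_db()
```

A call such as addresses(conn, "Gentile Bellini", "mentioned") yields address strings only; interpreting them requires a component outside the database, which is the limitation noted above.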
3.1 The CARTE System
We considered using HyMinder for a more sophisticated prototype, but opted instead to further explore the use of conventional database technology because HyMinder uses a proprietary data structure and does not focus on query processing and schema management. Our second prototype, called CARTE 1.0 (for Context-Assisted Restructuring via Topical Extensions), also uses Informix as the repository for Structured Map Definition and Instance information. Like its predecessor, CARTE does not store the documents from the underlying Information Universe. One difference in CARTE, compared to our first prototype, is that we used HTML pages as the underlying information sources in the Information Universe rather than SGML documents. We used URLs and the NAME HTML attribute to formulate addresses for the reference-valued attributes in the Structured Map.
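One plausible way to combine a URL with a NAME anchor is the standard fragment syntax, sketched below. This is an assumption about the address format, not a description of how CARTE stores addresses; the function names and the example.com URL are hypothetical.

```python
from urllib.parse import urldefrag


def make_address(page_url, anchor_name=None):
    """Address a whole page, or a named anchor (<a name="...">) inside it."""
    if anchor_name is None:
        return page_url
    return f"{page_url}#{anchor_name}"


def split_address(address):
    """Split an address back into (page URL, anchor name or None)."""
    url, fragment = urldefrag(address)
    return url, (fragment or None)


# Hypothetical page and anchor, echoing the n39 paragraph of the guide example.
paragraph = make_address("http://example.com/guide.html", "n39")
whole_page = make_address("http://example.com/book.html")
```

An address without a fragment refers to the entire page, which gives the same two granularities (document and element) as the SGML ID scheme.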
Figure 6 presents a screen image of the CARTE system. CARTE provides a browser for the database containing a Structured Map. We use the multiple frame capability of Netscape 2.0 to present three different, synchronized frames. Within CARTE, the upper-left frame shows the schema (in OMT notation). The right frame in CARTE displays the instance information from the Structured Map. In Figure 6, the instance frame is showing all of the available entity types, in list form. This is the initial instance screen contents, when the user begins viewing a Structured Map, and has not yet selected an entity type or navigated to a particular entity instance. When the user clicks on one of the entity types, the instance screen shown on the right side of Figure 7 is presented. The entity type is shown followed by the titles of all entity instances of that type. The user can click in this frame to navigate to an individual entity. Such a selection results in the instance screen shown in Figure 8, with one entity title listed with all its available relationships (to navigate from one entity to another) and reference-valued attributes (to navigate to an underlying HTML page). The example shown here is adapted from one developed for EnLIGHTeN [6].
| CARTE Screen |
![]() | |
| Figure 6. Structured Map |
HTML, Hypertext Markup Language ![]() | When a user navigates across a relationship, the user sees the entity instance screen for the target instance. When a user navigates to any of the addresses appearing on any of the reference-valued attributes, another Netscape process is invoked to view the underlying, referenced HTML page, as shown in Figure 9. At any time while the second Netscape browser is operational, the user can return to the CARTE interface and proceed with other navigational steps. At present, we do not support inverse links for the reference-valued attributes. That is, it is not possible to navigate from the underlying HTML page "upwards" into the Structured Map from an information element whose address appears in one or more reference-valued attributes. This implementation of CARTE uses the underlying HTML page in situ. The underlying HTML pages have no knowledge of the Structured Map. |
HTML, Hypertext Markup Language ![]() | The lower-left frame in the main CARTE screen is intended to give the user a sense of "you are here" during navigation by describing the current context. The context frame appends an informational message to a scrolling list each time the user takes a navigational step. Thus, the context frame shows the progression, for example, from an entity instance, across a relationship instance, to another entity instance, down to an underlying HTML page, and so forth. |
| CARTE Screen |
![]() | |
| Figure 7 |
Structured Map ![]() | CARTE provides a navigational browser for Structured Maps. The Informix database for CARTE contains the Structured Map Definition and Instance as well as the addresses appearing on the reference-valued attributes. Each user action taken in the CARTE interface results in an SQL query being issued to the Informix database, followed by the appropriate presentation of information in the CARTE frames. The three CARTE frames are always synchronized with the latest user action. |
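The query-per-action pattern can be sketched as follows. This is an illustration under stated assumptions, not the CARTE schema: the table and column names (`entity`, `relationship`, `ref_address`), the entity type `Entry`, the attribute name `mentioned-in`, and the example address are all hypothetical, and SQLite from the Python standard library stands in for Informix.

```python
import sqlite3

# Hypothetical relational layout for a Structured Map Instance.
SCHEMA = """
CREATE TABLE entity (id INTEGER PRIMARY KEY, etype TEXT NOT NULL, title TEXT NOT NULL);
CREATE TABLE relationship (source_id INTEGER, rel_type TEXT, target_id INTEGER);
CREATE TABLE ref_address (entity_id INTEGER, attr_name TEXT, address TEXT);
"""


def open_demo_map() -> sqlite3.Connection:
    """Create an in-memory map with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entity VALUES (?, ?, ?)",
        [(1, "Entry", "Internal Processes"), (2, "Entry", "Admin Services")],
    )
    conn.execute("INSERT INTO relationship VALUES (1, 'has-subentries', 2)")
    conn.execute(
        "INSERT INTO ref_address VALUES (2, 'mentioned-in', "
        "'http://example.com/admin.html#cards')"
    )
    return conn


def entity_types(conn):
    """Initial instance screen: list all entity types."""
    rows = conn.execute("SELECT DISTINCT etype FROM entity ORDER BY etype")
    return [row[0] for row in rows]


def entities_of_type(conn, etype):
    """User clicked an entity type: list the titles of its instances."""
    query = "SELECT id, title FROM entity WHERE etype = ? ORDER BY title"
    return conn.execute(query, (etype,)).fetchall()


def entity_screen(conn, entity_id):
    """User clicked an instance: its relationships and reference addresses."""
    title = conn.execute(
        "SELECT title FROM entity WHERE id = ?", (entity_id,)
    ).fetchone()[0]
    relationships = conn.execute(
        "SELECT r.rel_type, e.title FROM relationship r "
        "JOIN entity e ON e.id = r.target_id "
        "WHERE r.source_id = ? ORDER BY r.rel_type, e.title",
        (entity_id,),
    ).fetchall()
    references = conn.execute(
        "SELECT attr_name, address FROM ref_address "
        "WHERE entity_id = ? ORDER BY attr_name, address",
        (entity_id,),
    ).fetchall()
    return {"title": title, "relationships": relationships, "references": references}


if __name__ == "__main__":
    conn = open_demo_map()
    print(entity_types(conn))
    print(entities_of_type(conn, "Entry"))
    print(entity_screen(conn, 1))
```

Each function corresponds to one kind of navigational step, so the three frames can be refreshed from the result of a single call.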
| CARTE Screen |
![]() | |
| Figure 8 |
3.2 Discussion |
Information Universe ![]() Structured Map ![]() | A key feature of Structured Maps is the delegation of responsibility for identifying and rendering information from the underlying universe. Structured Maps are distinct from hypermedia database systems [14,15] because the hyperdocument created by a Structured Map is layered over existing information, in situ. It does not require modifications of base documents, such as the insertion of URLs, to create connections among those documents. The bridge between the underlying Information Universe and the Structured Map is the addressing mechanism. Figure 10 summarizes the features of the Topic Navigation Map specification, the EnLIGHTeN product, and our two prototypes. Each aspect is listed in the left column of Figure 10 and discussed below. |
Structured Map ![]() | Any implementation of Structured Maps must deal with the addresses of identified information elements for two different purposes. First, when populating a Structured Map, it is necessary to select information elements of interest and place their addresses on the selected reference-valued attribute for the proper entity instance. Second, for browsing a Structured Map, any traversal of a reference-valued attribute must present the address for interpretation and rendering of the information element. Tools for authoring and browsing may take various approaches for establishing and interpreting addresses. |
Structured Map ![]() | To populate a reference-valued attribute (during Map creation or modification), one possibility is to generate the references using an automatic indexing technique (that might be adjusted by a knowledgeable human user). Another possibility is to mark information elements in the underlying universe with the entity type, entity instance title, and reference-valued attribute name. This approach would allow the establishment of an address on the proper reference-valued attribute for the proper instance; it has been implemented in EnLIGHTeN [6]. Yet another possibility would be to support visual display of both the Structured Map and the information sources with easy point-and-click identification of referenced information elements. Finally, it is always possible to create Structured Maps by hand, including the placement of addresses for reference-valued attributes. The support that the various systems offer for map creation is shown on line 1 of Figure 10. |
HTML, Hypertext Markup Language ![]() | CARTE Screen - Second Netscape Window (to view underlying HTML) |
![]() | |
| Figure 9 |
Structured Map ![]() | A closely related issue, relevant to any browsing or navigational capability, is: How will the information be viewed? It is not immediately obvious how various users or application domains might want to see Structured Map Definitions and Instances. For CARTE, we present the schema (i.e., Structured Map Definition) explicitly in the user interface. Perhaps because of our history of working with databases, it seems quite natural for us to show the navigation paths through the schema as well as the instance of the Structured Map. Note that EnLIGHTeN currently does not show the schema to the user. There is also the issue of how to present instance information for the Structured Map Instance. In CARTE, we generate lists of the possible traversals that are appropriate at each navigational step. For an entity instance, we list all available relationship traversals, by type, and all addresses that appear in reference-valued attributes, by attribute name. In general, we can imagine a set of tools that allow an interface designer to freely configure the way in which Structured Map Instance information is displayed. The second line of Figure 10 summarizes the current display choices of the various systems. |
| Choices Concerning Implementation Issues |
| Figure 10 |
Information Universe ![]() | Lines 3 through 5 of Figure 10 indicate the current choice of the various prototypes regarding the method used to address, identify, and render the information elements from the Information Universe. Perhaps the most important point is that the various HyTime addressing modes are standard, by virtue of the fact that HyTime is a standard (ISO 10744:1992). This standardization enables the delegation of the identification and rendering of information elements via shared addresses. Note also that SGML implicitly provides a way to identify information elements, through markup, and a way to address them by name, through the SGML ID attribute. |
CORBA, Common Object Request Broker Architecture ![]() Structured Map ![]() | The final two lines of Figure 10 deal with the heterogeneity of underlying information and the type of integration between the Structured Map and the underlying information sources. The heterogeneity of information is unlimited, at the conceptual level. Any type of information that can be meaningfully identified, addressed and rendered can participate in a Structured Map. The challenge of heterogeneity has to do with the availability of an addressing mechanism and the connection, at runtime, with the independent technology responsible for the rendering. This second challenge might be addressed through various interoperation models such as CORBA [16] and COM/OLE [17]. |
Entity ![]() Entity-Relationship Model ![]() Information Universe ![]() Relationship ![]() Structured Map ![]() WWW ![]() | Finally, the issue of loose or tight integration of the Information Universe with the Structured Map presents all of the classic tradeoffs between a tight or loose federation. A tightly integrated, centrally managed repository can offer the advantages of conventional database technology such as concurrency control during update, optimization for access, query optimization, etc. Also, bi-directional links can be more easily maintained, with their integrity ensured. But the scalability of such systems may be limited. A loose integration offers the advantage of autonomy for the underlying information sources. Such a choice is particularly appropriate for an environment where we have access to information that we do not own, e.g., on the WWW. The use of Structured Maps in this environment provides a rich mechanism for foreground information. A Structured Map can be viewed as a complex bookmark, with the structuring capability of an Entity-Relationship model. |
Digital Library ![]() Structured Map ![]() | 4. Structured Maps in Digital Libraries |
SCEL ![]() Sequent Corporate Electronic Library ![]() Structured Map ![]() | While we have defined a particular formulation of Structured Maps, similar constructs are already in use in digital libraries. In this section, we highlight portions of a particular digital library, the Sequent Corporate Electronic Library (SCEL), that resemble a Structured Map. These portions of SCEL are currently hard-coded into Web pages. One of our research goals is to provide a more automated and flexible means to construct this portion of a corporate digital library. |
SCEL ![]() | Highest Table of Contents Level Screen in SCEL (sketch of actual page) |
![]() | |
| Figure 11 |
HTML, Hypertext Markup Language ![]() Intranet ![]() SCEL ![]() Sequent Corporate Electronic Library ![]() | SCEL is an intranet-based system that provides access to a rich and varied set of corporate information resources to over 2,500 Sequent employees. The system has been operational for about 18 months and the scope and utility of the resource have grown steadily. SCEL is also used to manage routine requests for services through the forms interface of HTML. SCEL is entirely implemented in HTML, relying on Web browsers to provide a uniform, easy-to-use, easy-to-learn interface. But there are a number of striking analogs of Structured Maps present in SCEL. |
SCEL ![]() | One of the top-level pages in SCEL is shown in Figure 11. This screen presents a number of navigational choices; all of the labels shown here are clickable to proceed to more information. SCEL users can view this screen as a visually-displayed table of contents, with entries for Sequent, Suppliers, Offerings, Partners, Channels, and Market. There are also subentries for Sequent, consisting of History and Values, Internal Processes, Organization Charts, Library Employee Services, and Education. The Market entry is further subdivided into Customers and Competitors. |
| If we click on Internal Processes we see the page shown in Figure 12. |
Internal Processes -- How do we do it? |
| * Admin Handbook On-Line |
| This handbook provides administrative assistants with the most up-to-date information to assist them in performing their duties. |
| * Admin Services |
| Business Cards, Name Tags and Name Plates, Pagers, Phone Cards, Scheduling Conference Calls, Scheduling Video Conferences |
| * American Operations |
| AO Field IS, Complex Issue Escalation, Courtesy Visits, Customer Visits, Executive Briefing Centers, Field Office Guidelines, Major Account Plan |
| * Beaverton Facilities Operations |
| Admin Services, Card Keys, Dry Cleaning and Laundry Services, Facilities Maintenance, Move and Furniture Services, Recycling, Security |
| * Benchmarking Guidelines |
Data ![]() Database ![]() | * Corporate Marketing |
| Complex Systems Consultancy Service, Customer Information Database, ImagePIX Program, Ordering Literature, Photo Library, Poster Program, Price Book, Sales Proposal Builder, System Competency Center Task Creation |
| * Global Marketing Home Page |
| Guide to Partner Management |
SCEL ![]() | Second Level Table of Contents in SCEL (sketch of actual page) |
![]() | |
| Figure 12 |
| This page shows a classical nested table of contents structure, with levels 3 and 4 shown. Levels 1 and 2 of the table of contents are shown in Figure 11. The listed entries in Figure 12 include "Admin Handbook On-Line" and "Admin Services". Each has a number of subentries such as "Business Cards" and "Name Tags" for "Admin Services". Each of these entries leads to either more detailed levels of the table of contents or directly to information sources of various kinds. |
SCEL ![]() Structured Map ![]() | We see this nested table of contents metaphor as a powerful, recurring structure intended to organize access to information. One possible Structured Map Definition capturing the same information is shown in Figure 13. |
![]() | |
| Figure 13. |
| This structure has also been implemented in CARTE. The schema in Figure 13 is simple but it represents the generic nature of a nested hierarchy used as a table-of-contents. The entity title is used to contain the table-of-contents entry name. The has-subentries relationship captures the hierarchical structure of table-of-contents entries, and the reference-valued attribute leads off to the referenced document elements. Although this example uses just one reference-valued attribute, it is possible to have more than one, for different purposes. This structure is actually a bit more general than what is present in SCEL. For example, a table-of-contents entry could lead to a set of references, and these references can be subelements of a document. Such a Structured Map could be visualized in multiple ways, such as the generic interface in Figure 6 or a more specialized interface as used by SCEL. Sophisticated displays as used by SCEL would require additional styling and layout tools. |
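The generic table-of-contents schema described above can be sketched as a small data structure. This is a minimal illustration, not the CARTE representation: the class name `TocEntry`, its field names, the helper `walk`, and the example address are hypothetical; the entry titles are taken from the SCEL page in Figure 12.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TocEntry:
    # Entity title: holds the table-of-contents entry name.
    title: str
    # Reference-valued attribute: addresses of referenced document elements.
    references: List[str] = field(default_factory=list)
    # has-subentries relationship: the nested, hierarchical entries.
    subentries: List["TocEntry"] = field(default_factory=list)


def walk(entry: TocEntry, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (nesting depth, title) pairs in table-of-contents order."""
    yield depth, entry.title
    for sub in entry.subentries:
        yield from walk(sub, depth + 1)


if __name__ == "__main__":
    toc = TocEntry(
        "Internal Processes",
        subentries=[
            TocEntry(
                "Admin Services",
                subentries=[
                    # The address below is a made-up placeholder.
                    TocEntry("Business Cards", references=["http://example.com/cards.html"]),
                    TocEntry("Pagers"),
                ],
            )
        ],
    )
    for depth, title in walk(toc):
        print("  " * depth + title)
```

Keeping `references` as a list reflects the point that an entry may lead to a set of references; a second reference-valued attribute would simply be another list field.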
Structured Map ![]() | There is another view of Figure 11 that suggests a different metaphor. With the exception of the entries inside the box labeled Sequent, Figure 11 represents different entity types as found in an ERD. There are even some relationships suggested, such as Channel between Offerings and Market. Figure 11 represents an iconic display for a Structured Map Definition. It also demonstrates several choices for visualization. The Sequent entries are placed inside the icon in list form. You could even imagine a scrolling list inside an icon. The other entity types are seen by clicking on the icon. When defined as a Structured Map, the information organization and navigation can reflect the rich structure, perhaps navigating from partners who are also customers or partners or suppliers who also serve as channels. Figure 11 thus suggests some of the potential for introducing entity and relationship types and instances over an underlying set of information sources. |
SCEL ![]() Structured Map ![]() | Note that Figure 11 presents a high-level picture of the value chain for Sequent. A value chain represents a view of the organization highlighting the suppliers and customers, at any level. The value chain was chosen as the highest level view of Sequent, in part, because it provides an easy way to place documents (into SCEL) and an easy way to find them. We are currently exploring the possibility of representing the value chain, in a similar form, at lower levels in the Sequent organization and reflecting it in a Structured Map. |
5. Related Work and Evaluation |
Entity-Relationship Model ![]() Structured Map ![]() Topic Map ![]() Topic Navigation Maps ![]() | The most closely related work is clearly the definition of Topic Navigation Maps, based on HyTime and SGML. The Topic Navigation Map provides the conceptual foundation for Structured Maps. The contribution of the Topic Navigation Map, as an emerging ISO standard, is to provide a precise syntax and an interchange format for Structured Maps. The semantics of Topic Navigation Maps derive from SGML and HyTime in that the choice of certain SGML or HyTime elements implies certain semantics. The use of database technology to support Structured Maps brings a long tradition of ERD modeling, clear semantics for collections of instances, and well-defined query languages with query optimization techniques. It also opens up the possibility of managing the updates to a Structured Map using DBMS support. |
Federated Database Structured Map ![]() | One related area of work in the database field is federated databases, where autonomous databases can be viewed conceptually as a single database with a single (integrated) schema [18,19]. In the case where each participating underlying information source is a database with a schema and where the Structured Map Definition corresponds to a global schema, a Structured Map is a federated database. Our work focuses on three aspects of Structured Maps that are outside of the main focus of federated database research: (1) the introduction of more than one Structured Map, where the entity and relationship types have not necessarily been captured in that form elsewhere, (2) the use of Structured Maps over loosely structured information sources such as documents, spreadsheets or video, and (3) the use of Structured Maps over in situ information where an addressing mechanism bridges the two different implementation environments. |
Entity ![]() Entity-Relationship Model ![]() Information Management Intelligent Integration of Information Architecture Relationship ![]() Structured Map ![]() | Another area related to this work is the Intelligent Integration of Information (I3) Architecture with intelligent mediators introduced to facilitate interaction of various information sources and services [20]. The I3 program is much more ambitious in its goals and in its techniques, employing intelligent agents to analyze information sources, for example. Structured Maps adopt a simplified Entity-Relationship model, with the focus on explicit representation of information and connections, much like traditional database systems. Structured Maps could conceivably be exploited in the I3 architecture because of the additional semantics they provide. Finally, Structured Maps focus on autonomous systems for information management, but with a simple, well-defined interface to identify, address, and render information. |
Database View Structured Map ![]() | Structured Maps provide a type of database view, a materialized view [21]. They differ from conventional views because they may introduce entities and relationships that are not present explicitly in any of the underlying information sources. Another difference is that a Structured Map presents a "view" in the form of a database. So it is somewhat similar to database-structured query answers [22]. |
| Within the digital library research community, e.g., [23, 24], there is little focus explicitly on the conceptual model for information. Much work has focused on various aspects of searching such as the user interface and the performance. Some work has focused on spatial modeling, representation and searching, e.g., [25, 26, 27]. |
DTD, Document Type Definition ![]() Information Universe ![]() SGML Database ![]() Structured Map ![]() | Another related area of research comprises efforts to store SGML documents directly in a database, capturing the DTD structure explicitly in the schema, e.g., [28]. Such work is complementary to this research; we do not consider the modeling or the representation of SGML documents. An SGML database could offer an appropriate repository for the SGML items in the Information Universe and could support a centralized approach to Structured Maps. SGML databases do not explicitly set forth a three-level model such as Structured Maps. Another research effort deals with non-SGML structured text in a digital library [29]. |
6. Conclusions and Future Work |
Entity-Relationship Diagram ![]() Structured Map ![]() Topic Map ![]() | Structured Maps provide a new level of modeling that can be introduced over various information sources. The form of the model, an ERD, is familiar and ubiquitous. This similarity suggests that the utility of the model and the application of conventional database tools and technology are both likely to be high. We focus in our work on the implications of the three-layer model. Work in progress is considering the semantics of a query language for Structured Maps. A particular issue is how to provide support for queries that span the Structured Map and the underlying document content. Other challenges include the interaction among multiple Structured Maps as well as the interface between a Structured Map and an underlying database schema. Both of these topics are related to schema integration but may differ in the details because of the simplicity of Topic Maps, essentially without conventional attributes and without concern for the representational details of the underlying information. This avoidance of representational focus (and the accompanying reduction in representational conflict) is facilitated by the delegation of the responsibility for identification, addressing and rendering. |
Information Universe ![]() Topic Map ![]() | A Topic Navigation Map is defined as an SGML document, which serves as a standard data interchange format. But the standard does not specify any tools for, or uses of, Topic Navigation Maps. There are neither operational semantics nor a prescribed usage model for Topic Navigation Maps. EnLIGHTeN has pursued one usage model: browsing a Topic Navigation Map along with browsing the information sources in the Information Universe in one integrated tool. In this research we are pursuing a slightly different usage model, where the interpretation and browsing of information sources take place in separate tools and where database-style queries are supported in addition to browsing. |
DDL Information Universe ![]() Querying ![]() Structured Map ![]() Topic Map ![]() | The language associated with Structured Maps is quite non-traditional. The foundation for Structured Maps, the Topic Navigation Map SGML document, provides a standard syntax for representing the DDL, instances, and addresses for reference-valued attributes, albeit buried in the cumbersome syntax of SGML. But it does provide a standard data interchange format that benefits fro |