[topicmapmail] topicmaps for bibliographic databases?

Guenther Neher g.neher@gmx.de
Sat, 13 Apr 2002 16:48:43 +0200


Hello,

this is my first message to this group and it's a bit lengthy;
sorry for that, it won't become the rule :-)

(you may skip the following paragraph, it's just to
give a background for my questions below):

<background>
I work for a small faculty of library and
information science.

We have been working for quite some time on user interfaces
to information retrieval systems (esp. bibliographic databases).
We use clustering techniques (based on document similarity)
to automatically generate, on the fly:

(a) context-related terms for query refinement
    (as the search-engines http://www.northernlight.com 
     or http://www.teoma.com do for example)

(b) thematically structured "information spaces" in 3D
    that the user can fly through to explore the search results
    (by the way: our experience is that 3D navigation is a
     real challenge for most users :-))

Our "structuring" algorithm is purely statistical in nature
and based on hand-crafted document descriptions (as is typical
for bibliographic databases).

What we now follow with much interest are the
emerging technologies (topic maps and RDF/DAML+OIL) evolving in
the context of the "semantic web".

We think that the standards developed in this field may also
have great impact on "classic" information retrieval systems
and on information providers like libraries and producers of
bibliographic databases.

There are 2 aspects we think are especially interesting for our field
(user interfaces to information retrieval systems):

(1) the vision of progressing from "boolean retrieval" to "ontology-based
retrieval"

(2) the vision of being prepared (by using XML-based coding standards) to
    "tie" the knowledge stored in our database to knowledge stored in
    databases elsewhere. Here I have in mind the (probably naive) picture
    of tying two ontologies together via one term that means the same
    thing in both (documented by pointing to the same PSI or URI), and
    thus using that term as a "gateway" from one "knowledge store" to
    another.
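
    The "gateway" picture could be sketched roughly as follows. This is
    only an illustration under assumptions: the two stores, the topic ids
    and the example.org identifiers are all made up, and each store is
    reduced to a plain mapping from a local topic id to its set of
    subject identifiers.

```python
# Hypothetical data: each store maps a local topic id to the set of
# subject identifiers (PSIs/URIs) that document what the topic means.
store_a = {
    "a:retrieval": {"http://example.org/psi/information-retrieval"},
    "a:thesaurus": {"http://example.org/psi/thesaurus"},
}
store_b = {
    "b:ir": {"http://example.org/psi/information-retrieval"},
    "b:catalogue": {"http://example.org/psi/library-catalogue"},
}


def find_gateways(left, right):
    """Return (left_id, right_id, shared_identifiers) for every pair of
    topics that share at least one subject identifier."""
    gateways = []
    for left_id, left_psis in sorted(left.items()):
        for right_id, right_psis in sorted(right.items()):
            shared = left_psis & right_psis
            if shared:
                gateways.append((left_id, right_id, sorted(shared)))
    return gateways


# Topics that could act as a "gateway" between the two stores.
gateways = find_gateways(store_a, store_b)
```

    Here only "a:retrieval" and "b:ir" point to the same identifier, so
    they form the single gateway pair between the two stores.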

</background>

OK, and now (finally) my questions :-)

(1) Do you know of any running projects out there working
    on the question of IF and HOW semantic web technologies
    could be used to improve access to bibliographic databases?

(2) A concrete case, to get an idea of "best practices" in
    implementing topic maps for our purposes:

Assume we have 2 researchers P1, P2 working for institutions
I1, I2 and having published papers Pub1, Pub2. As citation
is an important relation within the scientific community,
assume we have defined an association 'cites' between Pub1
and Pub2. Assume further that we are even able to detect
the context in which the citation is made (say, by analyzing
the sentence where the citation occurs, e.g.
"... As X has shown... [Pub2]."). I tried to visualize
the situation with ASCII art and hope it renders correctly.


<Institution>      <Researcher>       <Publication>---|
  |    |              |    |             |            |
 isa  isa            isa  isa           isa          isa
  |    |              |    |             |            |
<I1> <I2>             |    |             |            |
  |    |__works_for__<P2>___author_of__<Pub2>         |
  |                        |             |__cites___  |
  |                        |                   ^    | |
  |______works_for________<P1>____author_______|____<Pub1>
                                               |
                                               |
                                       (context of citation) ???




Now the question:
HOW should a 'context' topic be tied to the association 'cites'?
(Assume the context topic is a term taken from a controlled vocabulary,
e.g. a thesaurus.)

My first idea was to use <scope>, but the XTM examples I found suggest
that the number of different scopes within a topic map tends to be rather
small (e.g. scoping by 2 or 3 different languages, or scoping by
2 or 3 learning levels such as beginner and professional).
In our case 'context' could be 1 of over 1000 thesaurus terms.

Does that make sense, having 1000 scopes or more?
How would you do it?
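
To make the two alternatives I can think of concrete, here is a rough
sketch in Python rather than XTM. Everything in it is an assumption for
illustration only: the Association class, the role names "citing",
"cited" and "context", and the term "thesaurus:clustering" are made up
and not taken from any standard or tool. Option A attaches the thesaurus
term as scope of the 'cites' association; option B lets the term play a
third role in it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Association:
    """Hypothetical, simplified stand-in for a topic map association."""
    assoc_type: str
    roles: tuple  # tuple of (role_type, player) pairs
    scope: frozenset = field(default_factory=frozenset)


# Option A: the thesaurus term scopes the association.
cites_scoped = Association(
    assoc_type="cites",
    roles=(("citing", "Pub1"), ("cited", "Pub2")),
    scope=frozenset({"thesaurus:clustering"}),
)

# Option B: the thesaurus term plays a third role in the association.
cites_with_role = Association(
    assoc_type="cites",
    roles=(
        ("citing", "Pub1"),
        ("cited", "Pub2"),
        ("context", "thesaurus:clustering"),
    ),
)


def citation_contexts(assoc):
    """Collect context terms from the scope and from 'context' roles."""
    from_roles = {player for role, player in assoc.roles if role == "context"}
    return set(assoc.scope) | from_roles


def citations_in_context(assocs, term):
    """Return all 'cites' associations made in the given context."""
    return [
        a for a in assocs
        if a.assoc_type == "cites" and term in citation_contexts(a)
    ]


matches = citations_in_context(
    [cites_scoped, cites_with_role], "thesaurus:clustering"
)
```

In this toy model both variants can be queried by context in the same
way, and the number of distinct context terms is just a matter of data.
Whether 1000 or more scoping topics is good practice in a real topic map
is exactly what I am unsure about.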


Thanks for reading

Regards

Guenther Neher

P.S.
If you reply outside this mailing list, please use the following address:

      neher@fh-potsdam.de