[topicmapmail] Classification of occurrences using keywords
Murray Altheim
m.altheim@open.ac.uk
Mon, 11 Nov 2002 17:16:49 +0000
Jan Algermissen wrote:
> The following message was sent by Johannes Koppenwallner
> <a9405411@unet.univie.ac.at> on Mon, 11 Nov 2002 17:08:11 +0100.
>
>>Hello,
>>I have a topic map with already defined topics and associations. Now I
>>want to populate this map automatically with occurrences, which should
>>be classified to the topics using keywords. (I know, a quite primitive
>>way to classify.) To make this work, I have defined keywords for every
>>existing topic. It would be nice to define the keywords/topic relation
>>in the topicmap, so nothing else than the topic map is needed to
>>classify documents. So my question is: What's the best way to model the
>>relation between a topic and its keywords?
>
> Johannes,
>
> I would use a resource data occurrence for this purpose:
>
> <topic id="t1">
>   <baseName>
>     <baseNameString>Automobiles</baseNameString>
>   </baseName>
>   <occurrence>
>     <instanceOf>
>       <topicRef xlink:href="#keywords" />
>     </instanceOf>
>     <resourceData>
>       tires; windscreen; gears; trunk; engine;
>     </resourceData>
>   </occurrence>
> </topic>
I've been thinking about this question for the past week or so, and
think there may be another way, one that takes advantage of the
topic naming features of topic maps. This depends to some extent on
how the keywords are structured: a topic is generally described by
a *set* of keywords, any one of which describes the topic only
incompletely, i.e., keywords are a decomposition of a subject.
If each keyword were entered as a separate base name, or perhaps
as a base name variant, and the names were properly scoped, it might
be possible to auto-merge and generate associations between topics.
Occurrences that reach some predetermined threshold of keyword
matches could then be attached to their respective topics. In
a sense, you'd build a "temporary match" topic map based on how well
an occurrence matched a specific subject, eliminate the
less-than-acceptable associations, then use those that survive to
populate the permanent map.
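A rough sketch of that match-then-prune step, written in Python rather
than against any topic map engine. The topic ids, keyword sets, sample
documents and the threshold of three are all made up for illustration,
and plain word matching stands in for whatever scoping and merging a
real processor would do; treat it as one possible reading of the idea,
not a worked-out design.

```python
import re

# Hypothetical keyword sets, one per topic id; "t1" reuses the quoted example.
TOPIC_KEYWORDS = {
    "t1": {"tires", "windscreen", "gears", "trunk", "engine"},
    "t2": {"frame", "pedals", "gears", "saddle", "tires"},
}

# Hypothetical documents that are candidates to become occurrences.
DOCUMENTS = {
    "doc-a": "Check the engine, the gears and the tires before a long drive.",
    "doc-b": "Adjust the saddle and pedals; the frame size matters most.",
}


def tokenize(text):
    """Lower-case the text and return the set of alphabetic words in it."""
    return set(re.findall(r"[a-z]+", text.lower()))


def candidate_matches(documents, topic_keywords):
    """Build the "temporary match" table: (doc, topic) -> matched keyword count."""
    matches = {}
    for doc_id, text in documents.items():
        words = tokenize(text)
        for topic_id, keywords in topic_keywords.items():
            hits = len(keywords & words)
            if hits:
                matches[(doc_id, topic_id)] = hits
    return matches


def surviving_occurrences(matches, min_hits):
    """Drop pairs below the threshold and group the survivors by topic."""
    kept = {}
    for (doc_id, topic_id), hits in sorted(matches.items()):
        if hits >= min_hits:
            kept.setdefault(topic_id, []).append(doc_id)
    return kept


temporary = candidate_matches(DOCUMENTS, TOPIC_KEYWORDS)
permanent = surviving_occurrences(temporary, min_hits=3)
```

Here `temporary` plays the part of the "temporary match" map and
`permanent` lists, per topic, the documents that would be kept as
occurrences. Shared keywords such as "gears" show why a single match
is not enough and some threshold is needed.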
As I said, these are just some ponderings, but it seems there might
be some way of using topic maps' native subject handling features to
their fullest extent, rather than manually adding occurrences. The whole
idea of the Semantic Web is predicated on some machine intelligence,
so it seems a shame to rely so heavily on existing ways of using
keywords. There's got to be a better way!
Murray
......................................................................
Murray Altheim <http://kmi.open.ac.uk/people/murray/>
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK
If you're the first person in a new territory,
you're likely to get shot at.
-- ma