[topicmapmail] Classification of occurrences using keywords
Lars Marius Garshol
larsga@garshol.priv.no
11 Nov 2002 19:33:13 +0100
* Murray Altheim
|
| This depends to some extent on how the keywords are structured as
| generally a topic is described by a *set* of keywords such that any
| one of its keywords incompletely describes the topic, i.e., keywords
| are a decomposition of a subject.
What's the difference, in general, between a topic and a keyword?
In my opinion, topic maps were created in order to give us something
much more powerful than the keyword model.
| If one were to put each keyword in as a separate base name, or
| perhaps as a base name variant, if things were properly scoped it
| might be possible to auto-merge and generate associations between
| topics. Then occurrences which were identified as matching some
| preordained limit of keyword matches could be attached to their
| respective topics. In a sense, you'd build a "temporary match" topic
| map based on how well an occurrence matched a specific subject,
| eliminate less-than-acceptable associations, then use those that
| survive to populate the permanent map.
Hmmm. I'm not sure I follow you here. What would the purpose of this
be? What I mean to say is: in what sorts of situations would you take
this approach, and why?
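To check my reading of the mechanics, here is a minimal sketch in Python of the scheme quoted above as I understand it: score each occurrence against each topic's keyword set, hold the scores in a temporary match list, drop anything below a fixed limit, and attach the survivors to the permanent map. Everything in it is an assumption made for illustration: the topic names, the keyword sets, the sample occurrences, the limit of two matches, and the helper names. It uses plain dictionaries rather than a real topic map engine, and it leaves out the auto-merging and association steps entirely.

```python
import re
from collections import defaultdict

# Hypothetical data: topic -> keyword set (stand-ins for scoped base names).
TOPIC_KEYWORDS = {
    "topic-maps": {"topic", "association", "occurrence", "scope"},
    "xml": {"xml", "element", "attribute", "dtd"},
}

# Hypothetical occurrences: resource URI -> text of the resource.
OCCURRENCES = {
    "doc1.html": "Scope restricts where a topic name or occurrence is valid.",
    "doc2.html": "An XML element may carry an attribute declared in the DTD.",
    "doc3.html": "This topic is only mentioned in passing.",
}

MIN_MATCHES = 2  # hypothetical "preordained limit" of keyword matches


def tokenize(text):
    """Lower-case the text and return the set of distinct words in it."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def temporary_matches(occurrences, topic_keywords):
    """Build the temporary match list: (topic, uri, number of shared keywords)."""
    matches = []
    for uri, text in occurrences.items():
        words = tokenize(text)
        for topic, keywords in topic_keywords.items():
            score = len(keywords & words)
            if score > 0:
                matches.append((topic, uri, score))
    return matches


def populate(matches, min_matches=MIN_MATCHES):
    """Keep matches at or above the limit; return topic -> sorted occurrence URIs."""
    permanent = defaultdict(list)
    for topic, uri, score in matches:
        if score >= min_matches:
            permanent[topic].append(uri)
    return {topic: sorted(uris) for topic, uris in permanent.items()}


permanent_map = populate(temporary_matches(OCCURRENCES, TOPIC_KEYWORDS))
print(permanent_map)
```

With this sample data, doc3.html shares only one keyword with "topic-maps", so it shows up in the temporary match list but does not survive into the permanent map.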
| As I said, these are just some ponderings, but it seems there might
| be some way of using topic maps' native subject handling features to
| their best extent, rather than manually adding occurrences.
Steve Pepper and I are presenting a paper at XML 2002 about how we did
what essentially amounts to automated keyword mining in a large data
set (~1000 documents) using topic maps, and *without* relying on
keyword metadata fields.
We are very pleased with the result, and feel that this shows very
clearly how topic maps can go far beyond anything that the old keyword
model could ever hope to support.
| The whole idea of the Semantic Web is predicated on some machine
| intelligence, so it seems a shame to rely so heavily on existing
| ways of using keywords. There's got to be a better way!
Agreed.
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC <URL: http://www.garshol.priv.no >