[topicmapmail] Classification of occurrences using keywords

Murray Altheim m.altheim@open.ac.uk
Mon, 11 Nov 2002 19:33:53 +0000


Lars Marius Garshol wrote:

* Murray Altheim
| This depends to some extent on how the keywords are structured, as
| generally a topic is described by a *set* of keywords such that any
| one of its keywords incompletely describes the topic, i.e., keywords
| are a decomposition of a subject.

What's the difference, in general, between a topic and a keyword?

Big difference. Huge difference, search-wise. I think you and I probably
agree about "topic" and "subject", whereas a set of keywords is used
to establish search criteria. So, searching for, say, a paper on
"Navigable History" (a subject), we might use the keywords:

    event history, navigable history, constructive time, edit-based
    indexing, information workspace, analysis, interpretation, authoring,
    spatial hypertext

I mentioned that keywords are essentially a deconstruction or
decomposition of a topic.

Just to test this theory, I grabbed those keywords from a specific paper
by Frank M. Shipman and Haowei Hsieh. I can take those keywords and paste
them into Google and find the paper. [goes off and tries it] Damn, but
it works!

Now, it'd be hard to argue that "analysis" (or really, any of the
above keywords) matches the subject "Navigable History: A Reader's
View of Writing".

In my opinion, topic maps were created in order to give us something
much more powerful than the keyword model.

Absolutely. But no reason to throw out the existing baby when she
can be used to help the new baby. Using existing keyword models with
topic maps can be very powerful. I see James has posted some of his
stuff on Ferret, which I think is pretty cool.

| If one were to put each keyword in as a separate base name, or
| perhaps as a base name variant, if things were properly scoped it
| might be possible to auto-merge and generate associations between
| topics. Then occurrences which were identified as matching some
| preordained limit of keyword matches could be attached to their
| respective topics. In a sense, you'd build a "temporary match" topic
| map based on how well an occurrence matched a specific subject,
| eliminate less-than-acceptable associations, then use those that
| survive to populate the permanent map.

Hmmm. I'm not sure I follow you here. What would the purpose of this
be? What I mean to say is: in what sorts of situations would you take
this approach, and why?

I was trying to come up with a methodology for using existing keyword
schemes with topic maps. Of the several hundred research papers I've
got in my Ph.D. library, most all have lists of keywords. The reason
I chimed in is because I've been trying to figure the best way to take
advantage of those keywords in a topic map framework.
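
A minimal sketch of the "temporary match" idea quoted above, written in
plain Python rather than against any real topic map engine. Everything
named here (Topic, temporary_matches, attach, the min_matches threshold)
is hypothetical and made up for illustration; keywords simply stand in
for base names, and merging and scoping are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical stand-in for a topic; keywords play the role of base names."""
    subject: str
    keywords: frozenset
    occurrences: list = field(default_factory=list)


def normalize(keyword):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(keyword.lower().split())


def temporary_matches(topics, occurrence_keywords, min_matches=2):
    """Build the 'temporary match' list for one occurrence.

    Returns (topic, shared_count) pairs, best match first, keeping only
    topics that share at least min_matches keywords with the occurrence.
    The threshold is an assumption; the post only says "some preordained
    limit".
    """
    wanted = {normalize(k) for k in occurrence_keywords}
    matches = []
    for topic in topics:
        shared = wanted & {normalize(k) for k in topic.keywords}
        if len(shared) >= min_matches:
            matches.append((topic, len(shared)))
    return sorted(matches, key=lambda m: m[1], reverse=True)


def attach(topics, occurrence_ref, occurrence_keywords, min_matches=2):
    """Attach the occurrence to every topic that survives the threshold."""
    for topic, _count in temporary_matches(topics, occurrence_keywords, min_matches):
        topic.occurrences.append(occurrence_ref)
```

With a threshold of 2, an occurrence tagged only "analysis" would not be
attached to anything, while one tagged "spatial hypertext" and "analysis"
would land on any topic carrying both keywords.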

| As I said, these are just some ponderings, but it seems there might
| be some way of using topic maps' native subject handling features to
| their best extent, rather than manually adding occurrences.

Steve Pepper and I are presenting a paper at XML 2002 about how we did
what essentially amounts to automated keyword mining in a large data
set (~1000 documents) using topic maps, and *without* relying on
keyword metadata fields.

We are very pleased with the result, and feel that this shows very
clearly how topic maps can go far beyond anything that the old keyword
model could ever hope to support.

Could you point us at it? I'd be very interested in reading it. I agree
that we can hope to surpass existing methodologies, but I think the
keyword mining you and Steve went to the trouble of doing addresses
basically the same problem we've been discussing. You're both using
keywords, just in a new way.

| The whole idea of the Semantic Web is predicated on some machine
| intelligence, so it seems a shame to rely so heavily on existing
| ways of using keywords. There's got to be a better way!

Agreed.

Murray

......................................................................
Murray Altheim                  <http://kmi.open.ac.uk/people/murray/>
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK

            If you're the first person in a new territory,
            you're likely to get shot at.
                                                     -- ma