[topicmapmail] Classification of occurrences using keywords
Lars Marius Garshol
larsga@garshol.priv.no
11 Nov 2002 21:01:16 +0100
* Lars Marius Garshol
|
| What's the difference, in general, between a topic and a keyword?
* Murray Altheim
|
| Big difference. Huge difference, search-wise. I think you and I
| probably agree about "topic" and "subject", whereas a set of
| keywords is used to establish search criteria.
Well, don't topics do the same?
| So, searching for, say, a paper on "Navigable History" (a subject) we
| might use the keywords:
|
| event history, navigable history, constructive time, edit-based
| indexing, information workspace, analysis, interpretation, authoring,
| spatial hypertext
OK, but couldn't each keyword be modelled as a topic in its own right?
Wouldn't that allow you to say far more about each topic?
| I mentioned that keywords essentially are a deconstruction or
| decomposition of a topic.
Are they? I think we have to be careful about how we use terminology
here. Keywords attached to documents, as in old-fashioned metadata,
are one thing; keywords in topic maps are another. I think keywords
have no place in topic maps and would be curious to know what you
think they are useful for and how you would model them.
| Just to test this theory, I grabbed those keywords from a specific
| paper by Frank M. Shipman and Haowei Hsieh. I can take those
| keywords and paste them into Google and find the paper. [goes off
| and tries it] Damn, but it works!
Sure, but it would work in topic maps, too.
| Now, it'd be hard to argue that "analysis" (or really, any of the
| above keywords) matches the subject "Navigable History: A Reader's
| View of Writing".
Of course, but each of those keywords you listed is a concept that can
be represented by a topic, and by creating meaningful associations you
can relate those topics to that paper in a way that makes the
information far easier to find, and also gives you ways to use the
information other than a simple keyword search.
* Murray Altheim
|
| Using existing keyword models with topic maps can be very powerful.
OK, but the question is how you would model that inside the topic map.
| I see James has posted some of his stuff on Ferret, which I think is
| pretty cool.
It is, but he actually does model keywords as topics...
| I was trying to come up with a methodology for using existing
| keyword schemes with topic maps. Of the several hundred research
| papers I've got in my Ph.D. library, most all have lists of
| keywords. The reason I chimed in is because I've been trying to
| figure the best way to take advantage of those keywords in a topic
| map framework.
In my experience the best way is to turn those keywords into topics
and then to build a meaningful ontology that represents the
relationships between the concepts for which the keywords are names.
To take a more concrete example, rather than saying

  Title: Karl Marx: A life
  Author: Francis Wheen
  Keywords: Karl Marx, marxism, Friedrich Engels, Tussy Marx,
            communism, Das Kapital, ...heaps of things mentioned
            in the book...

it is much better to say:

  [karl-marx : person = "Karl Marx"]
  {karl-marx, biography, "urn:isbn:039304923X"}

  [marxism : ideology = "Marxism"]
  created-by(marxism : creation, karl-marx : creator)

  [communism : ideology = "Communism"]
  based-on(marxism : basis, communism : derivative)

  [friedrich-engels : person = "Friedrich Engels"]
  friend-of(friedrich-engels : friend, karl-marx : friend)

  /* and so on, and so forth ... */

This provides far more information about each of the concepts
involved and also avoids connecting the book with communism when, in
fact, the book is not about communism. If you want information on
Marx's influence on communism, of course it makes sense to look for a
biography of Marx as well, but with this system you actually know how
to find it.
With keywords (and no topic maps) you just find a mess.
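
To make the "you actually know how to find it" point a bit more
concrete, here is a minimal Python sketch of the same topics,
occurrence and associations, plus a walk from "communism" to the
biography. All the names in it (Topic, Association, related,
resources, biographies_near) are hypothetical and made up for
illustration; this is not the API of any topic map engine, just one
way the lookup might work under those assumptions.

```python
from collections import namedtuple

# Hypothetical in-memory model, for illustration only.
Topic = namedtuple("Topic", "id type name")
# roles maps each participating topic id to the role it plays.
Association = namedtuple("Association", "type roles")

topics = {
    t.id: t
    for t in [
        Topic("karl-marx", "person", "Karl Marx"),
        Topic("marxism", "ideology", "Marxism"),
        Topic("communism", "ideology", "Communism"),
        Topic("friedrich-engels", "person", "Friedrich Engels"),
    ]
}

# Occurrences as (topic id, occurrence type, resource locator).
occurrences = [("karl-marx", "biography", "urn:isbn:039304923X")]

associations = [
    Association("created-by", {"marxism": "creation", "karl-marx": "creator"}),
    Association("based-on", {"marxism": "basis", "communism": "derivative"}),
    Association("friend-of", {"friedrich-engels": "friend", "karl-marx": "friend"}),
]


def related(topic_id):
    """Return (association type, other topic id) pairs for direct associations."""
    pairs = []
    for assoc in associations:
        if topic_id in assoc.roles:
            pairs.extend(
                (assoc.type, other) for other in assoc.roles if other != topic_id
            )
    return pairs


def resources(topic_id, occ_type):
    """Return locators of the given occurrence type attached to one topic."""
    return [loc for tid, typ, loc in occurrences if tid == topic_id and typ == occ_type]


def biographies_near(start, max_hops=2):
    """Breadth-first walk over associations, collecting biographies on the way.

    Each hit is (topic id, locator, list of association types followed).
    """
    seen = {start}
    frontier = [(start, [])]
    hits = []
    for _ in range(max_hops + 1):
        next_frontier = []
        for topic_id, path in frontier:
            for locator in resources(topic_id, "biography"):
                hits.append((topic_id, locator, path))
            for assoc_type, other in related(topic_id):
                if other not in seen:
                    seen.add(other)
                    next_frontier.append((other, path + [assoc_type]))
        frontier = next_frontier
    return hits


if __name__ == "__main__":
    for topic_id, locator, path in biographies_near("communism"):
        print(topics[topic_id].name, locator, "via", " -> ".join(path))
```

Note that resources("communism", "biography") comes back empty: the
book is attached only to the topic it is actually about, and the walk
records which associations led from communism to it.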
* Lars Marius Garshol
|
| Steve Pepper and I are presenting a paper at XML 2002 about how we did
| what essentially amounts to automated keyword mining in a large data
| set (~1000 documents) using topic maps, and *without* relying on
| keyword metadata fields.
* Murray Altheim
|
| Could you point us at it? I'd be very interested in reading it. I
| agree that we can hope to surpass existing methodologies, but I
| think you and Steve going to the trouble of keyword mining is
| basically the same problem we've been discussing. You're both using
| keywords, just in a new way.
Yeah, but the key point is that when you do that, keywords are no
longer modelled as being keywords inside the topic map. They are just
topics like any other.
I guess it's OK to give you the URI to it:
<URL: http://www.ontopia.net/topicmaps/materials/xmlconf.html >
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC <URL: http://www.garshol.priv.no >