[topicmapmail] Classification of occurrences using keywords

Thomas B. Passin tpassin@comcast.net
Mon, 11 Nov 2002 20:20:20 -0500


[Murray Altheim]
[About using keywords to classify topics]
> I've been thinking about this question for the past week or so, and
> am thinking there may be another way that would take advantage of the
> topic naming features of topic maps. This depends to some extent on
> how the keywords are structured as generally a topic is described by
> a *set* of keywords such that any one of its keywords incompletely
> describes the topic, i.e., keywords are a decomposition of a subject.
>

This area can be pretty sticky.  First, once there are too many keywords,
it becomes really hard for a person to work with them, and they need to be
organized in some way.  Second, one's notion of the right keyword to use
changes over time.  Third, surely it ought to be possible to apply some
classification to the keywords themselves.

My canonical example for this is a set of browser bookmarks.
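
As a rough illustration of the third point, here is a minimal Python sketch
of classifying the keywords themselves.  Everything in it is hypothetical:
the URLs, the keywords, and the "broader" mapping are made up for the
example, and a real topic map would express this with associations rather
than dictionaries.

```python
# Hypothetical data: bookmark URL -> flat set of keywords.
bookmarks = {
    "http://example.org/siamese": {"siamese"},
    "http://example.org/cat-care": {"cats", "pets"},
    "http://example.org/topicmaps": {"topic-maps"},
}

# Hypothetical classification of the keywords: keyword -> broader keyword.
broader = {
    "siamese": "cats",
    "cats": "pets",
}


def expand(keyword):
    """Return the keyword followed by every broader keyword above it."""
    result = [keyword]
    while keyword in broader:
        keyword = broader[keyword]
        result.append(keyword)
    return result


def find(keyword):
    """Return the sorted URLs whose keywords fall under the given keyword."""
    return sorted(
        url
        for url, keywords in bookmarks.items()
        if any(keyword in expand(k) for k in keywords)
    )


print(find("cats"))
```

With flat keywords, a search for "cats" would miss the bookmark tagged only
"siamese"; once the keywords are classified, it is picked up as well.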

Perhaps it comes down to your notion of what a keyword really represents.  A
purist might say that a keyword represents a class-subclass relationship, so
why not just use them and be done with it?  A pragmatist might say that it
is easier to work with the occurrences when listing topics than with
associations, and anyway, classification is hard and keywords are easier.

To confound the matter, you might look at keywords as representing facets,
and thus more-or-less orthogonal to class-subclass matters.

I have tried keywords with browser bookmarks, and as my collection grew they
became unmanageable.  This was in my pre-topic-maps era.  I have tried using
class-subclass associations with topic maps and was much happier, but I am
still not convinced I have them worked out to best advantage.

The key question seems to be this: should bookmark folders and bookmarks
actually be types of the categories they deal with, or should they just be
folders and bookmarks that happen to be associated with those categories?

For example, should a folder that holds bookmarks about cats be a subclass
of "Cat" (or better, "Cat-knowledge", since obviously it is not a cat), or
should it be a subclass of "Folder", a subclass called perhaps "Cat-folder"?
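
The two choices can be put side by side in a small Python sketch.  This is
only an illustration under assumptions: the (subclass, superclass) pairs and
the separate "about" link are a simplified stand-in for topic map
associations, and the names other than "Folder", "Cat", "Cat-knowledge" and
"Cat-folder" are invented.

```python
def superclasses(links, topic):
    """Return every superclass reachable from topic via (sub, super) links."""
    found = set()
    frontier = [topic]
    while frontier:
        current = frontier.pop()
        for sub, sup in links:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


# Choice 1: the folder is itself a kind of cat knowledge.
subclass_links_1 = {("Cat-folder", "Cat-knowledge")}

# Choice 2: the folder is a kind of folder, and a separate (hypothetical)
# "about" link ties it to the category it deals with.
subclass_links_2 = {("Cat-folder", "Folder")}
about_links_2 = {("Cat-folder", "Cat")}

print(superclasses(subclass_links_1, "Cat-folder"))
print(superclasses(subclass_links_2, "Cat-folder"))
```

Under the first choice a query for "all folders" cannot find the cat folder
through the class hierarchy; under the second it can, but a query for
"everything about cats" has to follow the extra link.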

As you can see, I am still wrestling with this, but one thing I am clear
about is that flat, unmanaged keywords lead to trouble if your topic map
gets to be any size.  For small-enough maps, they can work well.  Of course,
if you use occurrences to hold the keywords, you can classify them to your
heart's content.  While I am mulling this over, my current system has only
folders and bookmarks, with the bookmarks playing a "content" role when
associated with folders.  There are no keywords or further classification so
far.  I can generate the topic map this way from an XBEL bookmarks file
using an XSLT stylesheet.
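
For readers who want to see the shape of that folder/bookmark structure,
here is a minimal sketch in Python rather than XSLT.  It is not the
stylesheet mentioned above: the sample XBEL data is invented, and the output
is just a list of (folder, content) pairs, not topic map markup.  Nested
folders are walked, but only bookmarks are recorded as content, which is an
assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Invented sample in XBEL's folder/bookmark/title structure.
XBEL_SAMPLE = """<xbel version="1.0">
  <folder>
    <title>Cats</title>
    <bookmark href="http://example.org/cat-care">
      <title>Cat care</title>
    </bookmark>
    <folder>
      <title>Siamese</title>
      <bookmark href="http://example.org/siamese">
        <title>Siamese cats</title>
      </bookmark>
    </folder>
  </folder>
</xbel>"""


def walk(element, folder, pairs):
    """Append a (folder title, bookmark URL) pair for each bookmark found."""
    for child in element:
        if child.tag == "folder":
            title = child.findtext("title", default="(untitled)")
            walk(child, title, pairs)
        elif child.tag == "bookmark" and folder is not None:
            # The bookmark plays the "content" role for its enclosing folder.
            pairs.append((folder, child.get("href")))
    return pairs


pairs = walk(ET.fromstring(XBEL_SAMPLE), None, [])
for folder, url in pairs:
    print(folder, "has content", url)
```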

>
> As I said, these are just some ponderings, but it seems there might
> be some way of using topic maps' native subject handling features to
> their best extent, rather than manually adding occurrences. The whole
> idea of the Semantic Web is predicated on some machine intelligence,
> so it seems a shame to rely so heavily on existing ways of using
> keywords. There's got to be a better way!
>

Yes!

Cheers,

Tom P