[topicmapmail] The potential of TM fragments
Lars Marius Garshol
larsga@garshol.priv.no
19 Oct 2003 17:34:18 +0200
* Carlo Moneti
|
| I am new to topic maps and new to this list. I hope this question is
| appropriate:
Absolutely.
| RDF is stored in each document.
Not necessarily. It can be, but I'd say that is quite unusual in
practice, and even then you have to pull the information out of the
documents before you can use it.
| TM is stored in one place and describes many documents (or is just a
| template of generic topics and their interrelations with yet
| unconnected content). RDF can be aggregated in one place as well.
Yep.
| It seems self-evident that standalone documents (which they would be
| at time of creation) would benefit from included metadata so that
| their metadata isn't lost when they get moved around.
Yes and no. There are usually many considerations here. Does the
format support structured metadata? Does it support the right
metadata, the kinds you actually need? Do you have the tools to
support editing, validation, and maintenance of the metadata inside
the documents?
What you *really* want is a proper metadata management process, and if
you have that it doesn't necessarily matter all that much whether the
metadata is in the documents or outside. In fact, it's usually easier
to keep it outside, since that means you can treat all formats the
same.
| This led me to wonder, what type of TM data can intelligently be
| embedded in a document?
Depends on the document format. Note that you don't necessarily have
to use the XTM syntax. You can use whatever metadata mechanism the
format offers, together with conventions for mapping that metadata to
topic maps.
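To make the idea of a mapping convention concrete, here is a minimal
sketch in Python. The field names, the FIELD_RULES table, and the dict
layout are all invented for illustration; they are not part of XTM or
of any topic map tool. It only shows the principle: flat key/value
metadata pulled from a document is turned into a topic for the
document plus some typed associations.

```python
# Hypothetical convention: which metadata field becomes which
# topic map construct, and with what type.
FIELD_RULES = {
    "title": ("name", None),
    "creator": ("association", "written-by"),
    "subject": ("association", "is-about"),
    "date": ("occurrence", "publication-date"),
}


def metadata_to_fragment(doc_uri, metadata):
    """Map flat document metadata to (topic, associations)."""
    # The document itself becomes a topic, identified by its address.
    topic = {"subject_locator": doc_uri, "names": [], "occurrences": []}
    associations = []
    for field, value in metadata.items():
        rule = FIELD_RULES.get(field)
        if rule is None:
            continue  # fields without a rule are simply ignored
        kind, type_name = rule
        if kind == "name":
            topic["names"].append(value)
        elif kind == "occurrence":
            topic["occurrences"].append((type_name, value))
        else:
            # (association type, document, other player)
            associations.append((type_name, doc_uri, value))
    return topic, associations
```

The point of keeping the rules in a table is that the documents stay
in their native format, and only the harvester needs to know about
topic maps.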
| I imagine a topic defining itself, consisting of strictly internal
| references plus references to PSIs, would make sense as a standalone
| TM fragment. However, after harvesting this metadata from, say, 1000
| such documents, you'll still have a lot of work ahead to define the
| useful associations among the documents; you can't define
| associations until you have aggregated some topics and have
| formulated the relevant association-types.
Well, you can, if a document has a way to identify a topic so that the
aggregator knows when two different documents are referring to the
same thing. There are lots of ways to do this.
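As an illustration of what the aggregator does, here is a minimal
sketch in Python, assuming each harvested fragment is a dict with a
list of identifiers, a list of names, and the document it came from.
That layout and the function name merge_fragments are assumptions made
for the example, not any particular engine's API. Fragments that share
at least one identifier are folded into a single topic.

```python
def merge_fragments(fragments):
    """Merge topic fragments that share at least one identifier."""
    merged = []          # the merged topics, in order of first appearance
    by_identifier = {}   # identifier -> merged topic it belongs to
    for frag in fragments:
        # Collect every existing topic this fragment overlaps with.
        hits = []
        for ident in frag["identifiers"]:
            topic = by_identifier.get(ident)
            if topic is not None and not any(topic is h for h in hits):
                hits.append(topic)
        if hits:
            target = hits[0]
        else:
            target = {"identifiers": set(), "names": set(), "sources": []}
            merged.append(target)
        # The fragment may bridge topics that were separate until now.
        for other in hits[1:]:
            target["identifiers"] |= other["identifiers"]
            target["names"] |= other["names"]
            target["sources"] += other["sources"]
            merged[:] = [t for t in merged if t is not other]
        target["identifiers"] |= set(frag["identifiers"])
        target["names"] |= set(frag["names"])
        target["sources"].append(frag["source"])
        # Re-point all identifiers of the merged topic at the target.
        for ident in target["identifiers"]:
            by_identifier[ident] = target
    return merged
```

Note that no association types are needed for this step. Two
documents that mention the same identifier end up attached to the same
topic, and that alone already connects them.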
| My question is, is there a way to define TM fragments in documents
| so that when harvesting them, a rich TM can be automatically
| generated?
Yes, there is.
| In trying to answer my own question, this is what I came up with:
|
| 1. if there exists a rich set of PSIs,
| 2. if the fragments use PSIs everywhere,
| 3. if the documents are of the same knowledge domain,
| 4. if you already have a TM template that defines all of the topic-types,
| association-types, and the associations among those topic-types for that
| domain,
| then, it seems you would have all the bits of information necessary
| to process the harvested fragments and the TM template into a rich
| map. Is this roughly correct?
Yes, it is, but you can also do it without most of this. Point 4 you
absolutely need, but beyond that what you *actually* need is a way to
identify topics. PSIs are one way, but simple codes and names can also
do the trick, so long as you can convert those to proper identities.
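A minimal sketch of such a conversion, in Python. The base address
http://psi.example.org/ and the function name to_identity are
placeholders invented for this example. The idea is just to normalize
a code or name and turn it into a stable URI that the aggregator can
compare.

```python
import re
from urllib.parse import quote

# Hypothetical base address for the generated identifiers.
BASE = "http://psi.example.org/"


def to_identity(kind, value):
    """Turn a simple code or name into a URI usable as an identity."""
    # Normalize so trivial spelling differences don't block merging:
    # trim, lowercase, and collapse runs of whitespace to a hyphen.
    slug = re.sub(r"\s+", "-", value.strip().lower())
    return f"{BASE}{kind}/{quote(slug)}"
```

For instance, to_identity("country", "NO") and
to_identity("country", "no") give the same URI, so the two would
merge. Names are of course riskier than codes, since different things
can share a name, so this only works within a domain where the names
are known to be unambiguous.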
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >