[topicmapmail] Dictionary
Lars Marius Garshol
larsga@garshol.priv.no
05 Mar 2002 11:16:26 +0100
Hi Bob,
* Lars Marius Garshol
|
| Are you asking for advice on how to turn your dictionary into a
| topic map? How to represent it as a topic map? How to implement
| searching in it once it has become a topic map? Or how to integrate
| the dictionary with other data that is in topic map form?
* Bob Parks
|
| The answer is "yes" ... ;-) We would like to do all those
| things. The most important are (1) representing the dictionary as a
| topic map; and
This need not be very difficult, actually. Judging from your web site,
you could turn each headword into a topic, and each different meaning
of that headword into a topic. You could then use associations to
connect the meaning with the headword. The "part of speech"
information could be captured by making the headwords instances of
"noun", "adjective", etc., all of which would be subtypes of "word".
Phrases would also merit topics of their own.
Definitions, pronunciations, and examples would become internal
occurrences of those types. (That is: "definition", "pronunciation",
and "example".)
The links between different headwords you could capture with
associations of different types.
Probably there is a bit more to your information, but it should be
easy to represent it as a topic map.
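To make the structure above a bit more concrete, here is a minimal
sketch in Python. It is not the API of any real topic map engine: the
class names, the "has-meaning" association type, the role names, and
the sample entry for "head" are all made up for illustration, and
attaching the pronunciation to the headword rather than to the meaning
is just one possible choice.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    topic_id: str
    name: str
    types: list = field(default_factory=list)        # ids of the types this topic is an instance of
    occurrences: list = field(default_factory=list)  # (occurrence type, text) pairs


@dataclass
class Association:
    assoc_type: str
    roles: dict  # role name -> topic id


class TopicMap:
    """Hypothetical in-memory topic map, just enough for a dictionary entry."""

    def __init__(self):
        self.topics = {}
        self.associations = []
        self.supertypes = {}  # subtype id -> supertype id

    def add_topic(self, topic_id, name, types=()):
        topic = Topic(topic_id, name, list(types))
        self.topics[topic_id] = topic
        return topic

    def associate(self, assoc_type, **roles):
        self.associations.append(Association(assoc_type, roles))

    def is_instance_of(self, topic_id, type_id):
        # Walk up the supertype chain from each direct type of the topic.
        for t in self.topics[topic_id].types:
            while t is not None:
                if t == type_id:
                    return True
                t = self.supertypes.get(t)
        return False


def build_example():
    tm = TopicMap()
    tm.add_topic("word", "word")
    tm.add_topic("noun", "noun")
    tm.supertypes["noun"] = "word"  # "noun" is a subtype of "word"
    tm.add_topic("meaning", "meaning")

    # The headword is an instance of its part of speech.
    head = tm.add_topic("head", "head", types=["noun"])
    head.occurrences.append(("pronunciation", "hed"))  # invented sample data

    # Each meaning of the headword gets a topic of its own.
    sense = tm.add_topic("head-1", "head (body part)", types=["meaning"])
    sense.occurrences.append(("definition", "The upper part of the body."))
    sense.occurrences.append(("example", "She nodded her head."))

    # An association connects the meaning with the headword.
    tm.associate("has-meaning", headword="head", meaning="head-1")
    return tm


def meanings_of(tm, headword_id):
    return [a.roles["meaning"] for a in tm.associations
            if a.assoc_type == "has-meaning"
            and a.roles.get("headword") == headword_id]
```

With this, meanings_of(build_example(), "head") gives back the ids of
the meaning topics for that headword, and is_instance_of("head", "word")
holds because "noun" is a subtype of "word".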
At one point I converted WordNet into a topic map. This was a very
direct translation of the WordNet structure using our autogeneration
toolkit, but if you're interested I could dig it up and make the topic
map available to you.
Your web site could actually quite easily be reimplemented on top of a
topic map using a topic map web publishing solution and a full-text
search integration. It wouldn't give you much that you don't already
seem to have, though, unless you want to expand your service with more
content.
| The basic perspective behind my interest is to use the dictionary
| definitions/concepts as a "starter kit" for indexers or creators of
| topic maps, where the concepts/topics they index may or may not be
| represented in the dictionary.
Then I understand where you are headed. Yes, this makes a lot of
sense. Essentially, what you want to do is to create a set of
so-called published subjects. These are well-defined topics with
stable identifiers, which can be reused by other topic map creators to
a) save ontology design time, and b) ensure that their topic maps are
compatible with those created by others.
The stable identifiers ensure that if two different topic map creators
both create a topic for the subject "head" (the body part), they can
both refer to your identifier for that subject (your published
subject, essentially), and their topic maps will merge correctly.
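The merging behaviour can be illustrated with a small Python sketch.
This is a simplification under stated assumptions, not a real merging
implementation: topics are plain dicts, each carries exactly one
subject identifier, and the identifier URLs under
dictionary.example.org are invented placeholders, not identifiers you
have actually published.

```python
# Hypothetical published subject identifiers (placeholders only).
HEAD_PSI = "http://dictionary.example.org/psi/head-body-part"
CHIEF_PSI = "http://dictionary.example.org/psi/head-leader"

# Two independently created topic maps, each a list of topics.
map_a = [
    {"subject_identifier": HEAD_PSI, "names": ["head"],
     "occurrences": ["http://a.example.org/anatomy/head.html"]},
]
map_b = [
    {"subject_identifier": HEAD_PSI, "names": ["hode"],
     "occurrences": ["http://b.example.org/kropp/hode.html"]},
    {"subject_identifier": CHIEF_PSI, "names": ["head"],
     "occurrences": []},
]


def merge(*topic_maps):
    """Merge topics that share a subject identifier into one topic."""
    merged = {}
    for topic_map in topic_maps:
        for topic in topic_map:
            psi = topic["subject_identifier"]
            target = merged.setdefault(
                psi, {"subject_identifier": psi, "names": set(), "occurrences": []}
            )
            # Same identifier means same subject, so names and
            # occurrences from both sources end up on one topic.
            target["names"].update(topic["names"])
            target["occurrences"].extend(topic["occurrences"])
    return merged


merged = merge(map_a, map_b)
```

Here the two topics for the body part collapse into one topic carrying
both names, while the topic for "head" in the sense of a leader stays
separate even though it has the same name, because its identifier is
different.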
You may be interested to learn that there is an OASIS technical
committee that is working on creating general guidelines for how to
publish sets of published subjects. The work of this TC might make
your own work a lot easier.
You can find a description of the TC, as well as some of their
work-in-progress on the page below. Note that you can also quite
easily join the TC, should you want to.
<URL: http://www.oasis-open.org/committees/tm-pubsubj/index.shtml >
| Hope this helps.... though I may only have communicated my lack of
| understanding.
Definitely not. You seem very well clued-in to me.
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC <URL: http://www.garshol.priv.no >