[topicmapmail] Topic Map Design Patterns
Murray Altheim
murray06 at altheim.com
Mon Apr 10 19:01:35 EDT 2006
Quoting "Penichet, Juan" <juan.penichet at ttu.edu>:
>
> Hi everyone,
>
> I recently created an ontology using one of Kal Ahmed's Topic Map
> Design Patterns; the Hierarchical Classification Pattern. But I was
> wondering if anyone knows if there is anymore research in Topic Maps
> Design Patterns other than the proposed by Kal in "Topic Map Design
> Patterns For Information Architecture."
>
> Also, I'm trying to find out about ontology evaluation metrics. A way
> to compare different ontologies based in different design patterns. I
> know that the difference in between two topic maps developed by two
> different people can be compared to the analogy of source code
> programmed by two different programmers. But then again, how can
> somebody be sure if the Topic Map Design Pattern implemented by one
> ontology is better or worse than using a different pattern.
Juan,
The idea of Design Patterns is an interesting one, but patterns are
by their nature a limiting filter in ontological analysis. If a given
ontology only uses one relation type, or one is using a filter to
only view a single type from a wider ontology, the overall meaning of
the ontology is distorted -- you're no longer seeing what's being
expressed accurately, and indeed the picture one gets might even be
at odds with what is actually being expressed. It's a bit like taking
words randomly out of a sentence, or more correctly, sentences out of
a paragraph.
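
A minimal sketch of that distortion, assuming a toy ontology written as
(subject, relation, object) triples. The triples, relation names and
function names are hypothetical illustrations, not taken from any Topic
Maps tool or from Kal's patterns:

```python
# Hypothetical toy ontology as (subject, relation, object) triples.
ontology = [
    ("penguin", "is-a", "bird"),
    ("bird", "capable-of", "flight"),
    ("penguin", "not-capable-of", "flight"),
]

def view(triples, relation):
    """Keep only statements of one relation type, as a single-pattern filter would."""
    return [t for t in triples if t[1] == relation]

def inherited_abilities(triples, topic):
    """Naively infer abilities from a topic's classes, using only is-a and capable-of."""
    classes = {o for s, r, o in triples if s == topic and r == "is-a"}
    return {o for s, r, o in triples if s in classes and r == "capable-of"}

# The filtered view drops the one statement that contradicts the inference.
filtered = view(ontology, "is-a") + view(ontology, "capable-of")
print(inherited_abilities(filtered, "penguin"))  # {'flight'}
```

The filtered view is internally consistent, yet it says the opposite of
what the full set of statements expresses.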
By the same token, evaluation metrics for ontologies are also a filter
on the expression, and I'm not sure what meaning you'd be trying to
derive from such metrics. If you think about how one might evaluate
a chapter of a book there'd be many levels of analysis, such as
narrative structure, grammar, syntax, lexicon, word and character-
level, but the most important to most people would likely be meaning,
which is something that no computer has yet been effective at
analysing. Yes, computational linguistics has provided means of
identifying word frequencies and positional relations, and from that,
inferring subjects, but these are (after decades of research) still
relatively primitive techniques and don't take into account things
like idiom, metaphor, or the reasoning behind a text (and most were
designed to operate in English).
For instance, if I mention the word "terrorist" this email message
will be picked up by theoretically-intelligent software harvesting
tools run by the US government, despite the fact that this message
has nothing to do with terrorists nor with bombs nor Al Qaeda (there,
I've reinforced the mistaken inference). Other mistakes include:
* if I'd misspelled it as al-qaida, al qa;ida, Al Quaidaa;
* if I'd written it in a language that the tool wasn't programmed
  to capture or wasn't filtering on (e.g., Kazakh, Lithuanian or
  Navaho), the filter wouldn't be able to infer any subject from
  that content;
* or perhaps I was using "al qaeda" in a sentence but using it
  with its original meaning ("the base");
* or using some other term as a code word for Al Qaeda;
* or if I had used the Kazakh word for "terrorist", how would the
  software know that the word I'd used was Kazakh and not a
  homograph of a word from some other language? People don't
  usually label the language of included foreign phrases.
By now, any filtering tool might infer that I am a Kazakhstan-based
Al Qaeda terrorist, or that I'm writing about one, or at least that
this message merited further security evaluation -- which it doesn't.
In any case, none of this has anything to do with this email message,
which is more to the point.
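
To make the failure modes concrete, here is a rough sketch of the kind
of literal keyword matching described above. It assumes nothing about
how any real harvesting tool works; the watch list, the function name
and the sample sentences are all hypothetical:

```python
import re

# Hypothetical watch list; the terms are taken from this message.
WATCH_TERMS = {"terrorist", "al qaeda"}

def naive_flag(text: str) -> bool:
    """Return True if any watch term occurs as a literal substring.

    Matching is case-insensitive with whitespace collapsed; there is no
    notion of spelling variants, language, or what the text is about.
    """
    normalized = re.sub(r"\s+", " ", text.lower())
    return any(term in normalized for term in WATCH_TERMS)

# False positive: a sentence about filtering that merely mentions the word.
about_filters = 'If I mention the word "terrorist" this message gets picked up.'
# False negatives: the misspellings from the list above.
misspelled = "al-qaida, al qa;ida, Al Quaidaa"
# False negative: a code word the list knows nothing about (made-up example).
code_word = "The gardeners will meet at noon."

print(naive_flag(about_filters))  # True
print(naive_flag(misspelled))     # False
print(naive_flag(code_word))      # False
```

The filter is wrong in both directions, and in neither case has it
looked at what the text means.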
Humans very commonly use simile, metaphor, and examples from
unrelated domains to make a point, a statement. Any analysis would
have to be able to decipher the complexity of the way humans express
themselves. Now I know that a lot of people in the KR field think
their ontologies escape these problems, that they are perhaps simpler
or only express fixed, Platonic meanings from fixed, Platonic domains.
I couldn't disagree more. The simplest of statements are not clear,
only more ambiguous. What might seem a simple genealogy or taxonomy
is really a masking of a greater reality.
In this sense, the commonly-accepted zoological taxonomy has for
at least a century been hiding the reality that we're now seeing
exposed via DNA analysis (e.g., two birds that we thought were
related are not; two fish that we thought were very different are
by DNA closely related), but even DNA is only the expression of yet
one more Design Pattern in nature -- it doesn't show the whole
picture either -- it's just a different picture. And yet even after
these face-slapping lessons you'd be hard-pressed to find mention
of holism in either zoological taxonomy or ontological engineering.
So when you suggest ontological metrics, what are you really trying
to accomplish? If you approach your project as you would the
analysis of a book chapter, which is composed of paragraphs, which
are composed of sentences, then the structures of an ontology are
very similar: a sequence of related
statements, with structures in both the content and the relations.
How you would design tools to analyse a book's chapter would
therefore be very similar to how you'd go about designing tools
for the ontology -- you'd have to select the types of relationships
you wanted to analyse -- but in the end what would matter is the
choice of the level at which the analysis would occur: ideally, at the
conceptual level, which is where we begin to be able to infer
meaning via interpretation of the expressed concepts, not the
expressed words.
And beyond the chapter analysis there's really the most important
part, which is how that chapter fits into its parent book. In the
same way, how an ontology fits into a wider world view is a
fundamentally important (and almost always neglected) analysis.
Comparing ontologies would therefore be like comparing chapters
from different books. You'd have to programmatically conduct
whatever analysis is necessary to determine the various metrics
you were trying to report (average word length, word relations
between chapters, etc.), but in the end you'd be comparing
chapters from two different books, written perhaps for two
entirely different purposes (even if purportedly on the same subject).
Is this to be taken into account in the analysis? How? Is this
*really* what you're trying to accomplish?
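
As a sketch of how little such metrics can say, consider two toy
ontologies over the same animals, one classifying by form and one by
how the animal gets around. Everything here (the triples, the metric
names, the function) is a hypothetical illustration rather than an
established ontology metric:

```python
from collections import Counter

# Two hypothetical toy ontologies as (subject, relation, object) triples.
zoo_by_form = [
    ("sparrow", "is-a", "bird"),
    ("penguin", "is-a", "bird"),
    ("bat", "is-a", "mammal"),
]
zoo_by_movement = [
    ("sparrow", "is-a", "flyer"),
    ("bat", "is-a", "flyer"),
    ("penguin", "is-a", "swimmer"),
]

def surface_metrics(triples):
    """Count structural features only; nothing here touches meaning."""
    topics = {t for s, _, o in triples for t in (s, o)}
    relation_counts = Counter(r for _, r, _ in triples)
    return {
        "topics": len(topics),
        "statements": len(triples),
        "relation_types": dict(relation_counts),
    }

print(surface_metrics(zoo_by_form))
print(surface_metrics(zoo_by_form) == surface_metrics(zoo_by_movement))  # True
```

Both score identically on every count, yet they group the same three
animals in incompatible ways, because each was written for a different
purpose. The numbers cannot tell you which one is "better".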
In the end, ontologies are simply another form of human
expression, and bear the same issues of interpretation and
misinterpretation, which is essentially what Wittgenstein
was saying back in 1946. There is no escaping the stains of
epistemology.
If you haven't checked out John Sowa's "Knowledge Representation"
I would recommend it, as well as his online "Knowledge Soup" [1].
And prior to John's work I'd highly recommend Bill Kent's "Data
and Reality" [2], which shines a big bright light on most of the
mistakes of the "Semantic Web", but did so about 20 years prior
to its visionary inception. Too bad a few more people hadn't
cracked its cover before having their erowidian visions. There
might be fewer people spending their time conducting research
in the field, but that wouldn't be much of a loss, as they might
instead be doing productive work elsewhere...
Hope this was more helpful than confusing. (my burst from the
morning's coffee)
Murray
[1] http://www.jfsowa.com/talks/souprepr.htm
"Knowledge Soup" is chapter 6 of "Knowledge Representation":
http://www.allbusiness.com/periodicals/article/237239-1.html
[2] http://mysite.verizon.net/ambur/datareality.htm
http://www.authorhouse.com/BookStore/ItemDetail.aspx?bookid=2713
...........................................................................
Murray Altheim <murray06 at altheim.com> === = =
http://www.altheim.com/murray/ = = ===
SGML Grease Monkey, Banjo Player, Wantanabe Zen Monk = = = =
In the evening
The rice leaves in the garden
Rustle in the autumn wind
That blows through my reed hut. -- Minamoto no Tsunenobu