[topicmapmail] Two Models of Facets

Murray Altheim m.altheim@open.ac.uk
Tue, 18 Nov 2003 21:33:54 +0000


Two Models of Facets
--------------------

I've been investigating various ideas about facets, 'facets' defined
in the general sense. There are probably at least a half dozen different
ideas about what facets are (without getting into gemstones), and they
aren't really in agreement, though they certainly revolve around the same
basic concept of small things attached to big things, "metadata" attached to
"data", etc. There's also lots of overlap with KR concepts of slots,
properties, etc., going right back to predicate logic's idea of unary
predicates being described as property relations (a form of facet
relation). As Sowa says in his KR book, "the terms 'predicate' and
'relation' are often used interchangeably. [...] Predicates may be
defined either by intension or by extension." And we know that in set
theory, definitions by intension are typically understood as
property-based definitions (e.g., a set is defined as the entities
that have a specific set of properties).
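
To make the intension/extension distinction concrete, here is a minimal
sketch in Python. The predicate is_prime and the ten-element "universe"
are my own illustrative assumptions, not anything taken from Sowa:

```python
# Extensional definition: the set is given by listing its members.
primes_by_extension = {2, 3, 5, 7}


# Intensional definition: the set is given by a property, i.e. a unary
# predicate that each member must satisfy.
def is_prime(n: int) -> bool:
    """Return True when n has the property 'is prime'."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))


# Assumed domain of discourse, chosen only to keep the example small.
universe = range(10)
primes_by_intension = {n for n in universe if is_prime(n)}

# Over this universe both definitions pick out the same set.
same_set = primes_by_extension == primes_by_intension
```

The property-based form is the one that looks like a facet: membership
follows from having the property rather than from being on a list.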

Since we've begun thinking about Topic Maps as structures for
representing knowledge and knowledge structures, as well as their original
intent of assisting in the classification of content, it seems a good
time to tackle how facets and property-value relationships might fit
into the Topic Map architecture, particularly in XTM (since that's
our currently most popular implementation). I've begun working up an
API for TM4J for Topic Map-based inferencing, and a lot of these
issues pop up pretty quickly.

I've been planning a paper on 'Topic Map Facets' or 'Facets in XTM',
and have many pages of notes that I'll try to briefly summarize here,
with a number of open questions. But the surest way to kill a
conversation before it starts (which I'm seeing right now on the SUO
list) is to ask a whole lot of questions all at once, so in this
message I'll really just ask one, which is "what is the model for facets
in Topic Maps we'd like to use?". I first thought I'd write this up as
a web page, but it should probably first start as a discussion document.

~.................................................................~

Introduction
------------

We find a definition in ISO 13250 of facets. This definition doesn't
really match any existing definition from KR or library classification
or anything else: it's its own thing. But in the subsequent discussions
I've either read or been a part of, there seems to be a fair bit of
confusion about what facets in Topic Maps are, and what they do. This
has been further muddied by discussions about RDF, which to my mind have not
been particularly helpful in elucidating either definitions of facets
or a solution for us with Topic Maps. RDF doesn't do facets. It's not
in their set of definitions. But we're both "close." So what I'm asking
in this message is an either-or-both question, i.e., which model* of
facets are we talking about for Topic Maps.

~.................................................................~

Model A: The Metadata Model
---------------------------

The first model of facets might be considered as the RDF model,
where we're essentially adding property-value pairs to a resource.

                        o Property-Name (type)
                       /
                      /
        Resource o----
                      \
                       \
                        o Property-Value

This little triad or 'tuple' is basically found in a lot of places,
and RDF was hardly an innovation in this regard, nor did it even
define itself very well (there's a lot of confusion that remains
after years of argument about what RDF "means", though the religious
will claim they know). Basically, it's just graph theory applied in
the service of metadata relations. Graph theory is what is really
interesting underneath all the facade of "metadata", and it's graph
theory that has proven the useful part of RDF, especially since most
applications seem to need graphs, not trees.
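
As a rough illustration of Model A, the sketch below stores
property-value pairs against a resource and then views the same data
as a graph. The names Statement and properties_of, and the example.org
URLs, are hypothetical; this is not the RDF API, just the triad from
the diagram above:

```python
from collections import defaultdict
from typing import NamedTuple


class Statement(NamedTuple):
    """One resource / property-name / property-value triad."""
    resource: str
    property_name: str
    property_value: str


# Hypothetical metadata about two hypothetical resources.
statements = [
    Statement("http://example.org/doc1", "format", "text/html"),
    Statement("http://example.org/doc1", "language", "en"),
    Statement("http://example.org/doc2", "language", "fr"),
]


def properties_of(resource, stmts):
    """Collect the property-value pairs attached to one resource."""
    return {s.property_name: s.property_value
            for s in stmts if s.resource == resource}


# The same statements as a graph: each resource is a node, and each
# statement is a labelled edge from it to a value.
graph = defaultdict(list)
for s in statements:
    graph[s.resource].append((s.property_name, s.property_value))
```

Nothing here binds the statements to a subject; the resource string is
simply whatever the statements happen to share.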

The definition in ISO 13250 leans towards this model, in that the
idea was to create a way to attach properties to existing topics.
As Steve Pepper and others have noted, misunderstandings about what
the real goal of facets-in-13250 was, plus inherent limitations in
the AF for Topic Maps, have rendered facets fairly useless. They
weren't included as part of the XTM syntax because it was generally
agreed that they could be implemented as a form of occurrence.

~.................................................................~

Model T: The Faceted Classification Model
-----------------------------------------

Faceted Classification (FC) is a very powerful concept coming from
library science. It's how modern classification is often done. Even
when classification is not done quite according to FC, FC has had an
enormous influence on most library classification schemes. In a
nutshell, FC defines its subjects not by creating a set of canonical
categories, but by collections of "facets". So if you were to open a
copy of Robert Brandom's "Articulating Reasons: An Introduction to
Inferentialism", you'd find its Library of Congress Cataloging-in-
Publication Data at the front of the book:

      1. Language and languages - Philosophy. 2. Semantics (Philosophy)
      3. Inference. 4. Reasoning. 5. Language and Logic. 6. Expression
         (Philosophy). I. Title.

This means that the subject of the book isn't simply one of these
topics, it's the entirety of them taken together. That last entry,
"I. Title." means that the title of the book is one of its facets.
[I'm simplifying here, for any library science wonks out there.]
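
A minimal sketch of that idea, assuming we model a subject as nothing
more than the set of its facets. The six headings are the ones quoted
above (the title entry is left out for brevity); the second subject and
the helper names are hypothetical:

```python
# The subject of Brandom's book, defined by its facets taken together.
brandom_subject = frozenset({
    "Language and languages - Philosophy",
    "Semantics (Philosophy)",
    "Inference",
    "Reasoning",
    "Language and Logic",
    "Expression (Philosophy)",
})

# A hypothetical second work, invented here only for comparison.
other_subject = frozenset({"Inference", "Reasoning"})


def same_subject(a: frozenset, b: frozenset) -> bool:
    """Two subjects are identical only if all their facets agree."""
    return a == b


def shared_facets(a: frozenset, b: frozenset) -> frozenset:
    """Facets that two subjects have in common."""
    return a & b
```

Under this assumption the two works share facets without sharing a
subject, which is the point: no single heading is "the" category.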

Anyway, this model of defining subjects as collections of facets is
all about subject identity, a topic near and dear to our hearts.
It's an extremely important idea that currently Topic Maps *could*
support, but don't directly. I've been playing with how to do this
within Ceryle, and both Kal and I have created PSI sets, done a
little experimenting, etc.

The thing about the FC model to remember is that the facets are
really about the knowledge model being described in the Topic Map,
about definitions of subject via accumulations of facets, properties,
characteristics, whatever you'd like to call them, i.e., about the
Topic Map "ontology", not really about the things (resources) the
Topic Map is describing/mapping. But the difference between these
two things isn't necessarily clear, and what constitutes the map
and what constitutes the territory is simply a matter of meta-ness,
which is recursive, or at least hierarchical (I think recursively
associative).

~.................................................................~

The Problem: Which One, or Both?
--------------------------------

If we think back to Paris and Eliot Kimber's diagram of Topic Maps,
this might bring a bit of an echo to some:

                              Association
                           ........o.........
                          /                  \
                         o Role               o Role
                        /                      \
                       /                        \
            Name o----@  ("binding point")       @ (another topic)
                       \
                        \
                         o Occurrence

Note the difference between this diagram and the RDF one, which
points out one of the essential differences and innovations about
Topic Maps: that binding point in the middle '@', the subject. RDF
is a triple/tuple, whereas the basic Topic Map graph structure is
in the form of a set of assertions surrounding a subject identity
point, which is what we're seeing finally described in the RM.
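
Here is one way the binding point might be sketched, for contrast with
a bare triple. The class names, the sample topics and the roles_played
helper are all my own assumptions for illustration; this is neither
the TM4J API nor the RM:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Topic:
    """The '@' in the diagram: one point that everything attaches to."""
    subject_id: str
    names: List[str] = field(default_factory=list)
    occurrences: List[str] = field(default_factory=list)


@dataclass
class Association:
    """An assertion relating topics, each playing a named role."""
    type: str
    roles: Dict[str, Topic]  # role name -> topic playing that role


def roles_played(topic: Topic, associations: List[Association]):
    """List (association type, role) pairs in which the topic takes part."""
    return [(a.type, role)
            for a in associations
            for role, player in a.roles.items()
            if player is topic]


# Hypothetical sample data.
puccini = Topic("http://example.org/psi/puccini",
                names=["Puccini"],
                occurrences=["http://example.org/puccini.html"])
tosca = Topic("http://example.org/psi/tosca", names=["Tosca"])
composed = Association("composed-by", {"composer": puccini, "work": tosca})
```

Names, occurrences and roles all hang off the one Topic object, so
asking "what do we know about this subject?" starts from a single
point rather than from a scan over unrelated triples.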

And as we know, occurrences turn out to be simply a specialized
association type. When we start to think about facets in the
RDF way of thinking, they look like:

                                  o Facet Property-Name
                        "        /
                       /     ----                 [NOTE: no '@']
                   "--@     /    \
                       \   /      o Facet Property-Value
                        \ /
                         o Occurrence/Resource

or, where the facet is a Topic in its own right:

                                  o Facet Property-Name
                        "        /
                       /     ---@ Facet
                   "--@     /    \
                       \   /      o Facet Property-Value
                        \ /
                         o Occurrence/Resource

The first of these facet approaches is very RDF-ish (i.e., a tuple/
triple), the second considers a facet as a Topic, which means that
it can play roles in facet hierarchies, have names, occurrences of
its own, etc.
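
The difference between the two approaches can be sketched side by
side. Everything named below (FacetTopic, Occurrence, the "audience"
facet and its values) is hypothetical, chosen only to show the shape
of each option:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FacetTopic:
    """Approach 2: the facet is a topic, so it can sit in a hierarchy."""
    name: str
    value: Optional[str] = None
    broader: Optional["FacetTopic"] = None

    def ancestors(self) -> List[str]:
        """Names of the broader facets above this one, nearest first."""
        chain = []
        node = self.broader
        while node is not None:
            chain.append(node.name)
            node = node.broader
        return chain


@dataclass
class Occurrence:
    resource: str
    # Approach 1: facets as bare (property-name, property-value) pairs.
    pair_facets: List[Tuple[str, str]] = field(default_factory=list)
    # Approach 2: facets as topics in their own right.
    topic_facets: List[FacetTopic] = field(default_factory=list)


audience = FacetTopic(name="audience")
expert = FacetTopic(name="expert-audience", value="expert", broader=audience)

doc = Occurrence("http://example.org/doc1",
                 pair_facets=[("audience", "expert")],
                 topic_facets=[expert])
```

The pair carries the same information with less machinery, but only
the topic version can answer a question like "what broader facet does
this belong to?".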

Obviously, the first RDF-ish version is the simpler of the two and
incurs less overhead (and probably more closely agrees with 13250),
but I think the second is more Topic Map-ish. The latter also provides
no strong separation between what is in the facet "ontology" and what
is in the Topic Map "ontology". This I'd consider a major benefit, as
in modeling there are usually many overlapping layers, where one topic
plays a role of a facet in some facet hierarchy, but may be "faceted"
by other topics. This mesh or web of interconnections comes, to my
mind, close to what I've called the "recursively associative" model
of human thinking and memory, and is one of the reasons I'm excited
about Topic Maps.

So, the question might be, which of these two models is closest to
the definition in ISO 13250? Or, perhaps more importantly, since
either or both can be implemented within the Topic Map model and
specifically in XTM, what's the appropriate way to design both the
language and the XML syntax for describing "facets" in Topic Maps?
As a further incentive to discussion, anyone taking part in this
discussion here will get a mention in my upcoming paper, for what
that's worth... :-)

Murray

* I've named these Model A and Model T after two of Henry Ford's early
automobiles, which weren't competitors to each other, but rather
complementary ideas about what kind of car people wanted. Both were
originally only available in black. Colour came later.
...........................................................................
Murray Altheim                         http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK                    .