[topicmapmail] Fragmented XTM for web metadata, and some ontology?

Kal Ahmed kal@techquila.com
27 Jun 2003 09:18:46 +0100


On Fri, 2003-06-27 at 01:13, Murray Altheim wrote:
> Kal Ahmed wrote:
<snip/>
> > With regards to the DC title element I have more sympathy for Jan
> > Algermissen's analysis that dc.title == baseName.
> 
> Well, we have to decide whether baseName is closer to DC.title or
> DC.subject. I agree with you that a title is a descriptor of a topic,
> and that DC.subject is probably *close* to subjectIdentity.
> 
I too would say that DC.subject is closer to subjectIdentity. But it is
the identity of the subject that the resource carrying the metadata is
about. Let's say that I have DC metadata on a page about TM4J. It would
make sense if I put DC.subject="TM4J" (using whatever form of DC
encoding is appropriate), so I am saying that the page is about TM4J,
not that it represents TM4J. Perhaps this is what you were saying all
along, but as I see it the DC.subject assertion is saying that this
resource could be an occurrence of the topic "TM4J" or in extremis a
subjectIndicator for the topic "TM4J". I think that without any other
knowledge of the domain than just the DC metadata, I would go with the
occurrence view over the subjectIndicator view.
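
To make the two readings concrete, here is a rough Python sketch. It is
not TM4J code or any real topic map API: the Topic class, the
apply_dc_subject function and the example.org URL are all made up for
illustration, and looking the topic up by its name is an assumption. By
default the page becomes an occurrence of the topic named by DC.subject,
and it is only treated as a subjectIndicator if the caller explicitly
asks for that.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical, stripped-down topic: names plus two kinds of resource links."""
    base_names: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)
    subject_indicators: list = field(default_factory=list)


def apply_dc_subject(topics, page_uri, dc, as_indicator=False):
    """Link page_uri to the topic named by dc["subject"].

    topics is a plain dict keyed by subject name (an assumption made for
    this sketch; a real engine would merge on identity, not on a string).
    """
    subject = dc["subject"]
    # Create the topic on first sight, reusing it afterwards.
    topic = topics.setdefault(subject, Topic(base_names=[subject]))
    if as_indicator:
        # "In extremis": the page is taken to indicate the subject itself.
        topic.subject_indicators.append(page_uri)
    else:
        # Default reading: the page is merely *about* the subject.
        topic.occurrences.append(page_uri)
    return topic


topics = {}
tm4j = apply_dc_subject(topics, "http://example.org/about-tm4j.html",
                        {"subject": "TM4J"})
```

After that call the page sits in tm4j.occurrences and
tm4j.subject_indicators stays empty, which is the reading I would choose
when the DC metadata is all I know about the domain.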

> >>>I don't know. Maybe I'm mixing things up too much, or
> >>>maybe I'm missing the incredible power of making the
> >>>occurrences topics instead with associations binding them, but I feel 
> >>>that becomes more of an application specific way than a general "here 
> >>>is the resources" way;
> >
> > What the RDF people found was that if you have literals you reduce the
> > reliability of merging (smooshing to use the RDF term...no, really!) but
> > perhaps for a meta data element that is a value from a range (e.g. date,
> > age, size, GPS position) that is all you can do scalably.
> 
> I'd not want to generalize that far. RDF and TM have many uses, and if
> I'm representing an ontology (in the "traditional" sense of KR), I really
> want slots, quantification and all the other things necessary to state
> sentences. I'm less worried about merging at that point. Generally, a topic
> that is a class (rather than an instance) is only going to express facet
> ranges and perhaps default values, whereas an instance topic is going to
> have "real" values (such as "real" birthdays, eye colors, heights, weights,
> distances from New York to Paris, etc.).
> 
Absolutely. The question is whether one wants to create a topic for each
value in the range (or even for each distinct value possessed by your set
of instances). Whenever this issue comes up, I think "One day I really
need to revisit HyTime FCS" - perhaps there is a way of expressing a
point or range in an FCS in URI syntax. Then one could create instance
values using subject indicators that are computed as a point/range in an
FCS and encoded as a URI. Add a bunch of code that can do FCS transforms
and find intersections between FCS ranges and you have a system for
expressing ranges and values in multiple dimensions and determining when
two values have some overlap (so going beyond simple URI comparison for
merging and enabling more complex merging rules for such values).
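
As a very rough Python sketch of the overlap idea only: the "urn:x-fcs:"
URI syntax below is entirely invented (it is not defined by HyTime, XTM
or anything else), the names Extent, parse_extent and overlaps are
hypothetical, and it handles a single numeric axis with no FCS
transforms at all. A point is written as "axis:value" and a range as
"axis:low..high".

```python
from typing import NamedTuple

# Invented URI prefix for this sketch; not a registered or standard scheme.
PREFIX = "urn:x-fcs:"


class Extent(NamedTuple):
    """A closed interval [low, high] on a named axis; a point has low == high."""
    axis: str
    low: float
    high: float


def parse_extent(uri: str) -> Extent:
    """Parse 'urn:x-fcs:<axis>:<low>..<high>' or 'urn:x-fcs:<axis>:<point>'."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an extent URI: {uri!r}")
    axis, _, span = uri[len(PREFIX):].partition(":")
    low_text, sep, high_text = span.partition("..")
    low = float(low_text)
    high = float(high_text) if sep else low  # no '..' means a single point
    if low > high:
        raise ValueError(f"empty range in {uri!r}")
    return Extent(axis, low, high)


def overlaps(a_uri: str, b_uri: str) -> bool:
    """True if both URIs are on the same axis and their intervals intersect."""
    a, b = parse_extent(a_uri), parse_extent(b_uri)
    return a.axis == b.axis and a.low <= b.high and b.low <= a.high


same = overlaps("urn:x-fcs:height-cm:170..180", "urn:x-fcs:height-cm:175")
apart = overlaps("urn:x-fcs:height-cm:100..120", "urn:x-fcs:height-cm:121..130")
```

Here "same" comes out true even though the two URIs differ as strings,
and "apart" comes out false. That is the step beyond simple URI
comparison; whether overlap should actually trigger a merge would still
be a matter for the merging rules.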

Then I get bombarded with other work and the dream (?) of rereading the
relevant bits of HyTime gets pushed to the bottom of the stack.

> > <snip/>
> > 
> >>>Basically, what I want is a PSI set for the FXTM set
> >>>(both roles and types), and a set of PSIs for DC. The
> >>>latter is the easy part. Where should I turn for the former? Is there 
> >>>anyone looking into a loose web browsing ontology?
> >>
> >>You might look at the "topic map-based" XFML, at http://xfml.org/
> >>which I believe is designed to solve either the same or a similar
> >>problem (depending on how I understand your question).
> >>
> >>By some coincidence, I happen to be working up a short specification
> >>for doing facets in XTM, with the idea that you can associate a
> >>name-value pair to a given topic.
> >>
> >>BTW, Kal Ahmed has a PSI set for faceted classification (FC), and
> >>we're discussing various aspects of it right now. I'm looking at the
> >>description of facets in ISO 13250 as well as how FC works, as while
> >>the use of "facet" is the same, there are some differences (as might
> >>be imagined from terms coming from different communities). I'm
> >>particularly interested in the overlap, for my own work.
> >>
> >>I hope to publish the "Facets in XTM" draft within a week or so. If
> >>I haven't by July 10th, it'll probably be in two weeks after that
> >>(as I'm out of town and probably disconnected). I'll be updating my
> >>"Datatypes in XTM" document at that time as well, as it is used in
> >>typing the facet values, e.g.
> >>
> >>   "Birthday" has type "dateTime" and value "1961-07-04T20:00:00Z"
> >>   "Birthday" has type "date" and value "1961-07-04"
> >
> > Ranges are the issue that I haven't addressed with my proposal for
> > facetted classification - in my model a facet value is taken from some
> > hierarchical classification scheme. So the facetted classification model
> > that I propose is merely a way of indexing topics by multiple disjoint
> > hierarchies.
> 
> Which is a very valuable thing to do, no question. But I *think* we may
> be mixing up the two kinds of facets that have been discussed, which is
> likely my fault. I'm interested and working on *both* FC and TM facets,
> and there is a relation between them that can be exploited. And with
> the Datatypes in XTM being based in XSD, we get facets (as defined
> there) for free, i.e., we get min and max values, etc.
> 

After reading this I now properly understand the difference between
the two things we are talking about. My use of "facet" is much closer
to that of XFML and has nothing at all to do with W3C Schema facets.

Cheers,

Kal