[topicmapmail] Fragmented XTM for web metadata, and some ontology?

Murray Altheim m.altheim@open.ac.uk
Fri, 27 Jun 2003 01:13:19 +0100


Kal Ahmed wrote:
> Alexander Johannesen proposed:
> 
>>>  <topic id="dc:type">
>>>     <subjectIdentity>
>>>        <subjectIndicatorRef
>>>            xlink:href="http://purl.org/dc/elements/1.1/type" />
>>>     </subjectIdentity>
>>>  </topic>
>>>
>>>( id-attributes follow the DC recommended guideline for DC in XML )
>
> and On Thu, 2003-06-26 at 14:29, Murray Altheim wrote:
> 
>>I'm not sure when the guidelines were written, but the colon is prohibited
>>as a name character nowadays, in accordance with the XML Namespaces 1.1
>>Recommendation:
>>
>>   http://www.w3.org/TR/xml-names11/#Conformance
>>
>>The reason it was allowed as a Name character was to allow experimentation;
>>in the end I believe it was decided that "namespaced IDs" didn't make much
>>sense. Thank God for that.
> 
> Perhaps you can use the dc.title form that is used in the HTML
> representation of DC instead ?

Yes, that's what I believe is recommended for DC-in-HTML and what I've
suggested in Augmented Metadata. It's safe and well-understood as a way
of representing DC content in attribute values. I think (unless I'm
mistaken) that you should capitalize the "DC" (rather than "dc.type"
it should be "DC.type"). DC advocates a way of "declaring" namespace
prefixes, described within their documents, and we could create a method
of doing that in XTM without such trickery.
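
To make the ID issue concrete, here is a minimal sketch in Python. It is
only an illustration: the helper names is_ncname and to_dotted_id are
hypothetical, and the pattern is an ASCII-only approximation of NCName
(the real production admits many more Unicode characters). It shows that
a colon-bearing ID fails the check while the dotted, capitalized form
passes.

```python
import re

# Simplified, ASCII-only approximation of an NCName (an assumption, not the
# full production): a letter or underscore, then letters, digits, '.', '-'
# or '_'. A colon is not allowed anywhere.
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname(name: str) -> bool:
    """Return True if `name` matches the simplified NCName pattern."""
    return _NCNAME_ASCII.fullmatch(name) is not None


def to_dotted_id(qname: str) -> str:
    """Turn 'dc:type' into 'DC.type': upper-case the prefix, swap ':' for '.'."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        return qname  # no prefix, nothing to rewrite
    return prefix.upper() + "." + local.replace(":", ".")


# The IDs below are the ones used in the quoted XTM fragments.
for original in ("dc:type", "dc:title", "dc:date:publish"):
    dotted = to_dotted_id(original)
    print(original, is_ncname(original), "->", dotted, is_ncname(dotted))
```

Any xlink:href fragment that points at a renamed ID (e.g. "#dc:title")
would of course need the same rewrite.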

>>>I had some problems grokking the associative DC metadata in Jan's 
>>>document, but that might be because I don't understand why
>>>they should be expressed that way. Maybe my thinking is very
>>>slow, but I'd like an ontology (called FXTM for now) ;
>>>
>>
>>[example elided...]
>> > My thinking says that for a given page, you have the following XTM;
>> >
>>
>>>  <!-- root node is the fragmented XTM's resource -->
>>>
>>>  <topic id="root">
>>>     <instanceOf>
>>>        <topicRef xlink:href="#fxtm:page"/>
>>>     </instanceOf>
>>>     <occurrence>
>>>        <instanceOf>
>>>           <topicRef xlink:href="#dc:title" />
>>>        </instanceOf>
>>>        <resourceData>Some title</resourceData>
>>>     </occurrence>
>>>     <occurrence>
>>>        <instanceOf>
>>>           <topicRef xlink:href="#dc:date:publish" />
>>>        </instanceOf>
>>>        <resourceData>[Some date]</resourceData>
>>>     </occurrence>
>>>  </topic>
>>>
>>>Where the TMs for FXTM and DC are implicit part of the
>>>namespaces used (and merged in according to the FXTM
>>>spec?).
>>
>>I don't think it's appropriate semantically to include properties
>>of a given topic as occurrences of that topic -- it's just not
>>sensible, doesn't "read" right to me. You yourself stated the sentence:
>>
>>     "here is the resources"
>>
>>which is essentially saying "here are the untyped values". Since we
>>want to both name and type our values, and because occurrences don't
>>have names, each facet is a topic.
> 
> This is where I differ. My understanding of the semantics of occurrence
> is that an occurrence specifies some information that is in some way
> related to the subject. I do understand that a number of other topic map
> practitioners believe that an occurrence is only a resource in which the
> subject is mentioned or described in some way - so whereas I would say
> that metadata such as creation date can be expressed in an occurrence,
> they would point at the occurrence and say "where's the subject ?" :-)

It just sounds wrong to me stated as a simple English sentence, which is
usually a good approach to judging semantics. I would never say that a
property of a topic is also an occurrence of a topic. "My mother has blue
eyes" states a property, whereas an occurrence of my mother is either my
mother or some addressable resource about her, etc. I can't think of
many properties of things that are even in the same branch of the ontology
as the thing itself.

> With regards to the DC title element I have more sympathy for Jan
> Algermissen's analysis that dc.title == baseName.

Well, we have to decide whether baseName is closer to DC.title or
DC.subject. I agree with you that a title is a descriptor of a topic,
and that DC.subject is probably *close* to subjectIdentity.

>>>I don't know. Maybe I'm mixing things up too much, or
>>>maybe I'm missing the incredible power of making the
>>>occurrences topics instead with associations binding them, but I feel 
>>>that becomes more of an application specific way than a general "here 
>>>is the resources" way;
>
> What the RDF people found was that if you have literals you reduce the
> reliability of merging (smooshing to use the RDF term...no, really!) but
> perhaps for a meta data element that is a value from a range (e.g. date,
> age, size, GPS position) that is all you can do scalably.

I'd not want to generalize that far. RDF and TM have many uses, and if
I'm representing an ontology (in the "traditional" sense of KR), I really
want slots, quantification and all the other things necessary to state
sentences. I'm less worried about merging at that point. Generally, a topic
that is a class (rather than an instance) is only going to express facet
ranges and perhaps default values, whereas an instance topic is going to
have "real" values (such as "real" birthdays, eye colors, heights, weights,
distances from New York to Paris, etc.).
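
The class/instance distinction above could be sketched roughly as follows.
This is a toy model in Python, not anything taken from XTM or ISO 13250:
the class names, the facet names and the numbers are all made up for
illustration. A class topic carries only a range and an optional default
for each facet, while an instance topic carries the "real" value.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class FacetDeclaration:
    """What a class topic says about a facet: an allowed range and a default."""
    name: str
    allowed_range: Tuple[float, float]  # inclusive (min, max)
    default: Optional[float] = None


@dataclass
class ClassTopic:
    name: str
    facets: Dict[str, FacetDeclaration] = field(default_factory=dict)


@dataclass
class InstanceTopic:
    name: str
    instance_of: ClassTopic
    values: Dict[str, float] = field(default_factory=dict)

    def facet_value(self, facet_name: str) -> Optional[float]:
        """Return the instance's own value, falling back to the class default."""
        if facet_name in self.values:
            return self.values[facet_name]
        return self.instance_of.facets[facet_name].default

    def in_range(self, facet_name: str) -> bool:
        """Check the effective value against the range declared on the class."""
        low, high = self.instance_of.facets[facet_name].allowed_range
        value = self.facet_value(facet_name)
        return value is not None and low <= value <= high


# Hypothetical class topic: ranges and defaults only, no "real" values.
person = ClassTopic("person", {
    "height-cm": FacetDeclaration("height-cm", (30.0, 280.0)),
    "eye-count": FacetDeclaration("eye-count", (0.0, 2.0), default=2.0),
})

# Hypothetical instance topic: supplies a real height, inherits the default.
someone = InstanceTopic("someone", person, {"height-cm": 165.0})
print(someone.facet_value("height-cm"), someone.facet_value("eye-count"))
```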

> <snip/>
> 
>>>Basically, what I want is a PSI set for the FXTM set
>>>(both roles and types), and a set of PSIs for DC. The
>>>latter is the easy part. Where should I turn for the former? Is there 
>>>anyone looking into a loose web browsing ontology?
>>
>>You might look at the "topic map-based" XFML, at http://xfml.org/
>>which I believe is designed to solve either the same or a similar
>>problem (depending on how I understand your question).
>>
>>By some coincidence, I happen to be working up a short specification
>>for doing facets in XTM, with the idea that you can associate a
>>name-value pair to a given topic.
>>
>>BTW, Kal Ahmed has a PSI set for faceted classification (FC), and
>>we're discussing various aspects of it right now. I'm looking at the
>>description of facets in ISO 13250 as well as how FC works, as while
>>the use of "facet" is the same, there are some differences (as might
>>be imagined from terms coming from different communities). I'm
>>particularly interested in the overlap, for my own work.
>>
>>I hope to publish the "Facets in XTM" draft within a week or so. If
>>I haven't by July 10th, it'll probably be in two weeks after that
>>(as I'm out of town and probably disconnected). I'll be updating my
>>"Datatypes in XTM" document at that time as well, as it is used in
>>typing the facet values, e.g.
>>
>>   "Birthday" has type "dateTime" and value "1961-07-04T20:00:00Z"
>>   "Birthday" has type "date" and value "1961-07-04"
>
> Ranges are the issue that I haven't addressed with my proposal for
> facetted classification - in my model a facet value is taken from some
> hierarchical classification scheme. So the facetted classification model
> that I propose is merely a way of indexing topics by multiple disjoint
> hierarchies.

Which is a very valuable thing to do, no question. But I *think* we may
be mixing up the two kinds of facets that have been discussed, which is
likely my fault. I'm interested in and working on *both* FC and TM facets,
and there is a relation between them that can be exploited. And with
the Datatypes in XTM being based on XSD, we get facets (as defined
there) for free, i.e., we get min and max values, etc.
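
As a rough sketch of what typed facet values with min and max bounds might
look like in practice, here is a small Python example using the two
"Birthday" values quoted above. It is an assumption-laden illustration, not
the "Datatypes in XTM" design: the function names are hypothetical, only
the XSD "date" and "dateTime" types are covered, and only the "Z" (UTC)
form of dateTime is parsed.

```python
from datetime import date, datetime, timezone


def parse_datetime(text: str) -> datetime:
    """Parse an xsd:dateTime lexical value in its UTC ('Z') form only."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def parse_date(text: str) -> date:
    """Parse an xsd:date lexical value without a timezone suffix."""
    return date.fromisoformat(text)


# Hypothetical lookup from XSD datatype name to parser.
PARSERS = {"dateTime": parse_datetime, "date": parse_date}


def typed_value(datatype: str, lexical: str):
    """Convert a (datatype, lexical value) pair into a Python value."""
    return PARSERS[datatype](lexical)


def within(value, min_inclusive=None, max_inclusive=None) -> bool:
    """Apply minInclusive / maxInclusive style bounds to a typed value."""
    if min_inclusive is not None and value < min_inclusive:
        return False
    if max_inclusive is not None and value > max_inclusive:
        return False
    return True


birthday = typed_value("date", "1961-07-04")
birthday_dt = typed_value("dateTime", "1961-07-04T20:00:00Z")

# Bounds are arbitrary illustration values.
print(within(birthday, date(1900, 1, 1), date(2003, 6, 27)))
print(within(birthday, min_inclusive=date(1970, 1, 1)))
```

Once the value is typed, the range check is an ordinary comparison, which
is the sense in which the bounds come "for free".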

>>I have had a DC topic map for about two years now, unpublished. Since
>>that might be useful with the facet TM, if I have time I'll also put
>>that online.
> 
>>Since the term "facet" also occurs in XML Schema datatypes, I'm
>>starting to feel a bit surrounded by the damn things, like some
>>kind of bad drug trip...
> 
> As long as the topics don't turn into lizards and you don't end up in a
> Las Vegas casino with your lawyer, you'll do just fine :)

Heh. Friday night is gin night!

Murray

...........................................................................
Murray Altheim                         http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK                    .

        "There's a lot of intelligence out there that you don't
         know if it's true or not."  -- Anonymous US official
         http://news.bbc.co.uk/1/hi/world/middle_east/3014850.stm