[topicmapmail] Fragmented XTM for web metadata, and some ontology?
Kal Ahmed
kal@techquila.com
27 Jun 2003 09:03:16 +0100
On Fri, 2003-06-27 at 07:34, Alexander Johannesen wrote:
> Alexander wrote:
> >> which is essentially saying "here are the untyped values". Since we
> >> want to both name and type our values, and because occurrences don't
> >> have names, each facet is a topic.
>
> Kal wrote:
> > This is where I differ. My understanding of the semantics of occurrence
> > is that an occurrence specifies some information that is in some way
> > related to the subject. I do understand that a number of other topic map
> > practitioners believe that an occurrence is only a resource in which the
> > subject is mentioned or described in some way - so whereas I would say
> > that metadata such as creation date can be expressed in an occurrence,
> > they would point at the occurrence and say "where's the subject ?" :-)
> >
> > With regards to the DC title element I have more sympathy for Jan
> > Algermissen's analysis that dc.title == baseName.
>
> As much as I understand and _somewhat_ agree with that, it does not
> sound right to me to do "dc.date.publish == baseName". My thinking
> in terms of occurrences is that we could probably all get along much
> better if we deprecate "resourceData". :)
>
Then we agree on that.
> > What the RDF people found was that if you have literals you reduce the
> > reliability of merging (smooshing to use the RDF term...no, really!) but
> > perhaps for a meta data element that is a value from a range (e.g. date,
> > age, size, GPS position) that is all you can do scalably.
>
> The term "everything is a topic" is a thin line between "everything
> is semantic" and "everything is a property". When doing TMs, I tend to use
> occurrences for anything pointing out of the current document
> and for one-time and / or unique data. I guess this clubs down on
> the versatility of TMs, but then again, there are no standardised
> models to use, only a few good best-practices that some might use?
> (Your article that Murray linked to is such a document)
>
As are Murray's proposals for facet PSIs. Having read the arguments in
the thread preceding this posting I can absolutely see the value of the
facet concept that is being put forward and the use of associations for
expressing facet values.
Expressing the facet-as-subject-property concept using associations has
the distinct advantage that for a given set of topic maps, the value set
for a given property can be easily computed which would make an
efficient index a pretty straightforward thing to write. That is in
addition to the expressive power that one gets from being able to make
assertions about the facet values (e.g. I can express exactly what I
mean by the facet value "blue").
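
To illustrate the indexing point, here is a minimal sketch in Python. It assumes a hypothetical, flattened view of facet associations (the FacetAssociation tuple and the function names are invented for this example and are not the API of TM4J or any other engine). It only shows that when facet values are topics linked by associations, collecting the value set across several topic maps is a single pass:

```python
from collections import defaultdict
from typing import NamedTuple


class FacetAssociation(NamedTuple):
    """Hypothetical flattened association: subject --facet--> value topic."""
    facet: str    # identifier of the facet (association type), e.g. a PSI
    subject: str  # topic playing the "subject" role
    value: str    # topic playing the "facet value" role


def build_facet_index(topic_maps):
    """Map each facet to {value topic: set of subject topics}."""
    index = defaultdict(lambda: defaultdict(set))
    for topic_map in topic_maps:
        for assoc in topic_map:
            index[assoc.facet][assoc.value].add(assoc.subject)
    return index


def value_set(index, facet):
    """Return every value topic used for the given facet."""
    return set(index.get(facet, {}))


# Two small, made-up topic maps sharing the same facet.
tm_a = [
    FacetAssociation("colour", "car-1", "blue"),
    FacetAssociation("colour", "car-2", "red"),
]
tm_b = [FacetAssociation("colour", "bike-1", "blue")]

index = build_facet_index([tm_a, tm_b])
colours = value_set(index, "colour")
blue_things = index["colour"]["blue"]
```

Because "blue" is itself a topic in this model, further assertions about it can be attached elsewhere in the map without touching the index.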
On the other hand there are some properties that I just don't want to do
that for, e.g. dates and date ranges. And there will be some cases where
the association model, for all its power, is just too heavy for my
application (e.g. I want to reduce the number of traversals made by the
application) or doesn't fit with the rest of my model (e.g. I have made
a decision that values I consider to be metadata shall not be
represented as topics).
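
For contrast, a rough sketch of the lighter-weight alternative, where a date is held as a literal occurrence value on the topic. The dictionary layout and the published_between helper are assumptions made up for illustration (only the "dc.date.publish" name comes from this thread). The point is that a range query needs no association traversal, only a comparison on the literal:

```python
from datetime import date

# Hypothetical model: each topic id maps to its typed literal occurrences.
topics = {
    "doc-1": {"dc.date.publish": date(2003, 6, 27)},
    "doc-2": {"dc.date.publish": date(2002, 1, 15)},
    "doc-3": {},  # no publication date recorded
}


def published_between(topics, start, end, occ_type="dc.date.publish"):
    """Return sorted ids of topics whose date occurrence lies in [start, end]."""
    return sorted(
        topic_id
        for topic_id, occurrences in topics.items()
        if occ_type in occurrences and start <= occurrences[occ_type] <= end
    )


in_2003 = published_between(topics, date(2003, 1, 1), date(2003, 12, 31))
```

The cost is the one noted earlier in the thread: literal values are harder to merge reliably than topics.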
There seem to be at least two practices coming out of this discussion,
each with strengths and weaknesses. In the paper on faceted classification,
I introduced a form of pattern description (using UML diagrams and
prose) that I feel goes beyond PSIs and is aimed precisely at describing
patterns such as these. Perhaps as things emerge from this discussion
I'll try and express the different patterns (as I understand them) and
publish those too.
> I'm a minimalist trying to keep the overhead (and number of objects)
> low. Maybe I shouldn't be too concerned with this, though; those
> excellent TM4J and OKS guys can figure out how to handle these things
> quickly and stably. :)
>
That's another tradeoff to consider too - of course the purist in me puts
the modelling considerations before the tool restrictions, but in
reality when you have a deadline for delivery and a customer expecting
performance you have to work with what you have.
> ...
>
> > As long as the topics don't turn into lizards and you don't end up in a
> > Las Vegas casino with your lawyer, you'll do just fine :)
>
> Happened a lot lately, Kal? :)
>
Well I have been staring at a *lot* of TM4J code recently :)
Cheers,
Kal