Use and abuse of occurrence RE: [topicmapmail] Are Facets Really Simple After All?
Kal Ahmed
kal@techquila.com
01 Dec 2003 13:06:01 +0000
On Mon, 2003-12-01 at 12:28, Jan Algermissen wrote:
> Kal Ahmed wrote:
>
> > From what follows I suspect that you are talking
> > about properties of the subject, not properties of the topic, right ?
>
> Right. (Sorry, that wasn't very precise.)
>
>
> > > b) <occurrence> really has the semantics that make it suitable for
> > > using it to assign properties to a topic.
> > >
> > > If it is b) then we are fine but if it is a) I'd rather have XTM extended
> > > (fixed) than to carry this 'error' onward.
> > >
> >
> > If you are talking about properties of a topic, I have already outlined
> > in my reply to Murray how occurrences can be used to represent that (by
> > creating a topic whose subject is the topic to which you wish to apply
> > the meta data). If you are talking about subject properties, then I
> > don't have a problem with the use of occurrences for this.
> >
> > > I am deeply convinced that a) is correct and that XTM should be extended
> > > with an element that allows assigning property value pairs to topics.
> > >
> > > What do others (especially authors of topic maps, not XTM software developers)
> > > think about this?
> > >
> >
> > That excludes almost everyone who has taken part in this conversation so
> > far ;-)
>
> Well...I seek the opinions of end-users.
>
Hmm, but I am an end user. I just happen to develop software too.
> >
> > >
> > > Some issues to consider:
> > >
> > > * All interpretations of the <occurrence> element I have seen so far
> > > (PMTM4, various APIs, the current SAM draft) regard an occurrence
> > > (that is: the relationship between topic and the other end) as something
> > > with identity (occurrences have type(s), scope, are represented as
> > > objects in their own right, etc). This conceptual and implementation
> > > overhead does not seem to match the semantics of a simple property
> > > (e.g. "Jim weighs 150kg")
> > >
> >
> > > * Two very important properties of topics: the SubjectAddress and the
> > > SubjectIndicators are *not* interpreted with the complexity of
> > > an occurrence - they are understood to be simply property class/value
> > > pairs attached to a topic. So, why the overhead for "Jim's age is 34"
> > > but not for "Jim has {URI} as the set of subject indicators". Makes
> > > not much sense to me.
> > >
> >
> > The subject address and subject indicators do not need typing,
>
> Huh? Sure those properties have a type - how else could one make
> sense of the values?
>
I am trying to keep things clear. Obviously failing ;-) There are
properties that express the subject address and subject indicators of a
topic. But the values of those properties are not typed. The
[occurrences] property of a topic consists of a sequence of typed
values.
> > they do
> > not need scoping. But given the topic "Jim" and the string "34", how do
> > I establish the age property without a type.
>
> I am not saying that property values do not need a property type! All I am
> saying is: lower the required overhead.
>
OK, so you concede that typing is not an overhead.
> > How do I establish that my
> > statement is only valid in the context of "my best guess from looking at
> > Jim" ? I would need scope to do that.
>
> Is that the usual use-case for simple properties????
>
Why do you want to prevent this use case ?
> If you need the overhead: reify the property-value and use an assertion, what is
> the problem?
>
What is the problem with scope being optional ?
> >
> > Both type and scope are optional.
>
> I am not only talking about type and scope! You still need more complexity
> to store an occurrence than to store a value.
>
An occurrence consists of a value (string or locator), a type, and a
scope. That's it. Where is this additional complexity?
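
As a minimal sketch of that shape (Python is used purely for illustration;
the class and field names are assumptions, not taken from the Topic Maps
Data Model or from any engine's API):

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Occurrence:
    """Illustrative only: a value plus an optional type and an optional scope."""
    value: str                            # a string literal or a locator (URI)
    type: Optional[str] = None            # hypothetical id of the typing topic
    scope: FrozenSet[str] = frozenset()   # empty set means no scope was given


# A "simple property": a typed value with no scope.
age = Occurrence(value="34", type="age")

# The same value, but only valid in a stated context.
guess = Occurrence(value="34", type="age", scope=frozenset({"best-guess"}))
```

An author who does not need scope simply leaves it out; nothing else about
the structure changes.
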
> > So if your model of the world does not
> > need the "overhead", then don't use those parts of the topic maps model.
> >
> > > * XTM is just a single format for representing topic map information,
> > > I don't think it is wise to use XTM as a base for arguing.
> >
> > That XTM is a single format for representing topic map information is
> > true, but all that I have said above applies equally to the Topic Maps
> > Data Model.
>
> Which one? The SAM draft?
>
ISO 13250-2 that is in committee draft.
> >
> > > IOW: if
> > > we allow XTM to dictate that any abstract model for topic maps does
> > > not have a property facility we impose unnecessary complexity on
> > > interpretations of other formats. Consider processing RDF represented
> > > Dublin Core into a topic map....is it really clever to insist that all
> > > attributes of a resource are to be represented as occurrences? Just
> > > because XTM lacks a property facility?
> >
> > What would a property facility provide that is not provided by
> > occurrences ?
>
>
> * less overhead for topic map authors
In what sense? Syntactically ? I thought we agreed not to discuss the
XTM syntax and its shortcomings. If you mean that occurrences carry
semantic baggage that is overhead for topic map authors, perhaps that is
true to some extent. But I don't think that learning "occurrence = type,
scope and value" is a lot of "overhead".
> * more space and time efficient implementations
>
Surely that is an implementation consideration. You could do that now
with occurrences. Perhaps setting aside a special table for occurrences
of a specific type or of a specific meta-type. Perhaps restricting such
occurrences to contain only string values. Perhaps with some other funky
application-specific optimisations. No one is stopping you. No one is
stopping you from creating your own syntax either. But I still fail to
see the need to change the Topic Maps Data Model.
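
To make the "special table" idea concrete, here is a rough sketch of one
way an engine might do it. Everything in it is hypothetical: the class
name, the method names and the storage layout are assumptions for
illustration, and it further assumes at most one unscoped value per topic
for each optimised type.

```python
from collections import defaultdict


class OccurrenceStore:
    """Hypothetical engine sketch: unscoped occurrences of selected types
    go into a compact per-type table; everything else is stored generically."""

    def __init__(self, fast_types):
        self.fast_types = set(fast_types)
        self.fast = defaultdict(dict)      # type -> {topic: value}
        self.generic = defaultdict(list)   # topic -> [(type, value, scope)]

    def add(self, topic, type_, value, scope=frozenset()):
        if type_ in self.fast_types and not scope:
            # Optimised path: one unscoped value per topic and type.
            self.fast[type_][topic] = value
        else:
            self.generic[topic].append((type_, value, frozenset(scope)))

    def occurrences(self, topic):
        """Return all occurrences of a topic as (type, value, scope) tuples."""
        found = [(t, table[topic], frozenset())
                 for t, table in self.fast.items() if topic in table]
        return found + list(self.generic.get(topic, []))
```

Callers of occurrences() see the same (type, value, scope) form whichever
table a value lives in, so the optimisation stays an engine detail and the
data model is untouched.
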
>
> Not enough?
>
>
> > AFAIK RDF statements *are* resources and *do* have identity, so the
> > assignment of "34" to "Jim" is something that can be identified in the
> > RDF model, just as an occurrence of the topic "Jim" is something that
> > can be identified in the XTM model, so I don't see the divide that you
> > imply here.
>
> Seriously: if you have 1 million subjects that all have age, height, weight, etc.
> you don't care to create an occurrence for each of these properties?
>
No, I don't mind doing that at all. Why should I ? I can still optimise
my engine (or as an end user choose the engine that is optimised) to
handle this case. I am not prevented from doing so by the Topic Maps
Data Model.
> I don't get that. Is really everyone comfortable with this?
>
I'm speaking only for myself. Not for anyone else.
>
> >
> > >
> > > * Introducing a new element does not harm existing XTM documents, they
> > > can easily be transformed into instances of the new DTD.
> > >
> >
> > Just because it can be done doesn't mean it should be ;-) What is the
> > real objection here ?
> >
> > Is it that you want simpler syntax? I guess not from your comment about
> > XTM
>
> I want a more self-consistent model I think. Why not represent SubjectAddress
> and SubjectIndicators as occurrences too? Why have additional constructs in
> the model if occurrence can do that? Why make the model more complicated than
> it needs to be?
>
Because occurrences are not used to establish identity and there is no
mechanism in topic maps for saying "this property establishes identity"
and "this property does not". That's one of the powers of the Topic Maps
Reference Model, but it's not part of the Topic Maps Data Model.
> >
> > Is it that you want a simpler data model ? Well you would still need to
> > type your properties so you only lose the scope, and that's optional
> > anyway.
>
> see above.
>
> > Is it that you want a clearer semantic distinction between the use of a
> > literal string or resource as a subject meta data value and the use of
> > the literal string or resource as a piece of information that is not
> > meta data ? If so, then what is wrong with defining PSIs for this
> > distinction (e.g. as a meta-type: type the "Age" topic as "Subject
> > Property Type", using a PSI for "Subject Property Type")
> >
> > Or is it something else that I haven't understood from the foregoing ?
>
> I think that I fail to express the underlying motivations/reasons that lead
> to my concerns.
>
> Maybe it helps to try the other way round:
>
> Do you think that a data model (SAM draft) that uses several constructs
> (SubjectAddress, SubjectIndicators properties and occurrence) to achieve something
> that could be done with a single one (occurrence) is optimal? Same goes for
> base name of course since that can be represented as an occurrence too.
>
> So, why is the model more complex than it needs to be?
>
Because if you don't you end up with RDF. Sorry, that's a flippant
answer. The reason is that the topic maps data model provides a certain
degree of semantics beyond a simple graph of objects with properties. In
my book that's a Good Thing and I also happen to think that the balance
in the current Topic Maps Data Model is about right.
Cheers,
Kal
--
Kal Ahmed, Techquila
Standards-based Information Management
e: kal@techquila.com
w: www.techquila.com
p: +44 7968 529531