[topicmapmail] occurrence abuse ? Was: [geolang-comment] First proposals for ISO 639 and 3166 available

Jan Algermissen algermissen@acm.org
Tue, 20 Aug 2002 14:28:09 +0200


The message below is taken from the OASIS geolang-comment list. It addresses an
issue that I think of as 'abuse of occurrences'. I think this is of general
interest, so I am reposting it here.


<quote>

Murray Altheim wrote:
> 
> John Cowan wrote:
> 
> > Lars Marius Garshol scripsit:
> >
> >
> >>Basically whether the strings in question are human-oriented names or
> >>labels for the subjects[1]. In my opinion these strings are clearly
> >>occurrences rather than names. The country at the south tip of Africa
> >>has names like 'South Africa' and 'Afrique du Sud', but not like 'ZA',
> >>'ZAF', or '710'.
> >>
> >
> > Fair enough, but I still feel uncomfortable with saying that the string
> > "710" is an *occurrence* of the country denoted by "South Africa",
> > in the same sense that http://www.law.indiana.edu/uslawdocs/declaration.html
> > is an occurrence of the (U.S.) Declaration of Independence.
> > I may be misled by the use of the term "occurrence" here, though.
> >
> > OTOH, "710" does name (within a suitably restricted scope) that country,
> > in the sense of being a unique label for it.  The most central sense of "name",
> > namely personal name, is not a basename in an XTM sense: there are other
> > John Cowans within any non-arbitrary scope (there is another within
> > Reuters, e.g., and we sometimes get each other's mail).
> 
> This is precisely why in the design of the XTM 1.0 country and
> language topic maps I used names rather than occurrences (which
> I had considered). "710" *is* a name for a country within the
> scope of the UN code base. It's not an occurrence of that country.
> Now, as we all know, the language used to describe the topics and
> scopes heavily influences how this all fleshes out; errors are
> easy to make.
> 
> I also have difficulty with using occurrences for another reason.
> Occurrences are (to my mind) the territory being mapped by a topic
> map, not the map. These topic maps are themselves being used as
> maps, such that "occurrences" would be what "users" populate the
> maps with. [I quote those words because I think the potential uses
> of these topic maps so wide as to make such characterizations a
> bit misleading.] I hope you get my drift though: occurrences are
> quite different than names in the topic map paradigm. They exist
> across the gulf from each other.
> 
> Murray


</quote>

I think that it is an error to use occurrences for anything other than
the relationship between a subject and a resource whose content deals
with the subject in some way. In particular, I think it is an abuse
of the notion of 'occurrence' when <resourceData> elements are used to
assign properties to topics.

Example:

<topic id="t1">
  <baseName>
    <baseNameString>Donald Duck</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="#email" /></instanceOf>
    <resourceData>dduck@hotmail.com</resourceData>
  </occurrence>
  <occurrence>
    <instanceOf><topicRef xlink:href="#birthdate" /></instanceOf>
    <resourceData>01.01.1930</resourceData>
  </occurrence>
</topic>

I think that neither the email address nor the birthdate is an
occurrence of the topic. They are properties, and they should be
assigned to the topic via associations (e.g. with a bornBeing-birthdate
association). [1]
Also, making the date a resource 'hides' the fact that it is a subject
in its own right, and it makes it much harder, for example, to look up
all beings that were born on that particular date.
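
A minimal sketch of what such an association might look like in XTM 1.0.
The ids #born-on (association type), #being and #birthdate-role (roles)
and #d1930-01-01 (a topic for the date itself) are made up for this
illustration and do not come from any published topic map:

<!-- hypothetical topic that makes the date a subject in its own right -->
<topic id="d1930-01-01">
  <baseName>
    <baseNameString>01.01.1930</baseNameString>
  </baseName>
</topic>

<!-- hypothetical association linking the being to its birthdate -->
<association>
  <instanceOf><topicRef xlink:href="#born-on" /></instanceOf>
  <member>
    <roleSpec><topicRef xlink:href="#being" /></roleSpec>
    <topicRef xlink:href="#t1" />
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#birthdate-role" /></roleSpec>
    <topicRef xlink:href="#d1930-01-01" />
  </member>
</association>

With the date as a topic, looking up everyone born on it amounts to
collecting the #born-on associations in which #d1930-01-01 plays the
#birthdate-role role.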

Since the <resourceData> element makes it so dangerously easy to take anything
that comes in the form of a string (dates, addresses, measurements such as the
height of a person), wrap it inside <resourceData> and make it an occurrence of
a topic, I am curious


* if this was actually the intention of the XTM authoring group


* what others think about this issue and in particular how others
  solve the assignment of properties to topics


Jan


[1] Since we can consider the email address to be an unambiguous identifier
    of the topic, it might be a solution to make it a base name.
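
    A sketch of that variant, assuming the existing #email topic is
    reused as the scope of the additional base name (one possible
    choice, not something the XTM specification prescribes):

<topic id="t1">
  <baseName>
    <baseNameString>Donald Duck</baseNameString>
  </baseName>
  <!-- assumption: the email address as a base name in the scope #email -->
  <baseName>
    <scope><topicRef xlink:href="#email" /></scope>
    <baseNameString>dduck@hotmail.com</baseNameString>
  </baseName>
</topic>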




-- 
Jan Algermissen
Consultant & Programmer

Tel:   ++49 (0)40 89 700 511
       ++49 (0)177 283 1440
Fax:   ++49 (0)40 89 700 841 
Email: algermissen@acm.org
Web:   http://www.topicmapping.com