[topicmapmail] Identities and names (WAS - A somewhat new topic maps format)

Bernard Vatant bernard.vatant@mondeca.com
Fri, 8 Aug 2003 12:12:53 +0200


Murray

Just a complement to my parallel answer to Daniel

> > "Bunzilla" is the (or a) name of Murray's rabbit
>
> Actually, only one, her informal name. Her formal name is (honestly)
> "Burroughs Deeply Underhay" (the first name after William S. since
> she shares his attitude, the second after "Truly, Madly, Deeply", and
> the third after the surname of the guy I bought my Mini from).

Hmmm ... is This Your Own Invention ?

> > "http://www.altheim.com/bunny/#bid0537" is what the name "Bunzilla" is
> > called in the scope of Murray's topic map of bunny names and
> > the subject indicator formalism.
>
> Hmm. I'd say the bID is an identifier for the name. More below.

Agreed

> > "X is called 'A'" is synonymous with "'A' is a name of X"
>
> Here's where perhaps we can try to clear up some terminology. (Bear with
> me here, I'm trying out some new shoes). An identifier is a subclass of
> name that confers unique identity within a specific context or scope.

... context including a specific mechanism or protocol through which
identification is processed

> for example, a US Social Security Number uniquely identifies an
> individual within the US Social Security Administration (and increasingly
> within every part of government and private life too, despite their
> original intention), whereas there may be many individuals sharing the
> same name.

Good example ... and SSN identifiers are used in many contexts and
different protocols *beyond their original intention*. So are URIs, and so
has been, and will be, any name. We have to take that into account ...

>      "http://www.altheim.com/bunny/#bid0537" is a unique identifier
>      for the name "Bunzilla" in the scope of the Comprehensive Bunny
>      Name List.

... because there is, in this scope, a definite identification protocol,
although what exactly this protocol is would certainly be difficult to
make explicit, since it involves both the computer's ability to resolve
fragment identifiers and human interaction with the system. Tricky stuff ...

> Since there is an inherent domain built into the identifier itself
> ("http://www.altheim.com/" or "http://www.altheim.com/bunny") we
> have the advantage here of a built-in context or scope. Now, to make
> this more complicated I might point out that I made a mistake in
> posting Bunzilla's bID, as the official domain of bIDs is not
> altheim.com but a PURL that happens to resolve to it, so her real
> bID is actually
>
>     http://purl.org/ceryle/bunny/#bid0537
>
> This means that there are two unique identifiers for "Bunzilla"
> (the name): an official one, and an unofficial one. Then again,
> there's also the IP address version too:
>
>     http://132.174.1.35/ceryle/bunny/   [via purl.org's IP address]
>     http://155.212.7.37/bunny/          [via altheim.com's IP address]

That's why a subject indicator should declare which "canonical" URI should
be used as its subject identifier. This is one of the requirements of the
PubSubj TC Specification, precisely to avoid the use as subject identifiers
of the many URIs that actually redirect to the same subject indicator ...
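
To make the "canonical URI" idea concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions: the alias
table and the function names are hypothetical, and nothing here is taken
from the PubSubj TC Specification itself. It simply maps every URI known
to reach the same subject indicator onto the one declared as canonical.

```python
# Hypothetical illustration: the table and function names are assumptions,
# not part of any specification.
CANONICAL = "http://purl.org/ceryle/bunny/#bid0537"

# Every URI known to lead to the same subject indicator maps to the
# single URI that the indicator declares as canonical.
ALIASES = {
    "http://www.altheim.com/bunny/#bid0537": CANONICAL,
    CANONICAL: CANONICAL,
}


def subject_identifier(uri: str) -> str:
    """Return the declared canonical URI for a known alias, else the URI unchanged."""
    return ALIASES.get(uri, uri)


def same_subject(uri_a: str, uri_b: str) -> bool:
    """Two URIs identify the same subject if they share a canonical form."""
    return subject_identifier(uri_a) == subject_identifier(uri_b)
```

With such a declaration, the official and unofficial bIDs compare as the
same subject, while an unknown URI is left alone rather than guessed at.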

> Now, I point this out to highlight the fact that behind the scenes of
> all this is a regression of name resolution issues, not just here and
> on computers, but in "real life" too. This is what I think confuses
> TimBL in his classic name-address myth nonsense.

Agreed. But my bunny Timberlee does not make this confusion :)

> Now, in order to create a unique identity an identifier (a kind of
> name) must be used within a context or scope.

Hear, hear. This is the central issue ...

> The regression of naming
> works in progression in scoping, such that to *further* identify an
> individual using an identifier within a scope, one then takes that
> identifier and uses it within a new scope to provide further identity,
> e.g., one could take the bID for the name "Bunzilla" (since that is
> an appropriate identifier for my bunny), and provide further scope
> or context by saying she's "Murray's bunny" or use her mailing address
> or geospatial location or mother's maiden name, etc., noting that
> each of those identifiers also has scope/context. So, in reality,
> there is no name/identifier differentiation, just a matter of level
> of contextualization...

Exactly. This is perfectly clear (to me at least). Let me try to state it
otherwise:

Names can be defined in some absolute "ontological" way. You can define a
class "Name" as a subclass of "String" and say "Bunzilla" is an instance of
this class "Name". That means that this string has been used at least once
by someone to name something. Note that, in that respect, you are perfectly
right to insist on including in your PSI list only names that have
*actually* been used. In that sense, URIs are names, since people use and
abuse them to name all sorts of things.

OTOH, there is no such thing as an absolute identifier, but there are
processes of identification, which use names. An identification process
uses names in a specific context, through a specific protocol, to figure
out whether two things (subjects) are identical or not.

Note that part of the identification process is to check whether the name
is well-formed; IOW, any identification context-protocol uses only a certain
class of well-formed names (perhaps called a vocabulary).

From that perspective, an identifier is simply a name used in some
identification process. Of course the same name, used by different
processes, can identify different things. So interoperability between
different identification processes should be based on a formal agreement
about a shared class of well-formed names, and on the guarantee that any
such well-formed name identifies the same subject regardless of the
differences between processes.
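
The view above can be sketched in a few lines of Python. This is a rough
illustration only, not a definitive model: the class
IdentificationProcess, the helper interoperable(), the well-formedness
pattern and the sample subjects are all hypothetical names and data
chosen for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class IdentificationProcess:
    """A context plus a protocol: a well-formedness rule and a name table.

    Hypothetical model for illustration; not taken from any specification.
    """
    scope: str
    well_formed: re.Pattern
    subjects: dict = field(default_factory=dict)

    def identify(self, name: str):
        # Part of the process is rejecting names outside its vocabulary.
        if not self.well_formed.fullmatch(name):
            raise ValueError(f"{name!r} is not well-formed in scope {self.scope!r}")
        # Returns None when the name is well-formed but unknown here.
        return self.subjects.get(name)

    def identical(self, name_a: str, name_b: str) -> bool:
        # Two names denote identical subjects only relative to this process.
        a, b = self.identify(name_a), self.identify(name_b)
        return a is not None and a == b


def interoperable(p, q, shared_names) -> bool:
    """True if every shared name identifies the same subject in both processes."""
    return all(p.identify(n) == q.identify(n) for n in shared_names)


# Assumed vocabulary: a capital letter followed by lowercase letters.
vocabulary = re.compile(r"[A-Z][a-z]+")

bunnies = IdentificationProcess(
    scope="bunny names",
    well_formed=vocabulary,
    subjects={"Bunzilla": "Murray's rabbit"},
)
elsewhere = IdentificationProcess(
    scope="some other list",
    well_formed=vocabulary,
    subjects={"Bunzilla": "something else entirely"},
)
```

Here the same name "Bunzilla" is an identifier in both processes but
identifies different subjects, so interoperable(bunnies, elsewhere,
["Bunzilla"]) is false until both sides agree on what the shared name
denotes.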

Very simple, after all. That's what we do every day ...

Well, this post has grown longer than intended, sorry about that.

Bernard

Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com
bernard.vatant@mondeca.com