[topicmapmail] Re: Document Object Identifiers/CrossRef

Daniel Rivers-Moore Daniel.Rivers-Moore@rivcom.com
Mon, 9 Sep 2002 13:26:45 +0100


Eliot

If I've stimulated you to convince yourself that URLs are not enough,
that's an achievement I think I can be proud of! I well remember the
discussion with you some time back in which *you* convinced *me* that a
locator is just a name in another form - but I was still unconvinced
that the consequence of this was that you only needed the locator. The
point is that "a name in another form" and "a name" are *not* the same.
The name and its form are both significant - if we need to separate
'form' from 'content' it is not because form is unimportant. It is
rather because form and content are *both* important, in different ways,
and it is vital that they not be confused with one another. Similarly,
the *intent* behind a name (a string intended to identify a thing) and a
locator (a string intended to identify another thing with a special
relationship - 'location' - to the first thing) are not the same. You
very wisely observe that this difference in *intent* gives rise to
differences in *expectations*, and that is crucial.

One other point - You talk about what happens when URLs change because
the owner of a resource rebrands and so wants the resource to be
referenced via a new URL. You say "it can be very difficult to find that
resource's new location".

More than "difficult", I'd say. Rather "impossible in the general case".
What I mean by this is that there is *no algorithm* for finding it -
even if everybody involved obeys all the rules and all the norms of good
practice.

You have clearly spent some time studying the DOI material, so perhaps
you know the answer to the following question: Does the DOI paradigm
provide an algorithm for finding something after ownership of it has
changed? And a related question: Does the DOI paradigm provide norms of
good practice for owners of resources to follow when the ownership of a
resource changes hands? In other words, is there a specified way of
leaving a 'forwarding address' when you pass ownership of a resource to
someone else?

Thanks in advance

Daniel

-----Original Message-----
From: W. Eliot Kimber [mailto:eliot@isogen.com]
Sent: 07 September 2002 14:11
To: topicmapmail
Subject: Re: [topicmapmail] Re: Document Object Identifiers/CrossRef


Daniel Rivers-Moore wrote:

> Does anyone on this list have a sense of whether there is a strong lobby
> to get W3C to open its attitude in this regard? I am RivCom's W3C rep,
> but don't have the bandwidth to follow all the issues, so have not
> caught up on this one for some time. Certainly in the old days there was
> a kind of religious war over whether the URN had any intrinsic value
> over the URL. Not whether URN was 'better' or 'worse' than URL, but on
> whether 'URL alone' was better than 'URL and URN, each used
> appropriately'.

The last time I discussed this with Dan C. (about a year ago, I think)
he re-iterated Tim B-L's argument made in a paper on the W3C site that
there's no useful difference between URLs and URNs--they're both just
magic strings. One key aspect of the argument was/is that the
indirection provided by URNs can be provided today by any Web server and
that, in any case, it is always the responsibility of the manager of the
resource to maintain the ability to resolve it, which either means
maintaining something like the mappings provided for by the DOI/CrossRef
infrastructure and PURL systems or maintaining a redirect mapping on
your local server.

I keep going back and forth in my own mind on this issue--I was raised
on public IDs and believed they were the answer for a long time. Then I
helped make XML and decided that public IDs were bogus--that system IDs,
especially in the context of a Web-type resolution infrastructure, were
all you needed (by the "do it on the server" argument above).

But in thinking about it more since then, I think that, while Tim is
technically correct, the real difference between URNs and
URLs (that is, between names and locators) is one of expectation: if you
see a URN (a name) you expect two things: that you will have to go
through some sort of indirection mechanism to resolve it and that it
will probably resolve to the "correct" thing whenever you do choose to
resolve it, whereas with a URL you expect that you can resolve it
directly but you also aren't surprised if it fails to resolve because
"it's just a locator".

[Side note: I think what's bogus about public IDs is not that SGML and
XML provide both a name and a locator but that you can use *both* in a
single reference--that's the bogus part. There should be a single
external reference, where the syntax of the reference lets you
distinguish names from locators--that is, URIs got it right.]

From the standpoint of publishers, I think there is value in having a
name-based addressing mechanism that matches both the requirement and
expectation that "persistent" names are being used--if I'm publishing a
scholarly work that I want to be findable and usable (through any
references it makes) 5, 10, or 100 years out, I want some assurance that
the names I'm using will resolve appropriately in the future. If this
resolution is being managed by a disinterested 3rd party (e.g.,
CrossRef) I'll probably have greater confidence than if it's being
managed by the enterprise that happens to be serving the named things at
the moment (if for no other reason than that my experience with the Web
suggests that Web sites tend to be poorly managed over time, eroding my
confidence in Web sites generally).

What I think this comes down to is that by exposing and centralizing the
indirection mechanism (CrossRef instead of a bunch of redirect tables on
a bunch of Web servers) it provides a single point of contact for both
creating and maintaining the indirections by resource managers and
finding and verifying references by resource users.
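
The centralized indirection described above can be illustrated with a
minimal sketch. This is not how CrossRef or any real resolver is
implemented; the class name, the identifier, and the example URLs are
all hypothetical, chosen only to show that references keep using one
stable name while the resource manager updates its current location in
a single place.

```python
# Hypothetical sketch of a central name-to-locator registry.
# Nothing here reflects a real resolver's API; all names are assumptions.

class NameRegistry:
    """Maps persistent names to the current location of each resource."""

    def __init__(self):
        self._locations = {}

    def register(self, name, url):
        # Done once by the resource manager when the name is created.
        self._locations[name] = url

    def move(self, name, new_url):
        # Done by the resource manager whenever the resource moves,
        # for example after a rebranding or a change of ownership.
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = new_url

    def resolve(self, name):
        # Done by resource users; the name they hold never changes.
        return self._locations[name]


registry = NameRegistry()
registry.register("10.9999/example-article",
                  "http://old-brand.example/articles/42")

# The resource moves to a new domain; only the registry entry changes.
registry.move("10.9999/example-article",
              "http://new-brand.example/library/42")

resolved = registry.resolve("10.9999/example-article")
print(resolved)
```

With per-server redirect tables, the equivalent of `move` would have to
be recorded on the old server, which is exactly the server that may no
longer exist once the resource has moved.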

I think part of the problem with the URL-only argument is that URLs are
unavoidably bound to particular domain names, which are too bound to
brand identity (even though the URL spec says that URLs are to be
opaque). Thus, the temptation to move resources from one domain name to
another as brands evolve or ownership changes is too great. When a
resource moves from one server to another with a different name it can
be very difficult to find that resource's new location. It's unlikely,
especially when ownership changes, that the old server would be
maintained in order to provide redirections to the new server.

Interestingly, the DOI syntax avoids this problem to some degree by
making the owner identifiers opaque as well--very clever I think, and
possibly key to making DOIs truly persistent. That is, the DOI
recognizes that the "owner" of the name is not necessarily the owner of
the intellectual property named, just the manager of the name itself
(that is, the manager of the name's mapping to addressable resources).
There's no need for brand identification of name managers--that's just
plumbing, after all. (But note that the DOI mechanism can't restrict
what you use for resource identifiers in a DOI, so there's still room
for brand identifiers in DOIs, for example, if you use an existing URL
as the resource ID part of a DOI.)
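
As a rough illustration of the point about opaque owner identifiers, the
sketch below splits a DOI into its two parts, assuming the usual
"prefix/suffix" form in which the prefix starts with "10." and the
suffix is everything after the first slash. The function name and the
prefix "10.9999" are made up for the example, and the check is far less
strict than a real validator would be.

```python
# Hypothetical helper: split a DOI into its opaque prefix (the name
# manager) and its suffix (the resource ID chosen by that manager).

def split_doi(doi):
    # Only the first slash separates prefix from suffix, so a suffix
    # may itself contain slashes (for example, an embedded URL).
    prefix, sep, suffix = doi.partition("/")
    if not sep or not suffix or not prefix.startswith("10."):
        raise ValueError("not in prefix/suffix form: %r" % doi)
    return prefix, suffix


# The prefix carries no brand; the suffix can still leak one if an
# existing URL is reused as the resource ID.
opaque = split_doi("10.9999/a1b2c3")
branded = split_doi("10.9999/http://www.old-brand.example/paper.html")
print(opaque)
print(branded)
```

In the second case the prefix is still just plumbing, but the brand
survives inside the suffix, which is the loophole noted above.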

Therefore, I think that the "do it all with URLs" argument is naive--it
ignores human nature. A centralized indirection mechanism at least
lowers the long-term cost of maintaining the resolvability of names.
Once the setup cost has been spent, there's no reason not to use the
indirection unless the ongoing cost is prohibitive. Ideally this type of
resource would be a public utility as DNS is today.

OK, I've convinced myself that URLs are not enough.

Cheers,

E.
-- 
W. Eliot Kimber, eliot@isogen.com
Consultant, ISOGEN International

1016 La Posada Dr., Suite 240
Austin, TX  78752 Phone: 512.656.4139
_______________________________________________
topicmapmail mailing list
topicmapmail@infoloom.com
http://www.infoloom.com/mailman/listinfo/topicmapmail