[topicmapmail] PSIs - alternatives
Simon Grant
asimong at btinternet.com
Fri Jun 23 05:41:00 EDT 2006
At 08:44 2006-06-23, Steve Pepper wrote:
>My main point is that we should be extremely careful about keeping agreement
>concerning the *identity* of a subject separate from agreement concerning
>*opinions* about a subject. We should therefore avoid overloading the PRD
>with assertions, especially machine-processable assertions.
OK.
>However, assertions about the *PRD* (as opposed to assertions about the
>subject it describes) are another matter. I have no problem with either
>human-readable or machine-processable metadata about the PRD, such as its
>publisher, type, date, version, etc. Such metadata can play an important
>role in your 2. ("whatever else may be necessary towards a self-sustaining
>infrastructure and motivation to use it").
>
>Whether references to other PRIs deemed to be equivalent belong here is
>something that needs to be discussed more fully.
(I will use "PSI" for continuity with previous discussions and
terminology, in place of "PRI")
Agreed. To set this in the context of the above position, references
to other PSIs would be limited to assertions of the equivalence, or
non-equivalence, of other PSIs and their related PRDs.
>How would they be used? Certainly not when actually merging topic maps,
>because the whole point of the PRI/PRD mechanism is that computers do not
>need to dereference the PRI in order to ascertain if two things are the
>same: they simply compare strings.
>
>I guess there could be a use for harvesters that go round collecting such
>equivalences in order to build mapping tables, but could the resulting
>tables really be trusted? The minter of PRI "X" might claim that it is
>equivalent to somebody else's PRI "Y", but who's to say whether the minter
>of PRI "Y" agrees with that assertion?
Easily arranged. If there is a reciprocal link, then it can be
trusted (to the extent that anything can be trusted - not
absolutely). You can claim as much as you like, hopefully, that your
PSI is equivalent to my PSI, but what counts is whether I am prepared
to recognise that by including the reverse equivalence in my PRD. A
stronger approach than leaving a PSI out would be to explicitly mark
one's own PSI as different from another PSI.
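To make the reciprocal-link idea concrete, here is a minimal sketch in Python. It assumes each PRD has already been fetched and parsed into a small record; the PRD class, its field names, the status labels and the example.org PSIs are all hypothetical illustrations, not part of any agreed PRD format.

```python
from dataclasses import dataclass, field


@dataclass
class PRD:
    """Hypothetical in-memory view of one PRD (not a published format)."""
    psi: str                                       # the PSI this PRD describes
    equivalent: set = field(default_factory=set)   # PSIs it accepts as equivalent
    different: set = field(default_factory=set)    # PSIs it explicitly rejects


def equivalence_status(a: PRD, b: PRD) -> str:
    """Classify how far two PRDs agree that their PSIs are equivalent."""
    # An explicit "different" mark on either side overrides any claim.
    if b.psi in a.different or a.psi in b.different:
        return "different"
    a_claims = b.psi in a.equivalent
    b_claims = a.psi in b.equivalent
    if a_claims and b_claims:
        return "reciprocal"   # both minters recognise the equivalence
    if a_claims or b_claims:
        return "one-sided"    # only one minter claims it
    return "unknown"


# Hypothetical PSIs, for illustration only.
X = "http://example.org/psi/x"
Y = "http://example.org/psi/y"
mine = PRD(X, equivalent={X, Y})   # I claim that Y is equivalent to X
yours = PRD(Y, equivalent={Y})     # you have not (yet) reciprocated
```

With these records the claim is only one-sided; it would become reciprocal once the second PRD also lists X. Letting an explicit "different" mark win over a claim is a design choice of this sketch, not something settled in the discussion.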
>At the very least there are issues here that need to be thought through more
>carefully before designing a complete solution. My initial goal with the PRI
>initiative is to go for a less complete but more immediately adoptable
>solution and then see how things develop.
Adoptable is not necessarily identical to effective. I suppose we
want an optimal balance between ease of adoptability and
effectiveness for things people want to do. I'd say this is where
traction comes from.
>| My suggestion addresses both the situations where people won't do
>| the merging, and the situation where alternatives exist prior to
>| merging. Having a list of accepted equivalent PRIs included means
>| that the machine comparison is still relatively easy - it would
>| involve fetching the two PRDs and then string comparison of two
>| lists against each other. That's all.
>
>I certainly think there is a place for services along these lines. I've
>already created one myself: the mapping between ISO 3166 and CIA country
>codes. All I'm questioning is whether it makes sense to include the ability
>to specify mappings in the PRD itself. I want PRDs to be as simple as
>possible, and I want to encourage reuse rather than a free-for-all. That's
>why I suggested a standard way to deprecate one's own PRI in favour of
>someone else's: We would get the mapping you want while at the same time
>making it clear which is the preferred PRI.
Well, deprecation is easy in terms of the format I suggest. Imagine
first a three-part PRD:
1. metadata about the PRD
2. human-oriented description
3. list of equivalent PSIs
The convention could be that a PSI includes itself in the list of
equivalent PSIs to indicate that it is to be regarded as current.
Leaving one's own PSI out of the list would indicate self-deprecation:
"don't use me, use one of these other ones".
Straightforward superseding would be by placing exactly one PSI, not
one's own, in the list of equivalents.
I still think one extra part would be optimal:
4. list of PSIs that are explicitly marked as different, if they
were, for any reason, likely to be mistaken as candidates for
equivalency. It would be like saying "I've thought about those ones
and, no, they aren't the same thing".
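The self-inclusion convention above can be read off mechanically from the list of equivalents. The following Python sketch shows one possible reading; the function name and the status labels are hypothetical, and the handling of an empty list is an assumption, since that case is not covered by the convention as stated.

```python
def psi_status(own_psi, equivalents):
    """Interpret a PRD's list of equivalent PSIs under the proposed convention.

    own_psi     -- the PSI that the PRD describes
    equivalents -- the PSIs listed as equivalent in that PRD
    """
    listed = set(equivalents)
    if own_psi in listed:
        return "current"       # the PSI lists itself: regard it as current
    if len(listed) == 1:
        return "superseded"    # exactly one other PSI: use that one instead
    if listed:
        return "deprecated"    # several others: "use one of these other ones"
    # Assumption: an empty list is left undecided rather than guessed at.
    return "unspecified"
```

For example, psi_status("A", ["A", "B"]) gives "current", while psi_status("A", ["B"]) gives "superseded".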
I also like the idea of "harvesters" crawling the web of connections
to find reliable sets of equivalent PSIs, as well as noting
differences. This could be part of a PSI management tool. It would be
good to collaborate in designing such a tool: checking whether PSIs
marked as equivalent had changed their PRDs; finding extra PSIs
that had been added to another PRD, so that they can be presented for
a human decision on whether to add them to one's own set; etc.
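One small piece of such a tool might look like the Python sketch below. It assumes the harvester has already fetched the relevant PRDs into a dictionary keyed by PSI, each entry holding an "equivalent" and a "different" list; that layout and the function name are hypothetical, and no fetching or parsing is shown.

```python
def review_candidates(own_psi, prds):
    """List PSIs that my accepted equivalents mention but that I have not
    yet classified, so that a human can decide whether to add them.

    prds -- dict mapping each harvested PSI to a dict with the keys
            "equivalent" and "different" (hypothetical layout)
    """
    own = prds[own_psi]
    # Everything I have already decided about, including my own PSI.
    known = set(own["equivalent"]) | set(own["different"]) | {own_psi}
    candidates = set()
    for peer in own["equivalent"]:
        peer_prd = prds.get(peer)
        if peer_prd is None:
            continue  # PRD not harvested (or not retrievable): skip it
        candidates.update(set(peer_prd["equivalent"]) - known)
    return sorted(candidates)


# Hypothetical harvested data: B has added C and D since I last looked.
harvested = {
    "A": {"equivalent": ["A", "B"], "different": ["D"]},
    "B": {"equivalent": ["B", "A", "C", "D"], "different": []},
}
```

Here review_candidates("A", harvested) returns ["C"]: D is not offered again because A has already marked it as different. The sketch only proposes candidates; accepting them stays a human decision.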
Simon