[topicmapmail] Subject Identifiers metadata

Thomas B. Passin tpassin@comcast.net
Sat, 01 May 2004 11:38:43 -0400


Dan Corwin wrote:

>> Can you give an example of the need to reify the topic?
> 
> 
> Sure.  I can imagine whole classes of topic maps that try to describe
> good or bad modeling techniques - just as this thread does on a "best
> practices" issue.  And which therefore would need to reify specific
> topics as examples, then add characteristics saying that they were
> good or bad illustrations of technique, and explaining why.  Any
> "basic training manual" topic map on the Topic Map paradigm might
> need similar examples of topics.
> 
> Tom suggests PSIs are an answer; for the above they might work.  But
> I think "versioning" or DC-type metadata would be useful for topics
> in any ontology, or in any dynamic TM being built collaboratively by
> a group (such as in a wiki).  Characteristics assigned by reviewers,
> Q/A staff, etc. might also be attached, not to model the subject of
> the topic, but how well it was built.  And here, PSIs seem unlikely.
> 

I think the underlying ideas here have not been worked out yet, and that 
is getting in the way here.  I know that I have not yet thought them 
through in much detail.

I don't agree about "PSIs seem unlikely" in this connection, though.  If 
you make use of an occurrence type, a PSI is the standard way to 
describe what that type is supposed to denote.  Or if you don't want to 
publish them, nothing changes except that they are now more private, 
"Unpublished Subject Indicators", so to speak, but the mechanism is the 
same.

> Nearer to home, suppose I want a servlet to map any English grammar
> form into a topic modeling its referent.  Textual ambiguities cause
> multiple topics to emerge as competing meanings.  So I would like
> to reify each of these "meaning" topics in XTM output, then tag on
> metadata to model its source, plausibility, competitors, etc.
> 

The best approach, IMO, is to first convert the natural language 
constructs into Conceptual Graphs.  Conceptual Graphs are ideal for 
this purpose: they are easy to comprehend and were specifically 
designed to translate natural language statements into a formal logic. 
I consider Topic Maps to be essentially a subset of Conceptual Graphs 
(with a few additional wrinkles).  Some CGs can be expressed as TMs and 
some cannot.  The ones that can be so expressed are nearly identical 
except for some syntax details.


> Philosophically, not being able to cite topics as normal subjects
> is very much like blocking all public discussion of an idea or any
> other type of symbol (as distinct from whatever it might signify).
> 
> Unless such discussion becomes possible and reliable, Topic Map
> mutation tools that need to add topic metadata all get crippled,
> and limited in how introspective they can become.
> 

However, I think that you will end up wanting to talk about more than 
just the topics in a topic map.  The most interesting things to talk 
about, aside from the mundane facts about who created this 
characteristic when, are associations and the topics playing roles in 
them.  That is to say, subgraphs.

The RDF folks are wrestling right now with how to make statements about 
subgraphs.  In a CG, you can draw a box around a collection of 
conceptual relations and their topics.  The box is an assertion 
(anything placed on the page in a CG is asserted by definition), and it is 
called a "context box".  We need something equivalent for topic maps, 
and you will want it for the kind of things you seem to be getting into.

I suggest that the closest equivalent in TM would be a collection of 
associations, which would be understood to include all the attached 
topics.  For this to be possible, each association would have to have 
its own identifier, which is legal but not often done (TM4JScript does 
assign IDs to associations if they don't have their own, though).

One could scope these concept collections, too.
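
To make the idea concrete, here is a minimal sketch in Python, not tied 
to any real topic map engine.  The class names (Association, Subgraph), 
the helper functions, the "assoc-" id prefix and the sample data are all 
made up for illustration.  It only shows that once every association has 
an id, a "context box" can be a list of association ids, with the 
attached topics implied by the role players, and that the collection 
itself can carry a scope and annotations.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Association:
    type: str
    roles: dict                 # role name -> id of the topic playing it
    id: Optional[str] = None    # associations often have no id of their own


@dataclass
class Subgraph:
    """A collection of associations, loosely analogous to a CG context box."""
    id: str
    association_ids: list
    scope: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)


def ensure_ids(associations, prefix="assoc-"):
    """Assign a generated id to every association that lacks one."""
    used = {a.id for a in associations if a.id is not None}
    counter = count(1)
    for a in associations:
        if a.id is None:
            candidate = f"{prefix}{next(counter)}"
            while candidate in used:        # never reuse an existing id
                candidate = f"{prefix}{next(counter)}"
            a.id = candidate
            used.add(candidate)
    return associations


def topics_of(subgraph, associations):
    """The attached topics: every role player of a member association."""
    by_id = {a.id: a for a in associations}
    return {
        player
        for assoc_id in subgraph.association_ids
        for player in by_id[assoc_id].roles.values()
    }


# Hypothetical sample data.
associations = ensure_ids([
    Association("born-in", {"person": "puccini", "place": "lucca"}),
    Association("composed-by", {"work": "tosca", "composer": "puccini"},
                id="a-tosca"),
])

example = Subgraph(
    id="sg-1",
    association_ids=[a.id for a in associations],
    scope=["tutorial"],
    annotations={"comment": "illustrates typed roles"},
)

print(sorted(topics_of(example, associations)))
```

The annotations hang on the collection rather than on any one topic, 
which is the point: the thing being talked about is the subgraph.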

> In a modeling paradigm like TMs, where sophisticated tools can
> help out, the alternative seems to be a big ontological glitch.
> I hope it gets removed soon from TM engines and future specs.
> 

As I said before, how to do it depends on what is to be modeled.  Here 
are some possibilities, and they are not all equivalent -

1) Annotate specific topics and characteristics.  This would include 
versioning, etc.

2) Annotate a stereotype of a topic construction, to show what kind of 
characteristics it should or should not have for some purpose, or to 
indicate "good" or "bad" design.

3) Annotate specific associations.

4) Annotate a stereotype of an association.

5) Annotate a subgraph.

6) Present a subgraph as an example of a pattern, where the subgraph 
does not otherwise exist in the topic map in question.  That is, it has 
been invented for purposes of illustration.

7) Track the evolution of a design over time.  I think this would again 
have to be talking about a subgraph.

8) Investigate how certain inferences would be changed if certain role 
instances were (temporarily) removed or deactivated or added.

>> The note [in TMDM specs] says that you cannot reify a topic. You
>> can, however, reify any characteristic of the topic.
> 

If it really says that, I think it is an error and I propose to ignore 
this bit - not that I have yet had occasion to reify a topic.  I say 
that if it has an id, it can be reified, if only to allow for the kind 
of things we have been talking about in this thread.

> Characteristics normally have no conflicting "identity" issues in
> their own right.  Topics do; and so do your subject identifiers.
> And that is precisely why I linked up these issues in my reply.
> 

Again, if you think of a particular subject identifier as a notion in 
its own right, then you can create a topic to represent (the idea of) 
that identifier.  The type of the topic specifies that it is intended to 
represent a subject identifier, the value of the subject identifier 
could become the subject indicator or be captured with an occurrence, 
and if you want to share the definition of this new topic type, publish a 
PSI for it.  Doing all this is not the same as reifying a topic, 
although not too much different.
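
Roughly, and only as a sketch: the Python below models a topic whose 
type says "this represents a subject identifier", and stores the 
identifier's value either as an occurrence or as the subject indicator. 
The Topic class, the function name, the "value" occurrence key and the 
example.org URIs are all assumptions for illustration; none of them is 
a published PSI or a real engine API.

```python
from dataclasses import dataclass, field

# Hypothetical PSI for the topic type "subject identifier".
SUBJECT_IDENTIFIER_PSI = "http://example.org/psi/subject-identifier"


@dataclass
class Topic:
    id: str
    type_ids: list = field(default_factory=list)
    subject_indicators: list = field(default_factory=list)
    occurrences: dict = field(default_factory=dict)


# The type topic.  Publishing its PSI is what lets others share the definition.
identifier_type = Topic(
    id="subject-identifier",
    subject_indicators=[SUBJECT_IDENTIFIER_PSI],
)


def topic_about_identifier(topic_id, identifier_uri, as_indicator=False):
    """Create a topic that represents (the idea of) a subject identifier."""
    topic = Topic(id=topic_id, type_ids=[identifier_type.id])
    if as_indicator:
        # Option 1: the identifier's value becomes the subject indicator.
        topic.subject_indicators.append(identifier_uri)
    else:
        # Option 2: the identifier's value is captured with an occurrence.
        topic.occurrences["value"] = identifier_uri
    return topic


by_occurrence = topic_about_identifier(
    "id-for-puccini", "http://example.org/psi/puccini")
by_indicator = topic_about_identifier(
    "id-for-tosca", "http://example.org/psi/tosca", as_indicator=True)
```

Either way the new topic is about the identifier, not about whatever 
topic happens to use that identifier.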

> In both cases, the identity of some *symbol*, versus that of whatever it
> symbolizes, is the true origin of the conflict.  The TMDM note appears
> to deliberately recognize only the second of these options.  And in its
> definition of <topicRef>, so does XTM 1.0.  Why block the first option?
> 

Make sure not to mix up the idea of reifying a subject indicator with 
the idea of referring to a specific topic.  Also, in XTM 1.0, 
subjectIndicatorRef would seem to be the thing to use rather than 
topicRef, which you rightly point out is not suitable for reifying a 
topic.
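
As a sketch of what I mean, the Python below uses the standard library's 
xml.etree.ElementTree to build a topic whose <subjectIndicatorRef> 
points at the id of another construct in the same document.  The element 
names and namespace URIs are those of XTM 1.0 as I understand them; the 
function name and the ids "meaning-1" and "meaning-1-notes" are 
hypothetical.  Whether a processor would treat this as reifying the 
target topic is exactly the open question in this thread, so take it as 
an illustration of the syntax only.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# Serialize XTM elements without a prefix and xlink attributes as xlink:href.
ET.register_namespace("", XTM)
ET.register_namespace("xlink", XLINK)


def topic_pointing_at(topic_id, target_id):
    """Build a <topic> whose subject indicator is a construct in the same file."""
    topic = ET.Element(f"{{{XTM}}}topic", {"id": topic_id})
    identity = ET.SubElement(topic, f"{{{XTM}}}subjectIdentity")
    ET.SubElement(
        identity,
        f"{{{XTM}}}subjectIndicatorRef",
        {f"{{{XLINK}}}href": f"#{target_id}"},
    )
    return topic


# "meaning-1" stands for the id of the topic (or association) to talk about.
notes_topic = topic_pointing_at("meaning-1-notes", "meaning-1")
print(ET.tostring(notes_topic, encoding="unicode"))
```

Metadata such as source or plausibility would then be attached to the 
"meaning-1-notes" topic as ordinary occurrences.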

> My concern here is really the same one several folks on this list have
> raised about ambiguity within RDF symbols: do they identify a resource
> (like a topic), or something else which that resource symbolizes?
> 
> TM's have *externally* solved such conflicts (with web-wide gloating)
> by adjusting the syntax of XTM to handle both cases for a URI:
> 
>   <resourceRef>
>   <subjectIndicatorRef>
> 
> I would hope TMDM can now spec analogous distinctions between the two
> *internal* cases.  Implementation-wise, I'd guess it really takes only a
> new boolean feature (or bit) to discriminate saved "reifier" pointers,
> plus some additional syntax at the XTM level to let it be persisted.
> 

Now that I am starting to have some more time, I really have to start 
getting familiar with TMDM.


Cheers,

Tom P