[topicmapmail] Topic Map Links into XML
Murray Altheim
m.altheim@open.ac.uk
Fri, 31 Jan 2003 15:01:00 +0000
Alexander Johannesen wrote:
> "Pascal Reusch" <reuschp@web.de> wrote:
>
>>Isn't there any other way to get around the problem without
>>manipulating the source XML File?
>
> As with most things, that depends. In this case, it depends on the
> source XML and whether it contains specific structure to identify what
> you want to address, and how far you want to stretch this in terms
> of a) what technologies are involved, and b) the complexity of doing it.
Certainly, though given the ubiquity of the xml.apache.org code,
these issues are receding more and more into the background, i.e.,
once you have access to the document, parsing it, analysing it, and
manipulating or extracting content from it are less of an issue.
There is one thing I don't think has been mentioned that I'm sure
Steve Newcomb would if he were reading this: an XTM document is not
a topic map, it's a serialization of a topic map. An individual
<topic> element does not represent a topic, it represents a
serialization of one. In order to correctly process a topic, it
must be the result of correct topic map processing: merging
behaviours must be taken into account, etc. All this is still
being worked out by ISO -- the XTM spec is only (sadly) a temporary
document as regards processing. [I hope the new version isn't
impossible to implement.]
Kal's TM4J is an ambitious project to provide a Java-based framework
for that kind of processing. Unless the XTM document is a representation
of a *consistent* topic map (i.e., all merging and processing as per the
XTM spec has been completed), you can't directly manipulate the
document using the XML serialization, e.g., with the DOM or SAX.
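
To make the serialization-versus-topic distinction concrete, here is a
minimal sketch in Python. It is not TM4J and not a conforming XTM
processor: it applies only one simplified rule (topics sharing a subject
indicator are merged) and ignores name-based merging, scopes, and
everything else the spec requires. The document, the example.org URL,
and the function name merge_by_subject_indicator are all made up for
illustration.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical serialization: t1 and t2 share one subject indicator.
DOC = f"""
<topicMap xmlns="{XTM}" xmlns:xlink="{XLINK}">
  <topic id="t1">
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://example.org/psi/xlink"/>
    </subjectIdentity>
    <baseName><baseNameString>XLink</baseNameString></baseName>
  </topic>
  <topic id="t2">
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://example.org/psi/xlink"/>
    </subjectIdentity>
    <baseName><baseNameString>XML Linking Language</baseNameString></baseName>
  </topic>
  <topic id="t3">
    <baseName><baseNameString>XPath</baseNameString></baseName>
  </topic>
</topicMap>
"""


def merge_by_subject_indicator(xml_text):
    """Group <topic> elements into merged topics (one simplified rule only)."""
    root = ET.fromstring(xml_text)
    merged = {}
    for topic in root.findall(f"{{{XTM}}}topic"):
        refs = topic.findall(
            f"{{{XTM}}}subjectIdentity/{{{XTM}}}subjectIndicatorRef"
        )
        indicators = [ref.get(f"{{{XLINK}}}href") for ref in refs]
        # Topics without a subject indicator stay separate, keyed by element id.
        key = indicators[0] if indicators else "#" + topic.get("id")
        entry = merged.setdefault(key, {"ids": [], "names": []})
        entry["ids"].append(topic.get("id"))
        for name in topic.iter(f"{{{XTM}}}baseNameString"):
            entry["names"].append(name.text)
    return merged


topics = merge_by_subject_indicator(DOC)
for key, entry in topics.items():
    print(key, entry["ids"], entry["names"])
```

Three <topic> elements go in, two topics come out, and the merged one
carries both base names. Code that walks the DOM and treats each <topic>
element as a topic would miss that.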
> You could be tempted to use the xlink:role for an XPath expression
> in combination with the xlink:href, but XTM doesn't support this
> subset of XLink, and XLink didn't really have this use in mind,
> but it is a CDATA field with that *possibility*.
It's not quite as simple as that. I was the one who did the initial
design of the XLink part of XTM, and I read very carefully over the
XLink spec and had a number of discussions, both in person and in
email with various people on the W3C linking WG, including Eve Maler.
The problem is that in XLink, linking structures can't be deep: the
linking components in XLink can only be direct children of their
parent link, not deeper descendants. There was no way to use
an extended linking structure in XTM unless XTM had no significant
linking structures deeper than direct parent-child elements, and so
we in the end were forced to use simple XLinks, which means we lost
about 90% of the cool stuff in XLink. For more info, see
http://www.w3.org/TR/xlink/#extended-link
http://www.w3.org/TR/xlink/#simple-links
The key phrase here is "If a simple-type element contains nested
XLink elements, such contained elements have no XLink-specified
relationship to the parent link". In a nutshell, we had a choice:
make up our own linking syntax, use the XLink simple link, or
imply relationships in a more complex (extended) XLink structure
that weren't supported by the XLink spec. I would have liked to
have the entire XTM <topic> structure be some sort of extended
linking structure, but as I mentioned above, deep structures
aren't supported.
I would have really liked to have included xlink:role within XTM,
but unfortunately according to the XLink spec it would have had no
definition within XLink. We could have gone ahead and ignored the
XLink spec on this, and maybe we should have.
I consider this lack of design flexibility a failure in XLink,
but I won't bore you with the goofiness of W3C internal WG politics
and the difficulties in delivering the XLink spec; suffice it to
say it was a mess and took several years.
> You could, feeling brave, put simple XPath expressions in your
> resourceData (or pointers to them) and parse the source XML with an
> XSLT template using the resourceData XPath expressions to locate
> your data;
>
> XML:
>
> <root>
> <data>one</data>
> <data>two</data>
> <data>three</data>
> </root>
>
> XTM (simplified):
>
> <topic id="the_second">
> <topicRef xlink:href="source.xml" />
> <resourceData> /root/data[2] </resourceData>
> </topic>
That should probably be:
<resourceRef xlink:href="/root/data[2]" />
[not addressing other issues here]
> XSL (simplified):
>
> <xsl:template match="/">
> <xsl:variable name="go"
> select="document('topicmap.xtm')//topic[@id='the_second']/resourceData" />
> <xsl:value-of select="$go" /> <!-- Observe: not XSLT 1.0 -->
> </xsl:template>
>
> And parse the XSL with the XML as input.
>
> Now, mind you, this is in theory (and *very* untested; in fact, I know
> it wouldn't work out of the box), because XSLT 1.0 doesn't support
> dynamic XPath expressions, so you need to find yourself an extension
> (See http://www.exslt.org/ for more on this) in a parser that supports
> dynamic XPath. Sablotron, Xerces and XT (I think) all support this in
> some form. I'm sure most do, or you could wait for the XSLT 2.0 spec
> to be supported. (Patience needed, I'm afraid)
I don't think this would work if applied outside of a compliant
XTM processor, as you aren't guaranteed anything post-parse about
the lexical structure of the incoming document. You're operating
at the wrong level -- you need an API such as Kal has provided,
one that should be standardized.
But taking a less rigid approach, if you have a consistent topic map
and know enough about the processing environment to operate safely,
you could use XPath expressions within a <resourceRef> element and
have the XPath support in Xalan provide access to the DOM element(s)
of the target.
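
A rough sketch of that closed-system approach, with Python's
xml.etree.ElementTree swapped in for Xalan. ElementTree implements only
a subset of XPath and evaluates paths relative to an element, so the
helper below (select, a made-up name) checks the leading root step by
hand. Putting an XPath expression in xlink:href is the non-standard,
proprietary convention discussed here, not something XTM defines.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# The source document from the example quoted above.
SOURCE = "<root><data>one</data><data>two</data><data>three</data></root>"

# Non-standard: the href holds an XPath expression rather than a URI.
REF = f'<resourceRef xmlns:xlink="{XLINK}" xlink:href="/root/data[2]"/>'


def select(doc_root, path):
    """Evaluate a simple absolute path such as /root/data[2].

    Only handles paths of the form /rootname/relative-steps, where the
    relative steps fall within ElementTree's XPath subset.
    """
    steps = path.strip().lstrip("/").split("/", 1)
    if steps[0] != doc_root.tag:
        return []  # the path names a different document element
    if len(steps) == 1:
        return [doc_root]
    return doc_root.findall("./" + steps[1])


href = ET.fromstring(REF).get(f"{{{XLINK}}}href")
target = select(ET.fromstring(SOURCE), href)
print([element.text for element in target])  # prints ['two']
```

This only works because both sides agree in advance on what the href
means, which is exactly why it stays private to the closed system.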
I'd personally be pretty wary of using an XSLT extension, since
just assuming XPath support is a bit much already. But if you are
working in a closed system this kind of thing works; it's just
proprietary -- you couldn't share your topic maps with anyone
else.
> Anyways, this is all "writing while thinking", so apply the normal
> grain of salt for taste. :)
There's a lot of salt all around in a new field, so you might
want to get fries with that too.
Murray
......................................................................
Murray Altheim <http://kmi.open.ac.uk/people/murray/>
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK
"In Las Vegas Mr Gates also demonstrated a prototype
fridge magnet which can be programmed to receive traffic
reports, sports results and advertisements from local
restaurants using the same FM signal as the wristwatch."
-- The Guardian, 10 Jan 2003.