[topicmapmail] Fragmented XTM for web metadata, and some ontology?
Murray Altheim
m.altheim@open.ac.uk
Sun, 29 Jun 2003 17:55:45 +0100
Kal Ahmed wrote:
> On Sun, 2003-06-29 at 13:39, Murray Altheim wrote:
[...]
>>I understand that. But Kal, describe for me a reasonable approach to
>>allowing arbitrary XML in <resourceData> that doesn't completely screw
>>us in terms of interchange. Once you open that door, there's no closing
>>it. I just don't see how that would make any sense given that the first
>>document coming down the pike with unknown markup (or JavaScript code)
>>is just completely opaque to an application that can't process it, or
>>worse yet, transparent, i.e., the user doesn't even know what is missing.
>
> As I previously suggested, I don't see anything wrong with an
> XTM-compliant parser doing no more than validating the resourceData
> content as well-formed and making it available in the SAM as string
> data. The validation of the resourceData content can be done at a higher
> level. The XTM-compliant parser would be able to flag the string data as
> well-formed XML and then at application level you would either handle
> the XML or not.
The "or not" is the big problem here. Until you can solve the "or not"
problem, you're basically advocating the Microsoft approach: you can
read any document so long as it is a document understood by our software.
I don't want people opening up interchange documents and getting complete
blanks, missing words, misinterpretations, different user "experiences"
based on whether or not their application can grok the ugliness sent by
another application. That's just broken.
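
A minimal sketch of the proposal quoted above, in Python and purely for
illustration: the parser only checks that the resourceData content is
well-formed and hands it on as string data with a flag. The names
ResourceData and read_resource_data, and the dummy <wrapper> element used
to parse a fragment, are assumptions made here, not part of XTM or of any
XTM library.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class ResourceData:
    """Hypothetical holder for what the parser hands to the application."""
    text: str          # the raw content, exposed as string data
    well_formed: bool  # True if the content parsed as well-formed XML


def read_resource_data(raw: str) -> ResourceData:
    """Check well-formedness only; no validation against any DTD or schema."""
    try:
        # Wrap in a dummy root so mixed content or several sibling
        # elements still parse as a single document. Any namespace
        # prefixes must be declared inside the fragment itself.
        ET.fromstring(f"<wrapper>{raw}</wrapper>")
        ok = True
    except ET.ParseError:
        ok = False
    return ResourceData(text=raw, well_formed=ok)


good = read_resource_data("<p>A <em>short</em> note</p>")
bad = read_resource_data("<p>An unclosed note")
print(good.well_formed, bad.well_formed)
```

Everything past the flag is left to the application, which is where the
"or not" comes in: two applications given the same string may do entirely
different things with it.
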
>>I argued this same case to the W3C HTML Working Group for years, where we
>>had Dave Raggett and others advocating that we do away with having an HTML
>>DTD at all and just defining things in terms of "tag sets". Had this come
>>to pass, we'd never had an XHTML DTD at all, just some weird notion of a
>>well-formed "XHTML" document where one could intermix anything anyone
>>wanted anytime -- complete freedom, and completely useless to anyone
>>except monopolists who import and export their own brand of proprietary
>>markup muck (export "HTML" from MS Word to see what I mean). If you look
>>at the latest XHTML 2.0 draft [1] you'll see they're still trying to
>>figure out some way of specifying a language without using a schema.
>>[okay, I should have wrapped this in <rant>. I just don't want us to go
>>down that same road.]
>
> But I am not suggesting complete mix-n-match. I am suggesting one
> particular place in the XTM DTD where XML from other namespaces would be
> allowed. There is no question (in my mind) of changing the "XTM on top"
> approach of the DTD that says that the topicMap element should be the
> document element, nor is there any question of allowing other markup to
> appear anywhere else than inside resourceData. I know that other people
> have suggested such changes, but I leave it to them to defend that :)
But allowing arbitrary markup even within <resourceData> means that XTM
applications *all* need to unambiguously and consistently process whatever
that markup happens to be. So let's say, for sake of argument, we allow
a subset of XHTML markup to appear there. We're not talking things that
appear in HTML's <head>, we're not attaching CSS stylesheets, we're not
using <base> or <applet> or <object> or JavaScript or even <table>. Even
with a *really* constricted declared content, we'd still be requiring all
XTM applications to process it correctly, because for those processors
that didn't, things could go completely haywire. Content might go missing
or disappear from the screen, words might be stuck together because of
missing implied whitespace (because of now-ignored markup), and the user
might be relying on that markup for the content to make sense.
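
The whitespace point can be shown with a small Python sketch. It assumes
a hypothetical processor that does not understand the embedded markup and
simply drops the tags, keeping only the character data; strip_markup and
the sample fragment are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def strip_markup(fragment: str) -> str:
    """Ignore all element markup and keep only the text content."""
    # The dummy <wrapper> root lets a multi-element fragment parse.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    return "".join(root.itertext())


# Two paragraphs: an XHTML-aware renderer would put a break between them.
fragment = "<p>Topic maps</p><p>are interchangeable</p>"
print(strip_markup(fragment))  # Topic mapsare interchangeable
```

Nothing in the output tells the user that a paragraph boundary was there
and has been lost.
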
It's just an extremely slippery slope. I've thought quite a number of
times about creating a combo-DTD mixing XTM and XHTML, where XHTML is
the document element and XTM appears as a single block at the end.
There'd be a lot of interesting applications for something like this,
but it would ruin interchange of XTM if it actually became popular, as
suddenly we'd have accidentally upped the ante on what was required of
XTM applications, something I've assiduously tried to avoid. I've never
believed the idea that sending around well-formed XML was going to
catch on much, simply because it assumes an application able to
*correctly* process such documents, an enormous assumption, and hell,
the Web community can't even get Netscape or IE to work correctly and
reliably with HTML after many years of trying.
Murray
...........................................................................
Murray Altheim http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK .
"There's a lot of intelligence out there that you don't
know if it's true or not." -- Anonymous US official
http://news.bbc.co.uk/1/hi/world/middle_east/3014850.stm