[topicmapmail] Use and abuse of occurrence
Kal Ahmed
kal@techquila.com
02 Dec 2003 17:58:34 +0000
On Tue, 2003-12-02 at 17:44, Gary Pupurs wrote:
> Kal wrote:
> > I would also mention here that in my experience most developers won't
> > ever read the model document once we as a community have developed a
> > common programming interface (see http://www.tmapi.org/).
>
> Thanks for the link, I hadn't run across that before. What's the current
> implementation status with TMAPI? Has anyone added the API to their tools?
>
TM4J implements TMAPI. At least one commercial topic map engine company
has expressed interest in implementing it too.
> > That was one of the issues I wanted to address by promoting topic map
> > design patterns. I agree with you that most developers work by the "View
> > Source" method. I would just like there to be more source to view and
> > more descriptions of why the source is the way it is. There is an
> > education issue here. To my mind this is actually far more important
> > than any amount of discussion of properties/occurrences/facets etc.
> > Maybe someone (or many people) would like to contribute some topic map
> > meta data patterns to www.topicmapcentral.com :)
>
> I strongly agree with you. There are too few examples available done with
> quasi-real-world data, and most of the XTM maps that are public are too
> small of a dataset to really begin discovering the nuances of implementation
> problems. The simple examples also make it hard to identify any best
> practices that come into play when making hard modelling decisions about
> real world data. More examples of the design decisions behind the choices of
> association types and topic types for representing complex, but extensible
> data would be helpful.
>
Definitely.
> As far as I know (correct me if I am wrong), the largest topic map sample
> data available is Ontopia's opera.xtm, and that is only 526K, with 6402
> topics, 840 associations, and 361 occurrences. An excellent example, to be
> sure, but still fairly small in the grand scheme of datasets. I'm aware
> that several companies have implementations with thousands or even millions
> of topics, but most of these large topic maps appear to be proprietary at
> the moment.
>
A while ago I released a topic map generated from the UNSPSC (Universal
Standard Products and Services Classification). It's available from
http://www.techquila.com/tmsamples/xtm/unspsc/unspsc_11.zip; see
http://www.techquila.com/tm-samples.html for details on how it was
created (Python + XSLT).
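For anyone wanting to compare sample maps by size (topic, association
and occurrence counts like those quoted above), here is a minimal Python
sketch. It is not the script used to build the UNSPSC map. It assumes the
file is XTM 1.0 using the standard XTM namespace; the function name and
the tiny inline sample are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# XTM 1.0 namespace (assumption: the map being measured is XTM 1.0)
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"


def count_xtm_constructs(source):
    """Stream through an XTM 1.0 document and count its main constructs.

    `source` is a filename or a binary file-like object.
    """
    # Map fully qualified tag names to the short names we report.
    wanted = {
        "{%s}%s" % (XTM_NS, name): name
        for name in ("topic", "association", "occurrence")
    }
    counts = {name: 0 for name in wanted.values()}
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = wanted.get(elem.tag)
        if name is None:
            continue
        counts[name] += 1
        # Occurrences are nested inside topics and have already been
        # counted by the time the topic ends, so it is safe to drop the
        # children here and keep memory use low on large maps.
        if name in ("topic", "association"):
            elem.clear()
    return counts


# Hypothetical two-topic sample, only to show the function running.
SAMPLE = b"""<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="puccini">
    <baseName><baseNameString>Puccini</baseNameString></baseName>
    <occurrence>
      <resourceRef xlink:href="http://example.org/puccini"/>
    </occurrence>
  </topic>
  <topic id="tosca"/>
  <association>
    <member><topicRef xlink:href="#puccini"/></member>
    <member><topicRef xlink:href="#tosca"/></member>
  </association>
</topicMap>"""

if __name__ == "__main__":
    print(count_xtm_constructs(io.BytesIO(SAMPLE)))
```

To measure a real file, pass its path instead of the BytesIO object,
e.g. count_xtm_constructs("opera.xtm"). Streaming with iterparse rather
than loading the whole tree is a deliberate choice, since the larger
maps discussed in this thread may not fit comfortably in memory.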
> I'm starting to build some larger topic maps (300-600 MB of raw data) from
> publicly available datasets, just to have a realistic basis to begin
> exercising some of my own coding experiments and ideas and identify
> potential pitfalls, before attempting to transition our traditional database
> content to a TM-based app. (I'll release these if they work out, hopefully
> with some schema design decision documentation, if I follow my own demand
> for more examples!)
>
That would be great to see!
Cheers,
Kal
--
Kal Ahmed, Techquila
Standards-based Information Management
e: kal@techquila.com
w: www.techquila.com
p: +44 7968 529531