[topicmapmail] Use and abuse of occurrence
Jan Algermissen
algermissen@acm.org
Tue, 02 Dec 2003 18:59:59 +0100
Gary Pupurs wrote:
> There are too few examples available done with
> quasi-real-world data, and most of the XTM maps that are public are too
> small of a dataset to really begin discovering the nuances of implementation
> problems. The simple examples also make it hard to identify any best
> practices that come into play when making hard modelling decisions about
> real world data. More examples of the design decisions behind the choices of
> association types and topic types for representing complex, but extensible
> data would be helpful.
>
> As far as I know (correct me if I am wrong), the largest topic map sample
> data available is Ontopia's opera.xtm, and that is only 526K, with 6402
> topics, 840 associations, and 361 occurrences. An excellent example, to be
> sure, but still fairly small in the grand scheme of datasets.
Gary--
I have quite a large one at:
http://www.topicmapping.com/maps/cpan/master.xtm
Look at http://www.topicmapping.com/maps/cpan for
the various 'modules' and their sizes.
The map is probably not very interesting in terms of
association types, scopes, etc., but see for yourself.
There is an introduction to the map at
http://www.topicmapping.com/cpan/
but some of the material and links in there may be outdated.
> I'm starting to build some larger topic maps (300-600 MB of raw data) from
> publicly available datasets, just to have a realistic basis to begin
> exercising some of my own coding experiments and ideas and identify
> potential pitfalls, before attempting to transition our traditional database
> content to a TM-based app. (I'll release these if they work out, hopefully
> with some schema design decision documentation, if I follow my own demand
> for more examples!)
It is interesting to hear that you are aiming at substantial dataset sizes.
With regard to this thread, I am very curious what you think about modelling
properties as occurrences once you get to that point. Please let me know.
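To make "properties as occurrences" concrete, here is a minimal sketch,
assuming XTM 1.0 syntax, in which a property value is stored inline in an
occurrence's resourceData. The helper name, the topic id and the "file-size"
occurrence type are made up for illustration; they are not taken from the
CPAN map or from opera.xtm.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XTM 1.0 documents.
XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xtm", XTM)
ET.register_namespace("xlink", XLINK)


def topic_with_property(topic_id, name, prop_type, value):
    """Build a <topic> whose property is an inline occurrence (hypothetical helper)."""
    topic = ET.Element(f"{{{XTM}}}topic", {"id": topic_id})

    # The topic's name.
    base_name = ET.SubElement(topic, f"{{{XTM}}}baseName")
    ET.SubElement(base_name, f"{{{XTM}}}baseNameString").text = name

    # The property: an occurrence typed by another topic, with the value
    # held as inline resourceData rather than as a link to a resource.
    occurrence = ET.SubElement(topic, f"{{{XTM}}}occurrence")
    instance_of = ET.SubElement(occurrence, f"{{{XTM}}}instanceOf")
    ET.SubElement(
        instance_of,
        f"{{{XTM}}}topicRef",
        {f"{{{XLINK}}}href": "#" + prop_type},
    )
    ET.SubElement(occurrence, f"{{{XTM}}}resourceData").text = value
    return topic


# Illustrative data only: the file size quoted for opera.xtm in this thread.
topic = topic_with_property("opera-xtm", "opera.xtm", "file-size", "526K")
print(ET.tostring(topic, encoding="unicode"))
```

The alternative would be to make the value a topic of its own and connect it
with an association; the sketch shows only the occurrence variant.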
Any idea what the nature of your datasets will be?
Jan
>
> -g
--
Jan Algermissen http://www.topicmapping.com
Consultant & Programmer http://www.gooseworks.org