[topicmapmail] Generating TMs out of relational Databases / How To?

Bernard Vatant bernard.vatant@mondeca.com
Tue, 31 May 2005 11:24:38 +0200


Hello Andreas

I second Alexander, particularly when he writes

> The very first thing you need to sort out is your ontology,
> meaning "what are the things you want to talk about".

I would like to make explicit a point that has been underlying this debate so far. When you
want to migrate legacy data, whatever its original format, database or anything else,
into a TM, there are two basically different answers to "what are the things you want to
talk about".

Either you have a strictly data-centric viewpoint, which means the target TM will be a
description/transformation of the data you have, and not of the "things the data are about" -
whatever those are, you stay agnostic about them. From this data-centric viewpoint, you
can set some kind of automatic rules, like "A row is represented by a topic, a column by
an occurrence" and so on. You don't really think in terms of ontology in this case. It's
just a stylesheet you apply to your data, or, as Nikita would say, you look at your data
"with TM glasses". That might be useful in some cases, but I don't think it is in yours.
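
To make the data-centric rule concrete, here is a minimal sketch of "a row is a topic, a
column is an occurrence". It is only an illustration: the `person` table, its columns, the
`rows_to_topics` helper and the dictionary layout of a topic are all assumptions of mine,
not the API of any TM engine.

```python
import sqlite3


def rows_to_topics(conn, table):
    """Blind rule: one topic per row, one occurrence per non-NULL column."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    topics = []
    for i, row in enumerate(cur.fetchall(), start=1):
        topics.append({
            "id": f"{table}-{i}",   # hypothetical id scheme: table name + row position
            "type": table,          # the table name doubles as the topic type
            "occurrences": [
                {"type": col, "value": val}
                for col, val in zip(columns, row)
                if val is not None
            ],
        })
    return topics


# Hypothetical sample data, in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, address TEXT, phone TEXT)")
conn.execute("INSERT INTO person VALUES ('Ada', '1 Main St', NULL)")
topics = rows_to_topics(conn, "person")
```

Note that nothing here decides whether "name" should be a topic name, an occurrence or a
separate topic; the rule simply does not ask the question.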

Or - and this is certainly your case and the most frequent one - you consider your
database as an implicit description of "things in the world", "business objects", "domain
objects", whatever, and you want your TM to make explicit what those things are (types,
properties, relations) - this is ontology engineering, and, well, there is no silver bullet
for that. If those things have not been made explicit in the database model, there is
actually no way to "extract" them from the data structure; you have to figure them out by any
means: you keep one eye on the domain you want to represent, and the other on the data you
have, and at some point you begin to think about your target representation system (TM)
and also about technical implementation, interface requirements, the type of queries you want
to perform, and the like. So you will have to answer questions like: What is the point of
having associations vs occurrences? What is the point of having names vs occurrences? Do I
want to handle n-ary associations, or do I want to handle only binary ones? Is the
vocabulary used in the database to be kept, or do I want to use a more controlled one, or
a more user-friendly one, etc.? None of those questions has a unique answer, as
Alexander points out. It's engineering, and there are no "right" or "wrong" answers in
engineering, only solutions that are more or less efficient.
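
As an illustration of the "associations vs occurrences" question, here is a small sketch
that maps the same hypothetical `city` column in two ways. The `TopicMap` class, the
`lives-in` association type and the role names are assumptions made up for this example,
not part of any standard or engine.

```python
from dataclasses import dataclass, field


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)        # topic id -> {"type", "occurrences"}
    associations: list = field(default_factory=list)  # (association type, {role: topic id})

    def topic(self, tid, ttype):
        # Reuse the topic if the id already exists (a crude stand-in for merging).
        return self.topics.setdefault(tid, {"type": ttype, "occurrences": {}})


def city_as_occurrence(rows):
    """Option 1: the city is just a string attached to the person topic."""
    tm = TopicMap()
    for name, city in rows:
        tm.topic(name, "person")["occurrences"]["city"] = city
    return tm


def city_as_association(rows):
    """Option 2: the city is a topic of its own, linked by a binary association."""
    tm = TopicMap()
    for name, city in rows:
        tm.topic(name, "person")
        tm.topic(city, "city")
        tm.associations.append(("lives-in", {"resident": name, "place": city}))
    return tm


rows = [("Ada", "Paris"), ("Bob", "Paris")]
flat = city_as_occurrence(rows)     # 2 topics, no associations
linked = city_as_association(rows)  # 3 topics, 2 associations
```

Neither result is "right"; the second one only pays off if you actually want to navigate
or query from the city to the people.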

That is, unless you believe your TM will, at the end of the day, achieve a "realistic"
description of the "real" domain you deal with. In that case, be prepared for endless
religious wars about what is a "right" or "wrong" representation.

See e.g. this paper by Christopher Menzel, which I just discovered yesterday, about
"realistic" and "pragmatic" approaches to ontology.

"Reference Ontologies - Application Ontologies: Either/Or or Both/And?"
http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS//Vol-94/ki03rao_menzel.pdf

Hope that helps

Bernard


**********************************************************************************

Bernard Vatant
Senior Consultant
Knowledge Engineering
bernard.vatant@mondeca.com

"Making Sense of Content" :  http://www.mondeca.com
"Everything is a Subject" :  http://universimmedia.blogspot.com

**********************************************************************************

> -----Original Message-----
> From: topicmapmail-admin@infoloom.com
> [mailto:topicmapmail-admin@infoloom.com] On behalf of Alexander
> Johannesen
> Sent: Tuesday, 31 May 2005 02:44
> To: Andreas Fleck
> Cc: topicmapmail@infoloom.com
> Subject: Re: [topicmapmail] Generating TMs out of relational Databases /
> How To?
>
>
> Hi there,
>
> On 5/30/05, Andreas Fleck <AndreasFleck@gmx.net> wrote:
> > but how do I handle the columns? I've read a paper from Ontopia,
> > which says that columns should be mapped as occurrences.
>
> Well, it is "Yet Another Way Of Doing It" (TM). Within Topic Maps
> there is no one set way of doing these kinda things.
>
> > But I have a problem figuring out how to do this exactly. I mean, if I
> > had a column entitled "name" in a table called "address", it isn't
> > what I would map as an occurrence within a topic map, as "name" to me is a
> > separate topic, not an occurrence (which is mostly a website or picture
> > in the papers I've read through).
>
> Again, it depends. :) I'm sorry the answer to these things isn't "do
> X". The very first thing you need to sort out is your ontology,
> meaning "what are the things you want to talk about". Here's perhaps
> an example of such (basic types), so create a topic for each "type" ;
>
> * Person, * Tool, * Project
>
> and perhaps some occurrence types ;
>
> * Address, * Phone, etc
>
> Next up is defining the relationships in your map. You've got these in
> one of your tables, so for each of them, create a topic from them;
>
> * works-on, * belongs-to, * does-X, * performs-Y, * handles-Z ... and so on
>
> Now you can simply do;
>
> SELECT * FROM Person
>
> And with this result, do for every row depending on what each column is ;
>
> Topic "{$name}"
>    instanceOf #person
>    occurrence type #address "{$address}"
>    occurrence type #phone "{$phone}"
>
> and so on. The same for each other table.
>
> > > Here's my normal process ; map everything that matches a given
> > > ontology to an intermediate XML, then use XML tools to normalise,
> > > group and sort it into an XTM file, and then import this XTM file into
> > > your TM engine of choice.
> >
> > Oh, sorry to ask. But what do you exactly mean by mapping it to an
> > "intermediate XML"? What is this?
>
> Well, when you export all your tables into a topic map or anything,
> depending on what sort of topics you want to create when the topics
> aren't something you've typed in your ontology, then you need a way to
> create new types. I do this through an intermediate XML, a temporary
> XML file, basically. Say you have two different columns in your DB
> that represent something you want to create types from; you export
> out topic types from both, but perhaps there are overlaps? Then you
> need a way to sort out and group these topics, a kind of a normalised
> merge.
>
> > What exactly should I understand by the process of "normalising", "grouping"
> > and "sorting" it into an XTM file?
>
> It depends on your process for creating the Topic Map. If you're using
> Java and JDBC and TM4J, I guess you can normalise (meaning; filter
> your data, get rid of crap, keep the good stuff) your DB objects
> directly. I personally use a lot of XML tools for this, so I create
> some XML files that represent raw data, then do grouping and sorting on
> them to create XTM files which I then import into whatever.
>
> ...
>
> > I've posted a more specific description of my problem and scenario as an
> > answer to Murray's kind post, and I'm pretty glad to have found a list like
> > this, since none of the guys here at my place has a clue about topic maps ;-)
>
> Hehe, clues about Topic Maps are few and far between, no matter where
> you look. :)
>
>
> Alexander
> --
> "Ultimately, all things are known because you want to believe you know."
>                                                          - Frank Herbert
> __ http://shelter.nu/ __________________________________________________