[topicmapmail] Creating Topic-Map-View of relational data
Jan Algermissen
algermissen@acm.org
Fri, 26 Nov 2004 12:37:30 +0100
"Dipl.-Wirtsch.-Inf. Lutz Maicher [Universität Leipzig]" wrote:
> I want to
> "transform" existing relational data (always the same) in different Topic
> Maps (Views) with different structural attributes. From my point of view, I
> only want to "parameterise" the transformation to get a new test series for
> my empiricism.
Lutz,
is your relational data in 3NF (or 5NF if you have multivalued keys)?
If so, every row in a given table represents a single subject and you
could simply return (generate, etc.) a single topic per row. The attributes
that are not foreign keys map directly to topic attributes (aka properties).
For relations that contain foreign key attributes, you'd have to return
associations (if you stick to TMDM, you'll also need a 'reifying' topic
item to attach the non foreign key properties to).
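A minimal sketch of the row mapping described above, assuming rows arrive as
dicts and that the primary key and foreign key columns are already known. The
names Topic, Association and row_to_items are made up for illustration and do
not come from any Topic Maps library.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    subject_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Association:
    assoc_type: str
    players: dict   # role (the FK column name) -> subject id of the referenced row
    reifier: Topic  # reifying topic that carries the non-foreign-key properties


def row_to_items(table, pk, row, foreign_keys):
    """Map one row to a Topic, or to an Association if the table has foreign keys.

    foreign_keys maps a column name to the table it references (hypothetical
    convention for this sketch).
    """
    subject_id = f"{table}/{row[pk]}"
    plain = {c: v for c, v in row.items() if c != pk and c not in foreign_keys}
    if not foreign_keys:
        return Topic(subject_id, plain)
    players = {col: f"{ref}/{row[col]}" for col, ref in foreign_keys.items()}
    return Association(table, players, Topic(subject_id, plain))
```

For example, row_to_items("author", "id", {"id": 7, "name": "Kent"}, {}) yields
a plain topic, while a "book" row with an author_id foreign key yields an
association whose reifier holds the title.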
This mapping can also be automated, either by parsing the schema or by
querying the database metadata, e.g. 'show columns from table;' (in MySQL).
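A sketch of the metadata route, with one swap worth stating plainly: it uses
SQLite's PRAGMA statements from Python's standard sqlite3 module as a stand-in
for MySQL's 'show columns', so that it runs without a server. The tables
"author" and "book" and the function classify_columns are invented for the
example.

```python
import sqlite3


def classify_columns(conn, table):
    """Split a table's columns into primary key, foreign key and plain columns."""
    # PRAGMA statements do not take bound parameters, so the table name is
    # interpolated; only pass trusted names here.
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # foreign_key_list rows: (id, seq, referenced_table, from_col, to_col, ...)
    fks = {r[3]: r[2] for r in conn.execute(f"PRAGMA foreign_key_list({table})")}
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    pk = [c[1] for c in cols if c[5] > 0]
    plain = [c[1] for c in cols if c[5] == 0 and c[1] not in fks]
    return {"pk": pk, "fk": fks, "plain": plain}


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT,
                   author_id INTEGER REFERENCES author(id));
""")
schema = {t: classify_columns(conn, t) for t in ("author", "book")}
```

Tables whose "fk" entry is empty would become topic types, the others
association types.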
You could generate PSIs (e.g. text pages) from the rows and parameterize the
combination of attributes you stuff into the text file.
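One way to read "parameterize the combination of attributes", sketched under
the assumption that a PSI is just a small text document rendered from selected
columns. The function psi_text and the sample row are invented for the example.

```python
def psi_text(row, attributes):
    """Render the chosen attributes of one row as a small text document."""
    lines = [
        f"{name}: {row[name]}"
        for name in attributes
        if row.get(name) is not None  # skip missing or NULL attributes
    ]
    return "\n".join(lines) + "\n"


# Made-up sample row; the attribute list is the test-series parameter.
row = {"isbn": "0-000-00000-0", "title": "Example Title",
       "author": "A. Author", "year": 2004}
view_a = psi_text(row, ["title", "author"])
view_b = psi_text(row, ["title", "year"])
```

Each attribute list gives a different view of the same row, so a new test
series only needs a new list.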
OTOH, you could of course calculate a 'row equivalence value' for rows from
different tables/databases directly, without the detour through Topic Maps...
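The message does not say how such a value would be computed. As one possible
reading, the sketch below uses the Jaccard overlap of the lower-cased word
sets taken from selected attributes; both the measure and the function names
are assumptions, not part of the suggestion above.

```python
def tokens(row, attributes):
    """Collect the lower-cased words found in the chosen attributes of a row."""
    words = set()
    for name in attributes:
        value = row.get(name)
        if value is not None:
            words.update(str(value).lower().split())
    return words


def row_similarity(row_a, row_b, attributes):
    """Jaccard overlap (0.0 to 1.0) of the two rows' word sets."""
    a, b = tokens(row_a, attributes), tokens(row_b, attributes)
    if not a and not b:
        return 0.0  # nothing to compare
    return len(a & b) / len(a | b)
```

The attribute list plays the same role as in the PSI approach: changing it
changes what counts as "the same" row.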
HTH,
Jan
>
> * Lars Marius Garshol
> > a) how directly should the resulting topic map reflect the
> > underlying RDBMS? (Ie: do you expect to do normalization or
> > create a simplified view; do you want to do string processing on
> > values in the database, etc?)
>
> The Topic Map doesn't reflect the structure of the underlying RDBMS. I want
> to extract data from the RDBMS to create Topics which reflect my own,
> arbitrarily defined Subjects (i. e. I may have a table "author" in my
> database, but for my empirical tests I create an Occurrence of type "author"
> for each book which contains the name of the author).
>
> Additionally, I think that copying all needed data from the RDBMS into my
> Topic Map is sufficient for my purposes.
>
> * Lars Marius Garshol
> > b) are you looking for a procedural approach (do this, then do that,
> > then do like this) or a declarative approach (this table is a
> > topic type; this table an association type; ...)
>
> That's a good question. I foresee the following (naive) process which
> consists of:
> 1. SQL statement --> returns a result set of some columns of two types (by
> virtue): Subject Identity Columns and Value Columns
> 2. Define for each entry of a Value Column the Subject Identity of the Topic
> (and the Topic Characteristic) where it belongs to.
> 3. If this Topic (or the regarding Topic Characteristic) doesn't exist ->
> create it
> 4. Append the value to the Topic Map
>
> Example:
>
> SELECT ISBN, title FROM ... WHERE ....
>
> Subject Identity Column: ISBN
> Value Column: title
>
> [addValue(TopicIdentity, TopicCharacteristic, Value)]
> addValue(ISBN, topicName, title)
>
> But this is only a very vague idea. I'm open to any expertise that
> simplifies that mapping.
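The four steps quoted above can be sketched as follows, assuming the result
set arrives as dicts keyed by column name. TopicMapView, add_value and load are
hypothetical names that mirror the addValue(TopicIdentity, TopicCharacteristic,
Value) signature from the example.

```python
from collections import defaultdict


class TopicMapView:
    def __init__(self):
        # subject identity -> topic characteristic -> list of values
        self.topics = defaultdict(lambda: defaultdict(list))

    def add_value(self, identity, characteristic, value):
        # Steps 3 and 4: a missing topic or characteristic is created on
        # first access, then the value is appended.
        self.topics[identity][characteristic].append(value)


def load(view, rows, identity_col, value_cols):
    """Steps 1 and 2: value_cols maps each Value Column to a characteristic."""
    for row in rows:
        for col, characteristic in value_cols.items():
            view.add_value(row[identity_col], characteristic, row[col])
```

With the example above, load(view, rows, "ISBN", {"title": "topicName"}) would
call add_value(ISBN, "topicName", title) once per result row.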
>
> [One remark: It might be a bit confusing that on the one hand I negate the
> existence of "sharp" Subjects and on the other hand I want to use something
> like "Subject Identity Columns" to define the Subjects in my Topic Map
> Views. This is due to the fact that for the empirical tests I always need an
> objective criterion to decide whether two Topics represent the same Subject.
> That means, I decide with the help of the Subject Similarity Service whether
> two Topics *might* represent the same Subject (of course without using
> something like the ISBN to decide the similarity). After that I use the ISBN
> to calculate precision, recall etc. of the current test series].
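The evaluation step described above could be sketched like this, assuming the
similarity service outputs unordered pairs of topic ids and that each topic's
ISBN is kept aside as ground truth. The function name and data shapes are
assumptions for illustration.

```python
from itertools import combinations


def pair_metrics(predicted_pairs, topics):
    """Return (precision, recall) of predicted same-subject pairs.

    topics: topic id -> ISBN (ground truth, not shown to the similarity step)
    predicted_pairs: set of frozenset({id_a, id_b})
    """
    # Two topics truly represent the same subject when their ISBNs match.
    true_pairs = {
        frozenset((a, b))
        for a, b in combinations(topics, 2)
        if topics[a] == topics[b]
    }
    hits = len(predicted_pairs & true_pairs)
    precision = hits / len(predicted_pairs) if predicted_pairs else 0.0
    recall = hits / len(true_pairs) if true_pairs else 0.0
    return precision, recall
```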
>
> I'm looking forward to your responses!
>
> Greetings from Leipzig
> Lutz
>
--
Jan Algermissen
Consultant & Programmer
http://www.jalgermissen.com