[topicmapmail] Expressive capabilities of Topic Maps

Thomas B. Passin tpassin@comcast.net
Mon, 8 Sep 2003 20:17:47 -0400


[<jalgermissen@topicmapping.com>]

> excuse any provocative language below...I simply want to get
> people thinking about this issue.
>

Sure... not that I have all the answers ...
>
> "Thomas B. Passin" <tpassin@comcast.net> schrieb am 09.09.2003,
> 00:00:28:
> > []
> >
> > I try not to think of TM and RDBMS as competitors.  First of all, if you
> > have regular data
>
> What do you mean by *regular data*?
>

I meant that most or all cells in each table would contain data (with some
NULLs, no doubt), and that there would be only a relatively small number of
association types - in other words, you would not be making up new
association types all over the place to accommodate peculiarities of the
data set.

> that is well-modeled by a table structure, you should use
> > a relational database, since they are highly evolved and efficient at
> > handling that kind of data.
>
> So, what data is 'well modeled by a table structure'?

As above, and also something equivalent to a primary key, preferably atomic
(but that is not essential); a limited number of foreign keys; a consistent
(and known) set of integrity constraints (remember, we are imagining a
future in which we can easily do CRUD and transactions with topic maps).  A
lot of this would automatically come out of good data modeling practices
even if you plan to use topic maps.

>
> What is the nature of data where we profit from using topic maps?
>
>
> In this case, a topic map overlay can be very
> > useful, either for integrating several databases or for providing
> > navigation for the database.  No need for competition here.
>
> I am not thinking about using topicmaps as an add-on! I am thinking
> about reasons to store data as a topic map in the first place.
> [Note: topicmap != topic map document!]

Right, and I was too.

> >
> > Second, it can happen that a relational database is designed (more or
> > less) according to a topic map design, even if it does not implement
> > everything in the TM standard(s).
>
> Yes, can this happen? Can you give an example? Can we store data in
> an RDBMS and support the merging capabilities of topic maps?
>

Aha!  I think that merging would be somewhat problematical for relational
databases, unless one adopted some strong conventions whereby identity
between different primary keys could be assured. Of course, RDBMSs do not
natively support TM-style merging so you would have to write code to do a
merge.  But you might have to do that with a TM, too, depending on your
software and the kind of merging you would want to do.
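
To make the "strong conventions" point concrete, here is a minimal sketch of
the kind of merge code one might have to write.  It assumes (my assumption,
not anything a database gives you) that every record carries a shared
subject identifier alongside its local primary key; the field names and the
example.org URL are purely illustrative.

```python
from collections import defaultdict

def merge_by_subject_identifier(*sources):
    """Merge records from several databases that describe the same subject.

    Each source is a (source_name, records) pair. Records are matched on a
    hypothetical 'subject_identifier' field, never on their local keys.
    """
    merged = defaultdict(lambda: {"local_keys": set(), "properties": {}})
    for source_name, records in sources:
        for rec in records:
            topic = merged[rec["subject_identifier"]]
            # Remember where each local primary key came from.
            topic["local_keys"].add((source_name, rec["pk"]))
            # Later sources overwrite conflicting property values.
            topic["properties"].update(rec["properties"])
    return dict(merged)

SUBJECT = "http://example.org/subject/puccini"  # illustrative identifier
db_a = [{"pk": 17, "subject_identifier": SUBJECT, "properties": {"born": 1858}}]
db_b = [{"pk": "P-003", "subject_identifier": SUBJECT, "properties": {"died": 1924}}]

merged = merge_by_subject_identifier(("a", db_a), ("b", db_b))
```

The two primary keys (17 and "P-003") have nothing in common; only the
agreed identifier lets the rows collapse into one merged record.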

> In that case, the database __is__ a topic map, to all
> > intents and purposes.  You may want to put a topic map wrapper over it
> > for data interchange purposes.
>
> Do you mean storing TMs with an RDBMS backend? Of course this works, but
> that is not my point.

No, I did not mean that.  I suggest the following way of mapping a
relational database to a topic map.  I am referring to a well-normalized
relational design - say at least 3rd normal form.  A "join" table - what
Mike Gorman calls an "Intertwinkle" table - is essentially isomorphic to a
TM association.  Column labels are equivalent to role types.  The foreign
keys point to the role-playing topics.  A row is equivalent to a specific
association.  The association type is equivalent to the table type.  There
are no scopes per se, but to some extent they could be fudged in using WHERE
conditions, I suppose.
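
As a rough sketch of this join-table mapping (the schema and all names are
made up for illustration, and the dict layout is just one assumed way to
hold an association), something like the following would read each row of a
join table as one association:

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id TEXT PRIMARY KEY);
    CREATE TABLE company (id TEXT PRIMARY KEY);
    CREATE TABLE employment (            -- the join table
        employee TEXT REFERENCES person(id),
        employer TEXT REFERENCES company(id)
    );
    INSERT INTO person  VALUES ('tom'), ('jan');
    INSERT INTO company VALUES ('acme');
    INSERT INTO employment VALUES ('tom', 'acme'), ('jan', 'acme');
""")

def join_table_to_associations(conn, table):
    """Turn each row of a join table into one association dict.

    The table name is interpolated directly, so pass only trusted names.
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY rowid")
    role_types = [col[0] for col in cur.description]  # column labels -> role types
    return [
        {
            "type": table,                            # table -> association type
            "roles": dict(zip(role_types, row)),      # foreign keys -> role players
        }
        for row in cur                                # one row -> one association
    ]

associations = join_table_to_associations(conn, "employment")
```

Scopes are left out, as in the description above; a WHERE clause on the
SELECT would be the place to fudge them in.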

An "ordinary" table - not a join table - could be modeled either as an
association or as a topic with occurrences.  In my opinion, it is most like
a topic with a set of occurrences. Each occurrence type corresponds to a
column label.  Each row corresponds to a topic.  Each primary key - I am
assuming atomic primary keys - corresponds to the topic id.  A basename has
no real equivalent, but could become simply another column (or more) in the
table.
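
A comparable sketch for an "ordinary" table, again with an invented schema.
Recording the table name as a topic type is my own assumption here, not part
of the mapping described above; it is one way to keep topics from different
tables distinguishable.

```python
import sqlite3

# Hypothetical table: names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (
        id    TEXT PRIMARY KEY,
        title TEXT,
        pages INTEGER
    );
    INSERT INTO book VALUES ('b1', 'Example Title', 320);
""")

def table_to_topics(conn, table, key, name_column=None):
    """Map each row to a topic: primary key -> topic id, columns -> occurrences.

    Table and column names are interpolated directly, so pass only trusted names.
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    columns = [col[0] for col in cur.description]
    topics = []
    for row in cur:
        record = dict(zip(columns, row))
        topic = {"id": record.pop(key), "type": table}  # 'type' is an assumption
        if name_column is not None:
            # A basename is just another column that we choose to single out.
            topic["basename"] = record.pop(name_column)
        topic["occurrences"] = record                   # remaining columns
        topics.append(topic)
    return topics

topics = table_to_topics(conn, "book", "id", name_column="title")
```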

This plan has the advantage that it is simple, has low overhead, and that
the data values are literals (or references to some other resource), which
is the case with the table cell values.  That is why I say this is an
appropriate mapping.  It has the disadvantage that, if you wanted to
represent two or more separate tables, they would all coalesce into the one
topic and not be easily distinguishable afterward (although with the proper
schema you could take care of that).

Murray Altheim wants to have each data value be its own topic so he can
associate to it to his heart's content - think lots of meta data - and
that's fine, I am just looking at the near-isomorphism between TMs and
relational database design, which is a slightly different problem.

Now suppose you have a data model that does not fit this plan well - then I
assert that it is not a good candidate for modeling with a relational
design - and conversely.  A lot of special cases, different topics of the
same type having quite different occurrences or associations - cases
like these would be good fodder for topic maps as opposed to relational
designs.
> >
> > Third, if your data is more like a set of sparse tables, or is irregular
> > in its structure, topic maps are likely to work much better than an
> > ordinary relational database
>
> Yes? Why?
>

Because you would have to keep making up new tables that were used only for
a few of the subjects, and similarly for columns on a table.  This is awful
to work with in a relational database.
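
A toy illustration of the sparseness problem, using invented data: each
subject has its own small set of occurrences, and forcing them all into one
table leaves most cells NULL.

```python
# Invented, deliberately irregular data: each subject has different properties.
subjects = {
    "s1": {"height": 180, "email": "a@example.org"},
    "s2": {"isbn": "x", "pages": 10},
    "s3": {"height": 170, "license_plate": "ABC"},
}

# A single table needs one column per property used by any subject.
columns = sorted({name for occ in subjects.values() for name in occ})

total_cells = len(subjects) * len(columns)
filled_cells = sum(len(occ) for occ in subjects.values())
null_cells = total_cells - filled_cells

# As topics, only the filled cells need to be stored, one occurrence each.
occurrences = [
    (topic_id, occ_type, value)
    for topic_id, occ in subjects.items()
    for occ_type, value in occ.items()
]
```

With three subjects and five distinct properties, the wide table has 15
cells of which 9 are NULL, while the occurrence list holds just the 6 real
values.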

> So, the data I need to store being "like a set of sparse tables" or
> "irregular in its structure" would be the criterion to choose topic maps
> instead of an RDBMS?
>

A need to merge could be another criterion suggesting using a TM, as could
the use of scopes.  Also, if you need the full range of options available in
XTM (three kinds of subject indicators, for example), it gets pretty tedious
to implement and query in a relational database.

Another point is that topic maps can be seen as an abstract pattern for
modeling - the pattern constrains you, but it is also freeing because you
can follow the pattern instead of inventing everything anew each time.  It
is especially good for indexes and suchlike, and for many-to-many
relationships, so situations that need to be modeled like them tend to work
well as topic maps.  Now if you happen to use a relational database to
implement it, well, you are still following the topic map pattern.

Another strength of topic maps is that it is easy to remodel and change or
add to the design.  It is much harder with a relational database.  On the
other hand, because its design is harder to change, a relational database
can bring more stability to the data, which is generally a good thing.

> Could you give examples for such data?
>
>
Not tonight ...

Cheers,

Tom P