[topicmapmail] How do you deal with (lack of) association templates ?

Bernard Vatant bernard.vatant@mondeca.com
Wed, 12 Jun 2002 09:46:02 +0200


Lars Marius

I was sure you would be the first one to bite :))

> * Bernard Vatant
> |
> | For a paper I intend to present in Baltimore XML 2002, I would like
> | to make a little review of how TM thinkers and developers deal
> | currently with association templates - or rather with lack of any
> | standard for that matter.

> * Lars Marius Garshol
> When you say "association templates" it is not clear what you mean, in
> the sense that it's not clear what problem you think "association
> templates" solve.

Maybe I was deliberately unclear, just to find out what everyone has in mind when speaking
of templates.
Let me be more precise: what I mean by that is the pattern of roles.

For example, an "employment" template would be (whatever syntax):

template (employment) = [employer (1), employee (1+)]

That means that an association with type "employment" has exactly one topic playing the
"employer" role and one or more topics playing the "employee" role.

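To make the pattern of roles concrete, here is a minimal Python sketch of such a template
and a conformance check. It is only an illustration: the names RoleSpec, EMPLOYMENT and
conforms are hypothetical and do not come from XTM or any topic map engine, and topics are
represented simply by identifier strings.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class RoleSpec:
    """Cardinality constraint for one role (hypothetical structure)."""
    min_players: int
    max_players: Optional[int] = None  # None means "no upper bound"


# template (employment) = [employer (1), employee (1+)]
EMPLOYMENT: Dict[str, RoleSpec] = {
    "employer": RoleSpec(1, 1),
    "employee": RoleSpec(1),
}


def conforms(roles: Dict[str, Set[str]], template: Dict[str, RoleSpec]) -> bool:
    """Return True if the role -> players mapping satisfies the template."""
    if set(roles) - set(template):
        return False  # the association uses a role the template does not declare
    for role, spec in template.items():
        count = len(roles.get(role, set()))
        if count < spec.min_players:
            return False
        if spec.max_players is not None and count > spec.max_players:
            return False
    return True


ok = conforms({"employer": {"X"}, "employee": {"W", "Z"}}, EMPLOYMENT)
bad = conforms({"employer": {"X", "V"}, "employee": {"Y"}}, EMPLOYMENT)
```

Here the first association conforms, while the second is rejected because two topics play
the "employer" role.
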
The point of identifying such a template would be to be able to merge, e.g.,

-- employment : employer (X), employee(Y)
-- employment : employer (X), employee(W, Z)

with X having the same subjectID in the two associations

into the following

-- employment : employer (X), employee(W, Y, Z)

Which is very natural. You welcome new members into your association, you don't create a new
one.
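
As a rough sketch of that merge, and not of anything the standard prescribes, one could use
the roles limited to a single player as the key for grouping associations of the same type.
The table SINGLE_PLAYER_ROLES and the function merge_by_template are hypothetical names, and
the sketch assumes the topics themselves have already been merged, so that equal identifier
strings mean the same subject.

```python
from typing import Dict, List, Set, Tuple

Association = Tuple[str, Dict[str, Set[str]]]  # (association type, role -> players)

# Assumption: for each association type, the roles restricted to exactly
# one player by the template; they serve as the grouping key.
SINGLE_PLAYER_ROLES = {"employment": ("employer",)}


def merge_by_template(associations: List[Association]) -> List[Association]:
    """Merge associations of one type that agree on all single-player roles."""
    merged: Dict[tuple, Dict[str, Set[str]]] = {}
    untouched: List[Association] = []
    for assoc_type, roles in associations:
        key_roles = SINGLE_PLAYER_ROLES.get(assoc_type)
        if key_roles is None:
            # No template known for this type: leave the association alone.
            untouched.append((assoc_type, roles))
            continue
        key = (assoc_type,) + tuple(
            frozenset(roles.get(role, set())) for role in key_roles
        )
        target = merged.setdefault(key, {})
        for role, players in roles.items():
            target.setdefault(role, set()).update(players)
    return [(key[0], roles) for key, roles in merged.items()] + untouched


a = ("employment", {"employer": {"X"}, "employee": {"Y"}})
b = ("employment", {"employer": {"X"}, "employee": {"W", "Z"}})
result = merge_by_template([a, b])
```

With the two associations above, result holds a single "employment" association in which X
is the employer and W, Y and Z are the employees.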

> | "employment", "employee" and "company" are defined using the same
> | PSI respectively.
>
> Do you mean that these topics have the same PSI in the different topic
> maps, so that upon merging they become the same topics?

Definitely.

> | What is the result of merging of A and B ? How many associations?
>
> This is described both in XTM 1.0, Annex F, and in the Standard
> Application Model. Are you asking how to interpret those, or is there
> something more to your question that I don't grasp?

> The two associations are not equal, since the last role player is
> different (assuming the two X-es share something that makes them
> merge), so there will be two associations.

That's what is annoying. So if you have 50 employees for the same employer, gathered one
by one in a workflow, you'll get 50 associations ... too bad.

> | Idem for A and C, then A and B and C ...
>
> In none of these cases will any of the associations be removed.

That's the point. This is not scalable in a workflow environment. The topic map will
rapidly get cluttered with redundant information.

> | What process would you imagine or have you already figured to get
> | rid of redundancy and reduce the number of associations and roles in
> | the merged maps to the minimum in such cases?
>
> Well, Bernard, you have to distinguish between what the standard
> requires, and what an application can do to help users. The first you
> must do, and everyone must do it the same way. The second you can do,
> provided the user asks you to (and perhaps even provides you with
> extra information).

Since there is nothing in the standard about it, I'm just curious to know how applications
and developers deal with it, or at least think about it ... hoping some day it moves up to
a standard. That's my point.

> I would deal with this as follows:
>
>  - in the cases A-B, A-C, and B-C it is clear that the associations
>    have the same structure, so that we should afterwards end up with
>    two associations:
>
>      employment(X : employer, Y : employee)
>      employment(X : employer, Z : employee)

As said above, I would like to have only one association here ...

>    You'd have to explicitly assert that "company" and "employer" are
>    the same thing, and perhaps remove the name "company" from the
>    merged topic, but that's easy.

Yes. Or, given that "company" is an NT (narrower term) for "employer", keep "company" in
the final association.

>  - D is different, and clearly requires structural transformations.
>    Our goal is to let the query language deal with that, but for now
>    we would use the API to implement the transformation. It's quite
>    easy, actually.

But there are so many ways to do it ... My point is: how would you set some general rules
for that? I don't really see what it has to do with the QL. It's a question of graph
reduction. Do you think graph reduction is in the scope of TMQL?

> I would not expect to be able to do this automatically, without any
> form of human intervention except where the standard lets you do that,
> unless you have safe heuristics you can apply (like the TNC), which
> you usually don't.

Well, I am really thinking about workflow and automatic processing in a closed environment,
and yes, with the TNC applying. In the use case I have in mind, the topic map repository is
updated either by an integrated authoring tool or by a text/metadata mining tool working on
semi-structured documents, in either case using a controlled vocabulary, and able for
example to extract "employer-employee" relationships and add them on the fly to the topic
map. In that case I don't want human intervention in the transformation, and I want e.g.
every new employee to be added as a member of the existing association, if any, rather than
ending up with as many associations as employees.

This is a very simple customer requirement, and if we do not meet it, we are stuck. My
concern is to know whether that is to be left to each developer and use case, or whether
some standard effort is possible on that.
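
To show how simple the requirement is on the application side, here is a minimal sketch of
such an on-the-fly update, assuming each extracted fact arrives as an (employer, employee)
pair of topic identifiers. EmploymentStore and add_fact are hypothetical names, not the API
of any existing topic map engine.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class EmploymentStore:
    """Keeps one employment association per employer (hypothetical helper)."""

    def __init__(self) -> None:
        self._employees: Dict[str, Set[str]] = defaultdict(set)

    def add_fact(self, employer: str, employee: str) -> None:
        """Record one extracted employer-employee relationship."""
        # The employee joins the employer's existing association, if any;
        # otherwise a new one is created implicitly by the defaultdict.
        self._employees[employer].add(employee)

    def associations(self) -> List[Tuple[str, List[str]]]:
        """Return (employer, sorted employees), one entry per employer."""
        return [(emp, sorted(staff)) for emp, staff in self._employees.items()]


store = EmploymentStore()
for n in range(50):
    store.add_fact("X", f"employee-{n}")
store.add_fact("X", "employee-0")  # a repeated fact adds no duplicate player
```

Fifty employees gathered one by one for employer X end up in one association with fifty
"employee" role players, instead of fifty separate associations.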

Hope that helps to clarify the context.

Bernard

-------------------------------------------------------------------
Bernard Vatant
Consultant - Mondeca
www.mondeca.com
Chair - OASIS TM PubSubj Technical Committee
www.oasis-open.org/committees/tm-pubsubj/
-------------------------------------------------------------------