[topicmapmail] Topic Maps and Natural Language processing

Steve Pepper pepper.steve at gmail.com
Fri Sep 21 12:26:47 EDT 2007


* Chattun Lallah wrote
|
| I am trying to represent an event which involves some
| natural language constructs into topic maps. I have not
| been able to find a solution how to represent language
| semantics.

This is an interesting question that deserves more research in
the Topic Maps community. (Idea for a master's thesis, perhaps,
or a presentation at TMRA?)

The short answer is that it *can* be done, but it would probably
be worth developing conventions based on accepted principles of
linguistic semantics, such as thematic roles.

Let me try to walk through how it could work. Here's what we
want to represent as a topic map:

| "Number 10 denied that Gordon Brown exploited Lady Thatcher for
| political ends by inviting her to Downing Street."

The core semantic relation that needs to be expressed here is
the (alleged) exploitation of her Ladyship by GB. So there are
two primary actors, for which we define topics (of type person):

  /* LTM fragment. Can be loaded in the Omnigator */
  [person   = "Person"]
  [brown    : person = "Gordon Brown"]
  [thatcher : person = "Lady Thatcher"]

The semantic relation is exploitation, which we define as an
association type with two corresponding role types. Here I use
fairly generic role types which I also used for the "Killed by"
relationship in the Italian Opera Topic Map:

  [exploitation = "Exploitation"
                = "Exploited by" / victim
                = "Exploited" / perpetrator] /* association type */
  [perpetrator  = "Perpetrator"]             /* role type */
  [victim       = "Victim"]                  /* role type */

(Note that I give the association type multiple names in scopes
corresponding to the role types in order to get appropriate labels.)
Now we can create an association to represent the core semantic
relation:

  exploitation( thatcher : victim, brown : perpetrator )

Put all that together and you can browse it as a little topic
map in the Omnigator.

Now, I don't know who made this assertion, but it wouldn't surprise
me if it was a certain rag that I personally don't trust very
much. If you want, you can scope the association to capture this:

  exploitation( thatcher : victim, brown : perpetrator )
    / daily_mail

  [newspaper = "Newspaper"]
  [daily_mail : newspaper = "Daily Mail"]

So far, so good. Now we want to say something about the *means*
by which the exploitation took place ("Invitation to Nr. 10"),
so we need a topic to represent that subject:

  [means = "Means"]
  [nr10invite : means = "Invitation to Nr. 10"]

I choose to capture the additional knowledge by adding a third
role to the association (this is a design choice that might be
worth discussing; there are alternatives):

  exploitation( thatcher : victim,
                brown : perpetrator,
                nr10invite : means )
    / daily_mail

We have now captured the full semantic relation as an association.
What we want to do next is annotate it to capture Nr. 10's denial.
To do that we have to *reify* the association, as follows:

  exploitation( thatcher : victim,
                brown : perpetrator,
                nr10invite : means )
    / daily_mail
    ~ thatcher-exploitation

This gives us a topic to represent the relationship. Let's give
the topic a name and a type, for readability:

  [act = "Act"]
  [thatcher-exploitation : act
     = "Mrs. Thatcher exploited by Gordon Brown"]

(In the Omnigator, reification is shown by the text "More...".
Clicking on it makes the reifying topic the current topic.)

Now we have your "cloud of topics" represented by a single reified
association, and all we have to do is add an additional ("Denial")
association, which we can say involves an act and a denier:

  [denial    = "Denial"
             = "Denied" / denier
             = "Denied by" / act ]   /* association type */
  [denier    = "Denier"]             /* role type */

The association representing Nr. 10's denial then looks like this:

  [institution = "Institution"]
  [nr10 : institution = "Nr.10"]

  denial( thatcher-exploitation : act, nr10 : denier )

And there you have it.

The key, of course, is to use reification in order to make it
possible to annotate (say more about) the relationship represented
by the association.
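
The same mechanism offers one alternative to the three-role design
chosen earlier: keep the exploitation association binary, reify it,
and attach the means through a second association. A minimal sketch,
assuming a made-up association type "achieved-by" that is not part
of the topic map in this message; it would replace the three-role
association rather than be loaded alongside it:

  /* LTM sketch; "achieved-by" is a hypothetical association type */
  [achieved-by = "Achieved by means of"
               = "Achieved by" / act
               = "Means of" / means]

  exploitation( thatcher : victim, brown : perpetrator )
    / daily_mail
    ~ thatcher-exploitation

  achieved-by( thatcher-exploitation : act, nr10invite : means )

The trade-off is that the means becomes a statement about the act
rather than a participant in it, so it can carry its own scope
independently of the exploitation association.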

Two final points:

(1) While this topic map looks OK in the Omnigator, a few things
leave something to be desired:

* It would sometimes be useful to be able to see the role types
  clearly when you have an n-ary association. (You can see them
  if you mouse-over the role player, but that's not as intuitive
  as it could be.)

* It would be even more useful to see the role types when looking
  at the topic that reifies the association (under "Reification
  Topics").

* It's not really necessary to display "More..." twice for the
  same association; once would be enough.

(2) I haven't given a lot of thought to the role types used in
this example. It would be worth investigating the usefulness of
one or more of the sets of "thematic roles" [1] that have been
proposed in semantic linguistics (e.g. agent, author, instrument,
patient, benefactive, experiencer, theme, source, goal, locative,
reason, and purpose).
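
Purely as an illustration, and assuming that agent, patient and
instrument are reasonable stand-ins for perpetrator, victim and
means (that mapping is a guess, not a worked-out analysis), the
core association might then look like this:

  /* LTM sketch; the choice of thematic roles is an assumption */
  [exploitation = "Exploitation"
                = "Exploited by" / patient
                = "Exploited" / agent]
  [agent        = "Agent"]
  [patient      = "Patient"]
  [instrument   = "Instrument"]

  exploitation( thatcher : patient,
                brown : agent,
                nr10invite : instrument )
    / daily_mail

Note that the scoped names on the association type have to follow
the new role types, otherwise the labels are lost.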

I hope this answers your question, Chattun. I've kept it fairly
simple, but it would be possible to go further and capture
stuff like tense and aspect. However, Topic Maps wasn't really
designed to do that, so I'm not sure I'd advise it.

Steve

[1] A.k.a. semantic cases (Fillmore), semantic roles (Dillon),
thematic relations (Gruber, Jackendoff), and (sort of) theta
roles (Chomsky, Marantz).

--------

COMPLETE TOPIC MAP

[person   = "Person"]
[brown    : person = "Gordon Brown"]
[thatcher : person = "Lady Thatcher"]

[exploitation = "Exploitation"
              = "Exploited by" / victim
              = "Exploited" / perpetrator] /* association type */
[perpetrator  = "Perpetrator"]             /* role type */
[victim       = "Victim"]                  /* role type */

[newspaper = "Newspaper"]
[daily_mail : newspaper = "Daily Mail"]

[means = "Means"]
[nr10invite : means = "Invitation to Nr. 10"]

exploitation( thatcher : victim,
              brown : perpetrator,
              nr10invite : means )
  / daily_mail
  ~ thatcher-exploitation

[act = "Act"]
[thatcher-exploitation : act
   = "Mrs. Thatcher exploited by Gordon Brown"]

[denial    = "Denial"
           = "Denied" / denier
           = "Denied by" / act ]
[denier    = "Denier"]

[institution = "Institution"]
[nr10 : institution = "Nr.10"]

denial( thatcher-exploitation : act, nr10 : denier )



