[topicmapmail] Fwd: CPAN release of WordNet::Similarity

Guy Murphy guy.murphy@easynet.co.uk
Thu, 10 Apr 2003 15:29:55 +0100


Hiyas.

Off the top of my head and not at all well considered....

Are the concepts "dog" and "wolf" similar to each other?.... rate on a scale
of 1 to 10, with 1 being "the same" and 10 being "not at all similar".

Are the concepts "house" and "river" similar to each other... rate on a
scale of 1 to 10.

Now we're applying metrics to semantic proximity... can be sliced other
ways, but it's no harder or easier than any aspect of a taxonomy, and is
incredibly useful in any automated system.

You create a taxonomy for a body of data... it's arbitrary. Either you'll do
it well and people will find it useful, or you'll do it badly and they
won't... same for semantic proximity.

For papers on the matter, run a search on "semantic spatial indexing".

Humans find spatial relationships useful as they're used to dealing with
them... it's not really different from any other class of relationship,
especially weighted ones... it's just a weighted relationship.... there's not
a fully functional adult whom one can't immediately start asking how near and
far concepts are from each other, we think that way... not everybody thinks
in terms of taxonomies as well.... hell, you can ask children to arrange
animal dolls on the floor of a room according to how similar they are to
each other far more easily than you can ask them to build a taxonomy.

Conflicting scopes are no easier to resolve in a taxonomy than in a spatial
index.

If there is a difference it's that one can apply a logical, non-arbitrary
consistency to the distribution of distances within a spatial framework for
the whole body of data.... hence my comments about spatial indexing.... noting
that the indexing can be in more than 3 dimensions. The advantage being
that rather than tracking a proximity relation to every other concept, you
decide where to place a concept and the proximity is then a given.
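
To make that last point concrete, here's a rough sketch in Python. Everything
in it is made up for illustration: the coordinates, the choice of 3 dimensions
and the function names are assumptions, not anything taken from a real index
or from WordNet::Similarity. The only point is that each concept is placed
once, and every pairwise distance then falls out of the placement.

```python
import math

# Hypothetical placements: each concept gets one position in an invented
# 3-D space. The axes and the numbers are assumptions for illustration only.
positions = {
    "dog":   (1.0, 1.0, 0.0),
    "wolf":  (1.5, 1.0, 0.5),
    "house": (8.0, 2.0, 6.0),
    "river": (3.0, 9.0, 7.0),
}

def distance(a, b):
    """Euclidean distance between two placed concepts."""
    return math.dist(positions[a], positions[b])

def nearest(concept):
    """Closest other concept; no pairwise table is stored anywhere."""
    others = (c for c in positions if c != concept)
    return min(others, key=lambda c: distance(concept, c))
```

Adding a fifth concept means adding one entry to the table, not one entry per
existing concept, and math.dist accepts points of any dimension as long as
both have the same length, so nothing changes if you go beyond 3 dimensions.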

It's not my ballpark so I'm reluctant to comment further as I'd be talking
about something for which I have no real background other than having kept
an eye out for papers and material on the matter that pass in front of my
nose.... at one point I did take a look at R-trees and the seemingly 101
related tree types as a prospect for a multi-dimensional index my team was
working on, but frankly it went over my head... I can't jump that high.

The rewards for the end-user of conceptual proximity (yep, I switched words)
are simply too great to dismiss the matter.

Cheers,
    Guy.

----- Original Message -----
From: "Murray Altheim" <m.altheim@open.ac.uk>
To: "Guy Murphy" <guy.murphy@easynet.co.uk>
Cc: "topicmapmail" <topicmapmail@infoloom.com>
Sent: Thursday, April 10, 2003 3:01 PM
Subject: Re: [topicmapmail] Fwd: CPAN release of WordNet::Similarity


> Guy Murphy wrote:
> > [snip]
> >
> >>I'm curious. How does one define "semantic distance", given that
> >>any metric is pretty arbitrary? Is this just a node count between
> >>two words based on whatever existing structure is there in the
> >>thesaurus?
> >
> > [snip]
> >
> > Semantics are completely arbitrary to begin with simply representing a
> > consensus with regard to meaning, so there's no problem with semantic
> > distance being arbitrary as long as people find it useful.
> >
> > Honestly, I'm not trying to be cute.
>
> Well, cute or not I don't consider consensus as arbitrary. We obviously
> find dictionaries and thesauri valuable otherwise we'd not create them,
> buy them, use them. I don't think of consensus as universal, but neither
> do I think it arbitrary. My skepticism only comes about when it comes
> to putting metrics on them, i.e., taking them outside of their initial
> human use, using them as input into mathematical (or at least
> programmatic) functions. E.g., how "semantically far apart" are:
>
>       "dog"  -->  "mammal"
>       "dog"  -->  "mammalia"
>       "dog"  -->  "canine"
>       "dog"  -->  "Canis familiaris"
>       "dog"  -->  "Canis domesticus"
>       "dog"  -->  "Canis lupus"
>       "Canis domesticus"  -->  "Canis lupus"
>       "puppy" --> "dog"
>       "Poodle" --> "dog"
>       "Tony Blair" --> "poodle"
>       "Fido" -->  "dog"
>       "Dog"  -->  "dog"  (lexically)
>
> where the context in which the statement is made and used plays a role?
> These are the kinds of real relationships that need to be modeled
> accurately if accurate decisions are to be made upon the ontological
> commitments being made, explicitly or implicitly.
>
> > If you're interested in semantic distance would you not perhaps find
> > more fruit looking at spatial indexing?
>
> I tend to also be skeptical of mappings between semantic and spatial,
> as this mixed metaphor brings with it a lot of new issues. And I
> don't want any *new* issues... :-)  If you see some value there,
> perhaps a ref to a paper you found useful?
>
> Murray
>
> ......................................................................
> Murray Altheim                  <http://kmi.open.ac.uk/people/murray/>
> Knowledge Media Institute
> The Open University, Milton Keynes, Bucks, MK7 6AA, UK
>
>     Hunt the Boeing! And test your perceptions!
>     http://www.asile.org/citoyens/numero13/pentagone/erreurs_en.htm
>
>