RE: SKOS & SIMILE, concepts, terms, URIs, mappings from Butler, Mark on 2004-06-08 (www-rdf-dspace@w3.org from June 2004)

From: Butler, Mark <mark-h.butler@hp.com>
Date: Tue, 8 Jun 2004 14:38:59 +0100
To: "(www-rdf-dspace@w3.org)" <www-rdf-dspace@w3.org>
Cc: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-ID: <E864E95CB35C1C46B72FEA0626A2E808ED23F6@0-mail-br1.hpl.hp.com>

Hi Alistair

> I have some ideas about how to express lexical mappings 
> that does not require the altLabels to have their own URI.

I would be interested to hear your proposals here?

Here is the use case we had in SIMILE:

I have some Artstor data that uses the term "cadavers" (which is a preferred
term in the Artstor data), and I want to map onto the LOC TGM thesaurus. In
LOC TGM, cadavers is an alternative term for both "dead animals" and "dead
persons". Therefore, my guess is LOC decided that the term "cadavers" was
ambiguous, so they decided to encourage cataloguers to use the two less
ambiguous terms. However here the concept corresponding to cadaver is
actually the union of the concepts that have "dead animals" and "dead
persons" as their primary terms. 

So if I want to create a mapping between the Artstor data and the LOC TGM,
how should I do it using SKOS for the record that uses the term "cadavers"?
I think I want a semantic mapping here, even though I am mapping to an
alternative term?

As well as the use case above, I think there are a number of other things
that seem difficult due to not giving altLabels their own URIs - perhaps you
also have workarounds for them?

- how can you provide versions of an altLabel in multiple languages?

- lots of web APIs (for example fetch, the Longwell and Brownsauce browsers)
use URIs to identify objects. Is there a way of identifying altLabels that
is compatible with these APIs?

BTW, I did discuss with Damian Steer, and he pointed out the multiple
language problem could be solved using bNodes. This doesn't help the web API
problem, but FOAF uses bNodes in a similar way, so some web APIs are coming
up with ways to solve the problem for the FOAF case, so perhaps an
alternative solution might be to use a bNode?

> RE: lexical mappings

> This is splitting hairs a little bit, but I think it would 
> be more accurate to say that a lexical mapping may or may 
> not reflect a close semantic mapping.  So a lexical mapping 
> is never 'incorrect' if it captures some sort of lexical 
> similarity between labels, even if there is no semantic 
> mapping between the corresponding concepts. 

We used edit distance measures, so taking this approach "fountains" and
"mountains" are as close as "corpse" and "corpses". I was thinking of the
first one as being "incorrect" whereas the second one is "correct".

Now conceptually you are right that they are both correct lexical mappings.
So maybe what I am actually doing is generating lexical mappings, then human
reviewing them to turn them into semantic mappings, only the change from
lexical mappings to semantic mapping is not explicit (because I don't fancy
doing lots of retyping by change the property names in the N3!)

This is a pragmatic approach for now - longer term, I think SIMILE intends
to develop tools to do this type of task, so then it might be possible to
change the properties so they actually reflect when a mapping relation is
changed from being just a lexical mapping to a semantic mapping.

> Could you possibly provide me with a list 
> of all the types of useful lexical mapping you 
> can think of?  

No, sorry. I guess people like the WordNet and Cyc communities have thought
about this, and you may be more familiar with this work than I am?

best regards

Dr Mark H. Butler
Research Scientist, HP Labs Bristol
http://www-uk.hpl.hp.com/people/marbut

Received on Tuesday, 8 June 2004 09:50:35 UTC