RE: SKOS & SIMILE, concepts, terms, URIs, mappings from Miles, AJ (Alistair) on 2004-06-07 (public-esw-thes@w3.org from June 2004)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Mon, 7 Jun 2004 17:28:50 +0100
To: 'Stefano Mazzocchi' <stefano@apache.org>, "(www-rdf-dspace@w3.org)" <www-rdf-dspace@w3.org>
Cc: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-ID: <350DC7048372D31197F200902773DF4C049442E2@exchange11.rl.ac.uk>

Hi Stefano and Simile folks,

Sorry I missed this, better late than never tho ...

> 
> If I understood you correctly, you are suggesting that we identify 
> 'concepts' with URIs that do not contain anything to do with 
> the actual 
> literal that is used to describe them in a particular context.
> 
> So, something like
> 
>   http://simile.mit.edu/concept/2839488484
> 
> instead of
> 
>   http://simile.mit.edu/concept/maternity_services
> 
> is that correct?
> 
> -- 
> Stefano.
> 

I think using a numbering scheme to build a URI for a concept is generally a
better idea, as it helps to emphasise the fact that that resource being
identified is a piece of meaning, which is distinct from any lexical or
symbolic labels.  It also makes the URI scheme language independent (handy
for politically sensitive multilingual projects :)  

However it is by no means necessary to do so.  But, if you are going to use
the preferred label for a concept as part of the URI for that concept, you
must be absolutely clear about the fact that that URI is an identifier for
the concept itself, and NOT the specific label.

To put it another way, you must be clear about whether a URI such as 

 http://simile.mit.edu/concept/maternity_services

is (a) an identifier for the lexical string 'maternity services' or (b) an
identifier for a concept whose preferred label is the string 'maternity
services', whose alternative label is the string 'blah' and whose scope note
is 'blah blah blah'.

If this distinction is not made clearly, then everyone gets very confused
when you want to make statements such as 'concept A from thesaurus T is
broader in meaning than concept B from thesaurus U' and 'the preferred label
for concept A is identical to an alternative label for concept B'.

Just as an interesting sort of aside ... the first statement above is an
example of what I would call a 'semantic mapping', the second statement an
example of what I would call a 'lexical mapping'.  What's interesting is
that lexical mappings can be discovered by a computer alone, using stemming
algorithms and string comparisons etc.  The lexical mappings can then be
used to highlight possible semantic mappings, and defining semantic mappings
is something that only a person can do.   

Another example of a lexical mapping would be 'the preferred label for
concept A is the plural form of the preferred label for concept B'.

Another (the most obvious) example of a semantic mapping would be 'concept A
is identical in meaning to concept B' (which could easily occur even if
their respective preferred labels where not identical). 

The SKOS-Mapping schema [1] was designed to take care of semantic mappings.
But is there a requirement for expressing lexical mappings in RDF, and if
so, how do we do it without getting rather confused?  

The reason why I previously didn't think expressing lexical mappings in RDF
was really necessary was because they could be automatically discovered by a
computer process at any time.  But then I guess storing and publishing the
results of that process could be useful, especially when operating on large
data sets. 

Anyway, it would be great to get some feedback on this question.

Yours,

Alistair.
 
[1] http://www.w3c.rl.ac.uk/SWAD/deliverables/8.4.html

Received on Monday, 7 June 2004 12:29:23 UTC