
RE: lexical relationships

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Thu, 7 Oct 2004 13:38:01 +0100
Message-ID: <350DC7048372D31197F200902773DF4C05E50C7C@exchange11.rl.ac.uk>
To: 'Stella Dextre Clarke' <sdclarke@lukehouse.demon.co.uk>, public-esw-thes@w3.org

Hi Stella, all,

Actually I think there are two issues here.

I think it is perfectly reasonable to include abbreviations and spelling
variants among the altLabels of a concept.

Also, I think it is entirely reasonable for people to define sub-properties
of skos:altLabel, e.g. 'abbreviatedLabel' or 'spellingVariantLabel'. The
second of these, for example, would allow you to include spelling variants as
labels of a concept for purposes of retrieval, and also to design viewing apps
that hide the spelling variant labels from users browsing the thesaurus.
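
A minimal sketch of how such sub-properties might be declared. The property
names are the ones suggested above; the namespace http://example.org/terms#
is purely hypothetical, and none of this is part of SKOS Core itself.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

  <!-- Hypothetical property: an abbreviation used as an alternative label -->
  <rdf:Property rdf:about="http://example.org/terms#abbreviatedLabel">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2004/02/skos/core#altLabel"/>
  </rdf:Property>

  <!-- Hypothetical property: a spelling variant used as an alternative label -->
  <rdf:Property rdf:about="http://example.org/terms#spellingVariantLabel">
    <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2004/02/skos/core#altLabel"/>
  </rdf:Property>

</rdf:RDF>
```

With declarations like these, an RDFS-aware application could treat every
spellingVariantLabel as an altLabel for retrieval, while a browsing app
could filter on the more specific property to hide those labels.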

However, there is some information that is not captured by this.  For
example, if I have the concept:

<rdf:RDF>

  <skos:Concept rdf:about="someURI">
	<skos:prefLabel>Acquired Immunodeficiency Syndrome</skos:prefLabel>
	<ex:acronymLabel>AIDS</ex:acronymLabel>	
	<skos:altLabel>Human Immunodeficiency Virus</skos:altLabel>
	<ex:acronymLabel>HIV</ex:acronymLabel>
  </skos:Concept>

</rdf:RDF>

... I *cannot* tell from this that 'HIV' is an acronym for 'Human
Immunodeficiency Virus' and that 'AIDS' is an acronym for 'Acquired
Immunodeficiency Syndrome'.

The proposition is to capture this specific (lexical?) relationship, for
example (just brainstorming here) ...

<rdf:RDF>

  <ex:Acronym>
    <ex:normalForm>Acquired Immunodeficiency Syndrome</ex:normalForm>
    <ex:acronymForm>AIDS</ex:acronymForm>
  </ex:Acronym>  

</rdf:RDF>

or ...

<rdf:RDF>

  <ex:SpellingVariant>
    <ex:normalForm>hemorrhaged</ex:normalForm>
    <ex:variantForm>haemorrhaged</ex:variantForm>
  </ex:SpellingVariant>

</rdf:RDF>

... don't know if I got that the right way round, but it makes the point.
N.B. also that these relationships can (??) be asserted independently of a
connection to any concepts.

Hope that moves things along.

Al.


---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email:        a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440



> -----Original Message-----
> From: Stella Dextre Clarke [mailto:sdclarke@lukehouse.demon.co.uk]
> Sent: 07 October 2004 13:08
> To: 'Miles, AJ (Alistair) '; public-esw-thes@w3.org
> Subject: RE: lexical relationships
> 
> 
> Not sure why this is coming up now, as I thought you had prefLabel and
> altLabel pretty much worked out. And for me this pair is the same as
> USE/UF in a thesaurus, which we can also spell out as the equivalence
> relationship between terms (not concepts).
> 
> ISO 2788 points out that USE/UF can be used to cover several different
> situations. These include:
> - common versus scientific names (e.g. Rosa canina, dog rose)
> - common versus trade names (hoovers, vacuum cleaners)
> - full name versus abbreviation (VAT, value added tax)
> - synonyms with different linguistic origin (freedom, liberty)
> - spelling variants (ground water, groundwater; oedema, edema)
> - irregular singular/plurals (goose, geese)
> - and others (see ISO 2788)
> ISO 2788 does not prohibit you from treating these as subdivisions and
> giving them separate tags or designations. Some thesauri do this,
> especially in the case of abbreviations. It is good to give people the
> flexibility to do this, just as with subdividing BT/NT into BTG/NTG,
> BTI/NTI, etc.  But unless there is some good housekeeping reason, it
> seems to me like a lot of trouble for little reward.
> 
> And why treat lexical variants as different from all the rest? They can
> be very useful indeed for retrieval purposes. I don't think we should
> think of them as being "not semantic" because that implies that the
> others ARE semantic, and that is questionable. Even in the case of
> freedom versus liberty, remember that in the thesaurus these are
> considered to be two alternative labels for one and the same concept. If
> there is only one concept, how can there be a semantic difference?
> 
> A stronger case can be made for separating out any lexical "variants"
> that are actually misspellings, e.g. abatoirs. These can be useful in
> retrieval, but they look a bit offensive and the editor may wish to hide
> them from view.
> 
> So I'd say that subdividing this relationship has rather less urgency
> than subdividing BT/NT or RT/RT. But do include the lexical variants
> with the other synonyms.
> 
> Stella
> 
> 
> *****************************************************
> Stella Dextre Clarke
> Information Consultant
> Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> SDClarke@LukeHouse.demon.co.uk
> *****************************************************
> 
> 
> 
> -----Original Message-----
> From: public-esw-thes-request@w3.org
> [mailto:public-esw-thes-request@w3.org] On Behalf Of Miles, AJ
> (Alistair) 
> Sent: 06 October 2004 16:19
> To: 'public-esw-thes@w3.org'
> Subject: lexical relationships
> 
> 
> 
> This is just to gauge interest in doing this kind of thing in SKOS Core
> ...
> 
> Are we interested in being able to represent things like spelling
> variant relationships, abbreviation relationships, literal translation
> relationships, in SKOS Core?
> 
> I group these under the label 'lexical relationships' because they seem
> different from 'semantic relationships', although a lexical relationship
> may not be completely independent of the sense of the terms involved.
> 
> Also, this stuff seems pretty close to wordnet.
> 
> ???
> 
> Al.
> 
> ---
> Alistair Miles
> Research Associate
> CCLRC - Rutherford Appleton Laboratory
> Building R1 Room 1.60
> Fermi Avenue
> Chilton
> Didcot
> Oxfordshire OX11 0QX
> United Kingdom
> Email:        a.j.miles@rl.ac.uk
> Tel: +44 (0)1235 445440
> 
> 
Received on Thursday, 7 October 2004 12:38:40 UTC
