
RE: skos:prefLabel without language tag

From: Armando Stellato <stellato@info.uniroma2.it>
Date: Fri, 24 Jun 2011 12:43:17 +0200
To: "'Bernard Vatant'" <bernard.vatant@mondeca.com>, "'Antoine Isaac'" <aisaac@few.vu.nl>
Cc: <public-esw-thes@w3.org>
Message-ID: <04d401cc325b$88385e30$98a91a90$@uniroma2.it>
Hi all,


I agree with Bernard.


Going further: however restrictive it may seem (and however much panic it may
cause over backward compatibility with the huge amount of existing data, but
every revolution has its heads chopped off…), I think a revision of SKOS
should *really* recommend against (forbid?) prefLabels with no language tag.
One of the SKOS objectives was to give decent coverage of the linguistic
description of concept schemes (and of ontologies in general, since prefLabel
is now an AnnotationProperty [S10] and thus admits any resource in its
domain), so a prefLabel with no language tag makes no sense to me. One could
argue that plain literals with no language tag are useful for codes that
belong to no natural language, but there are better options for that (e.g.
skos:notation).

In my experience, I have always had to make do somehow with missing language
tags. Those values are usually still written in some language, so you have to
know it in advance, or guess it. The result is a lot of patches in every piece
of software ever written for natural-language querying over ontologies, to
account for the language assumed for untagged literals: collapsing the index
for untagged literals into the index for literals tagged with that same
language, and so on.

This dirty work is unavoidable when dealing with rdfs:label, but a highly
specified (and specific) property such as prefLabel could surely live better
without untagged plain literals.
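
The index collapsing described above can be sketched roughly as follows. This
is only an illustration: the function name, the (concept, text, lang) tuple
layout and the assumed default language "en" are hypothetical, not taken from
any real tool.

```python
from collections import defaultdict

# Assumption for illustration: untagged labels in this dataset are English.
ASSUMED_LANG = "en"

def build_label_index(labels, assumed_lang=ASSUMED_LANG):
    """Index (concept, text, lang) triples by language, then by lowercased text.

    lang is None (or "") for a plain literal with no language tag; such labels
    are filed under assumed_lang so that one lookup covers both cases.
    """
    index = defaultdict(lambda: defaultdict(set))
    for concept, text, lang in labels:
        key = (lang or assumed_lang).lower()
        index[key][text.lower()].add(concept)
    return index

labels = [
    ("ex:dog", "Dog", None),   # no language tag
    ("ex:dog", "dog", "en"),
    ("ex:dog", "cane", "it"),
]
index = build_label_index(labels)
```

The guess about the language lives in a single parameter here, which is
exactly the kind of patch that a mandatory language tag would make
unnecessary.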




From: public-esw-thes-request@w3.org [mailto:public-esw-thes-request@w3.org]
On Behalf Of Bernard Vatant
Sent: Friday, June 24, 2011 11:12 AM
To: Antoine Isaac
Cc: public-esw-thes@w3.org
Subject: Re: skos:prefLabel without language tag


Hello all

Thinking further about it, beyond the formal issue we have the question of
the expected behaviour of applications when they meet labels without a
language tag.

In multilingual environments, the language tag is typically used to present
the concept to end users in their "user language". The uniqueness of the
prefLabel in the user language avoids clashes in the interface. Note that
some systems (e.g., Eurovoc and other OPOCE vocabularies) even require that
all concepts have a prefLabel in all supported user languages (e.g., EU
official languages), including default value rules (such as: take the English
label if no label is available in Slovenian or Swedish).
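
A default value rule of this kind might look like the sketch below. The
function name and the dictionary layout are hypothetical, and the actual
Eurovoc rules may well differ.

```python
def pick_label(pref_labels, user_lang, fallbacks=("en",)):
    """Return the prefLabel to display, or None if nothing matches.

    pref_labels maps a language tag to the single prefLabel in that language.
    The user's language is tried first, then each fallback in order.
    """
    for lang in (user_lang, *fallbacks):
        if lang in pref_labels:
            return pref_labels[lang]
    return None

labels = {"en": "dog", "fr": "chien"}
shown_fr = pick_label(labels, "fr")   # label exists in the user language
shown_sl = pick_label(labels, "sl")   # no Slovenian label, English is used
```

The lookup relies on there being at most one prefLabel per language tag, which
is why a dictionary keyed by tag is enough.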

In our (Mondeca ITM) system, a label (aka "name") also has a mandatory and
unique language tag, but one possible value is "no language". The system
displays such names whatever language the user has chosen. Of course, if one
wants the displayed name to be unique, this implies that if there is a
"no language" name, there is no (other) name tagged with a language.

Translated into SKOS, this rule would read:

If a Concept has a prefLabel value with no language tag, it cannot have a
different prefLabel value with a language tag.

In other words, the following is not conformant:
ex:foo  skos:prefLabel 'A' ; skos:prefLabel 'B'@en .

The following is conformant but somewhat redundant:
ex:foo  skos:prefLabel 'A' ; skos:prefLabel 'A'@en .
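
As a rough sketch, the proposed rule could be checked like this. The function
name and the (text, lang) pair representation are hypothetical, and lang is
None for a label with no language tag.

```python
def violates_no_language_rule(pref_labels):
    """Check the proposed rule for one concept.

    pref_labels is an iterable of (text, lang) pairs, with lang None when the
    label has no language tag. Returns True when an untagged prefLabel
    coexists with a tagged prefLabel whose text differs.
    """
    pref_labels = list(pref_labels)
    untagged = {text for text, lang in pref_labels if lang is None}
    tagged = {text for text, lang in pref_labels if lang is not None}
    return any(t != u for u in untagged for t in tagged)

not_conformant = violates_no_language_rule([("A", None), ("B", "en")])
redundant_but_ok = violates_no_language_rule([("A", None), ("A", "en")])
```

The check only covers the rule as worded above. It says nothing about two
different untagged labels on the same concept.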


2011/6/23 Antoine Isaac <aisaac@few.vu.nl>

On 6/23/11 8:40 PM, Alan Ruttenberg wrote:

On Thu, Jun 23, 2011 at 1:52 PM, Houghton,Andrew<houghtoa@oclc.org>  wrote:

Given these two situations:


<skos:prefLabel xml:lang="">Dog</skos:prefLabel>

Does the inclusion of *both* prefLabel in a SKOS concept result in breaking
the rule S14 that no two prefLabel should have the same lexical value for
the same language tag?

My read is that S14 is not applicable. In both cases the lexical value
is the same - a plain literal without language tag. The RDFXML doesn't
state that the language tag is "". It is syntax for the absence of a
language tag. These two are different in the value space - without a
language tag it is a string, with a language tag it is a pair of
strings. The set of plain literals without language tags is *not* the
set of pairs (string , "").

Since the rule as stated applies to literals *with* language tags
(they can't be the same unless they are there), S14 would not seem to
be applicable.

That said, this looks like a hole in the spec. It was probably the
intention to also include the case that no two prefLabel without
language tag have the same lexical value.
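
The two readings can be compared with a small sketch. It models a label as a
(text, lang) pair with lang None when there is no tag, and takes S14 to mean
"no more than one prefLabel per language tag". The function name and the
include_untagged switch are hypothetical.

```python
from collections import Counter

def duplicated_tags(pref_labels, include_untagged):
    """Return the set of tags that carry more than one prefLabel.

    pref_labels is a list of (text, lang) pairs; lang is None when there is
    no language tag. Untagged labels are counted as a group of their own only
    when include_untagged is True (the probable intent of the rule).
    """
    counts = Counter(
        lang for _text, lang in pref_labels
        if lang is not None or include_untagged
    )
    return {lang for lang, n in counts.items() if n > 1}

labels = [("Dog", None), ("Hound", None), ("dog", "en")]
strict_reading = duplicated_tags(labels, include_untagged=False)
intended_reading = duplicated_tags(labels, include_untagged=True)

# "No tag" and "empty tag" are distinct values in this model.
no_tag_differs_from_empty_tag = ("Dog", None) != ("Dog", "")
```

Under the strict reading the two untagged labels go unnoticed, and under the
intended reading they are reported.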



Yes, it certainly was.

I have to admit I don't know if there is a hole. It may seem reasonable that
there exist some syntactic matching between literals having an empty tag and
literals having no tag, as Simon reports.

I think section 6.12 of the RDF syntax spec does result in the language
defaulting to at least "" in production 7.2.16; there doesn't seem to be
another literal production that passes the language feature. I must admit
that I am not certain how general this assumption is. There are other specs
that seem to distinguish between <s> and <s,l>, but I think only <s> \equiv
<s,""> is consistent?



However, this may be specific to one syntax.
The RDF abstract syntax and other specs do not mention this sort of
thing. In particular, the way the identity conditions are spelled out at [1,2]
seems to argue against amalgamating absence of tag with presence of any tag
(including an empty one).

Anyway, it could be that the simplest thing to do is to publish an erratum
to clarify the original intent, rather than go into a discussion that is
difficult, and would perhaps just be against a moving target, as RDF is
currently being worked on... I'll forward the issue.




Bernard Vatant
Senior Consultant
Vocabulary & Data Integration
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com
3, cité Nollez 75018 Paris France
Web:    http://www.mondeca.com
Blog:    http://mondeca.wordpress.com
Received on Friday, 24 June 2011 10:44:08 UTC
