
Re: skos:prefLabel without language tag

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Mon, 27 Jun 2011 10:29:57 -0400
Message-ID: <BANLkTimgjmWw_g5EwCajt03VNaO9Zar26g@mail.gmail.com>
To: Jon Phipps <jphipps@madcreek.com>
Cc: Antoine Isaac <aisaac@few.vu.nl>, public-esw-thes@w3.org
On Mon, Jun 27, 2011 at 8:04 AM, Jon Phipps <jphipps@madcreek.com> wrote:

> Hi Antoine,
> +1, I think, sortof, maybe. :-)
> It depends a bit on what you're saying.
> If we take the Open World assumption of the RDF data model into
> consideration, then it would seem reasonable to state to a _reasoner_ that a
> skos:prefLabel _must_ have a language tag, particularly given the intent of
> [S14], even if that language tag is currently unknown. Using Bernard's
> excellent example, this would imply to me at least that the 'conformance' of
> the following can't be determined without more information:
> ex:foo  skos:prefLabel 'A'; prefLabel 'B'@en
> And that the following isn't redundant, but rather supplies that
> information:
> ex:foo  skos:prefLabel 'A'; prefLabel 'A'@en

Unfortunately this isn't the case. There is no syntax for partially
specifying these data values. So the model theory has these as two labels.
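
To make the "two labels" point concrete, here is a minimal sketch in Python. It is not taken from any RDF library: the `Literal` class and the `labels_per_language` helper are hypothetical names used only for illustration, and they assume a plain literal is either a bare string or a (string, language tag) pair.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF plain literal."""
    lexical: str
    lang: Optional[str] = None  # None means "no language tag at all"


def labels_per_language(labels):
    """Count distinct label values per language tag (None = untagged)."""
    return Counter(lit.lang for lit in set(labels))


untagged = Literal("A")
tagged = Literal("A", "en")

# The tag is part of the value, so these are two different literals
# and the concept ends up with two prefLabel values.
distinct = {untagged, tagged}
counts = labels_per_language([untagged, tagged])
```

Under these assumptions `distinct` holds two values and `counts` has one entry for the untagged label and one for "en": the untagged 'A' is not a partially specified 'A'@en.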

> I think this is a somewhat separate issue from the one that you raised with
> the RDF folks:
> _If_ the specification _requires_ a language tag in order to determine
> conformance with [S14], does this:
>  ex:foo  skos:prefLabel 'A'
> infer this:
> ex:foo  skos:prefLabel 'A'@""

As I pointed out, the latter isn't valid, as the language tag needs to be
one specified in BCP47.

<foo xml:lang="">bar</foo> does not mean the value of the foo element is
'bar'@"". It means the value of the foo element is 'bar' (without a language tag).

Realize that parsing the syntax does not produce what you might think is
the obvious translation. Perhaps recognizing that the parsetype attribute
performs a non-obvious transformation will help emphasize that care should
be taken to understand the difference between what you see in a particular
concrete syntax and what is read into the model.
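
The xml:lang="" case can be sketched with Python's standard-library XML parser. This is only an illustration of the mapping described above, not a real RDF/XML parser: `element_to_literal` is a hypothetical helper, it handles a single element, and it ignores any xml:lang inherited from ancestor elements.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def element_to_literal(xml_text):
    """Return (lexical form, language tag or None) for one element."""
    elem = ET.fromstring(xml_text)
    lang = elem.get(XML_LANG)
    # An empty xml:lang is read as "no language tag", not as a tag
    # whose value is the empty string.
    return (elem.text or "", lang if lang else None)


empty_tag = element_to_literal('<foo xml:lang="">bar</foo>')
no_attr = element_to_literal('<foo>bar</foo>')
english = element_to_literal('<foo xml:lang="en">bar</foo>')
```

With this mapping `empty_tag` and `no_attr` are the same value, while `english` is a different one. The two spellings differ in the concrete syntax only, not in what is read into the model.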


> If that is the case, that would transform this:
>  ex:foo  skos:prefLabel 'A'; prefLabel 'B'@en
> into this:
> ex:foo  skos:prefLabel 'A'@""; prefLabel 'B'@en
> which is conformant with [S14], and this:
>  ex:foo  skos:prefLabel 'A'@""; prefLabel 'A'@en
> which is not conformant, _unless_ you consider that
>  ex:foo  skos:prefLabel 'A'@en
> is a higher-value, more refined replacement for
>  ex:foo  skos:prefLabel 'A'@""
> Bernard's refinement of the rule would seem to be an application-specific
> case, even though I think that rule of interpreting an empty language tag to
> mean 'all' or 'any' language rather than 'no language' is highly useful best
> practice. His rule has value in determining which labels to display or which
> concepts to return from a search, but this is slightly different than
> discussing conformance to [S14].
> I hope you get an answer from the rdf-wg, but I agree with you that what
> constitutes 'acceptable' data, especially when aggregating data from
> disparate systems should be broadly defined even if that is somewhat
> different than what defines 'conformance'. Postel's robustness principle, "be
> liberal in what you accept; be conservative in what you send", provides
> useful guidance.
> Jon
>  On Fri, Jun 24, 2011 at 7:07 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>> Hi Armando, Bernard,
>> SKOS indeed encourages the use of language-tagged labels. This is why
>> almost all examples in the doc have language tags, and probably the reason
>> for which we now have to make S14 clearer--cf. our other discussion now.
>> But we also have to remain simple, and compatible with a wide range of
>> data. For many vocabularies, publishing language info is technically
>> difficult, or even impossible. This is especially the case for vocabularies
>> that have been aggregating labels originating from different languages, but
>> with data structures that do not allow tracking language provenance (or
>> make it difficult).
>> Cheers,
>> Antoine
>>> Hi all,
>>> agree with Bernard.
>>> Even more, for how much it can seem restrictive (and possibly causing
>>> huge panic for retrocompatibility with huge amount of existing data, but
>>> every revolution has its heads chopped off…), I would think of a revision of
>>> SKOS as **really** suggesting not to use (forbidding?) prefLabels with no
>>> language tag. One of the SKOS objectives was to give a decent coverage of
>>> the linguistic descriptions of concept schemes (and ontologies in general,
>>> as prefLabel is now an AnnotationProperty [S10] thus admitting any resource
>>> in its domain), and thus a prefLabel with no language tag makes no sense to
>>> me. One could say that plainLiterals could be used with no langtag to
>>> address specific codes related to no natural language, but there are better
>>> options for that (i.e. skos:notation).
>>> In my experience, I’ve always had to make-do somehow with missing lang
>>> tags, because usually those values still are explained in some language, so
>>> you have to know it in advance, or guess it…so, lot of patches to any
>>> software ever written for natural language querying over ontologies, to
>>> account for the language assumed to be used for no-langtagged-literals.
>>> Collapsing indexes for no-lang-tags with lang-tags of the same language etc…
>>> This is dirty work that has to be done when dealing with rdfs:label, but a
>>> highly specified (and specific) property such as prefLabel could surely
>>> live better without “no-lang-tagged” plainLiterals.
>>> Armando
>>> *From:* public-esw-thes-request@w3.org [mailto:public-esw-thes-request@w3.org]
>>> *On Behalf Of* Bernard Vatant
>>> *Sent:* Friday, June 24, 2011 11:12 AM
>>> *To:* Antoine Isaac
>>> *Cc:* public-esw-thes@w3.org
>>> *Subject:* Re: skos:prefLabel without language tag
>>> Hello all
>>> Thinking further about it, beyond the formal issue we have the question
>>> of the expected behaviour of applications when meeting labels w/o language
>>> tags.
>>> In multilingual environments, the language tag is typically used to
>>> present the concept to end users in their "user language". The unicity of
>>> the prefLabel in the user language avoids clashes in the interface. Note
>>> that some systems (e.g., Eurovoc and other OPOCE vocabularies) even require
>>> that all concepts have a prefLabel in all supported user languages (e.g., EU
>>> official languages), including default value rules (such as take the English
>>> label if no label is available in Slovenian or Swedish).
>>> In our (Mondeca ITM) system, a label (aka "name") has also a mandatory
>>> and unique language tag, but one possible value is "no language". The
>>> behaviour of the system regarding this tag is that such names are displayed
>>> whatever the user language choice. Of course if one wants unicity of the
>>> displayed name, it implies that if there is a "no language" name, there is
>>> no (other) name tagged with a language.
>>> Translated in SKOS, this rule would look like :
>>> *If a Concept has a prefLabel value with no language tag, it cannot have
>>> a different prefLabel value with a language tag.*
>>> IOW the following is not conformant
>>> ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>>> The following is conformant but somehow redundant
>>> ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
>>> Bernard
>>> 2011/6/23 Antoine Isaac <aisaac@few.vu.nl>
>>> On 6/23/11 8:40 PM, Alan Ruttenberg wrote:
>>> On Thu, Jun 23, 2011 at 1:52 PM, Houghton,Andrew <houghtoa@oclc.org> wrote:
>>> Given these two situations:
>>> <skos:prefLabel>Dog</skos:prefLabel>
>>> <skos:prefLabel xml:lang="">Dog</skos:prefLabel>
>>> Does the inclusion of *both* prefLabel in a SKOS concept result in
>>> breaking
>>> the rule S14 that no two prefLabel should have the same lexical value for
>>> the same language tag?
>>> My read is that S14 is not applicable. In both cases the lexical value
>>> is the same - a plain literal without language tag. The RDFXML doesn't
>>> state that the language tag is "". It is syntax for the absence of a
>>> language tag. These two are different in the value space - without a
>>> language tag it is a string, with a language tag it is a pair of
>>> strings. The set of plain literals without language tags is *not* the
>>> set of pairs (string , "").
>>> Since the rule as stated applies to literals *with* language tags
>>> (they can't be the same unless they are there), S14 would not seem to
>>> be applicable.
>>> That said, this looks like a hole in the spec. It was probably the
>>> intention to also include the case that no two prefLabel without
>>> language tag have the same lexical value.
>>> -Alan
>>> Yes, it certainly was.
>>> I have to admit I don't know if there is a hole. It may seem reasonable
>>> that there exist some syntactic matching between literals having an empty
>>> tag and literals having no tag, as Simon reports.
>>> I think section 6.12 of the rdf syntax spec does result in the defaulting
>>> of language to at least "" in production 7.2.16- there doesn't seem to be
>>> another literal production that passes the language feature. I must admit
>>> that I am not certain how general this assumption is- there are other specs
>>> that seem to distinguish between <s> and <s,l>, but I think only <s> \equiv
>>> <s,""> is consistent?
>>> Simon
>>> However, this may be specific to one syntax.
>>> The RDF abstract syntax and other specs do not mention that sort of
>>> thing. In particular, the way the identity conditions are spelled out at
>>> [1,2] seems to argue against amalgamating absence of tag with presence of
>>> any tag (including an empty one).
>>> Anyway, it could be that the simplest thing to do is to publish an
>>> erratum to clarify the original intent, rather than go into a discussion
>>> that is difficult, and would perhaps just be against a moving target, as RDF
>>> is currently being worked on... I'll forward the issue.
>>> Cheers,
>>> Antoine
>>> [1] http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>>> [2] http://www.w3.org/TR/rdf-plain-literal/#The_Comparison_of_rdf:PlainLiteral_Data_Values
>>> --
>>> Bernard Vatant
>>> Senior Consultant
>>> Vocabulary & Data Integration
>>> Tel: +33 (0) 971 488 459
>>> Mail: bernard.vatant@mondeca.com
>>> ----------------------------------------------------
>>> Mondeca
>>> 3, cité Nollez 75018 Paris France
>>> Web: http://www.mondeca.com
>>> Blog: http://mondeca.wordpress.com
>>> ----------------------------------------------------
> --
> Jon
> I check email just a couple of times daily; to reach me sooner, click here:
> http://awayfind.com/jonphipps
Received on Monday, 27 June 2011 14:31:14 UTC
