Re: skos:prefLabel without language tag

From: Jon Phipps <jphipps@madcreek.com>
Date: Mon, 27 Jun 2011 07:04:30 -0500
Message-ID: <BANLkTi=MONn59FgdvdWVyq=Pg-8K9whqUQ@mail.gmail.com>
To: Antoine Isaac <aisaac@few.vu.nl>
Cc: public-esw-thes@w3.org
Hi Antoine,

+1, I think, sort of, maybe. :-)
It depends a bit on what you're saying.

If we take the Open World assumption of the RDF data model into
consideration, then it would seem reasonable to state to a _reasoner_ that a
skos:prefLabel _must_ have a language tag, particularly given the intent of
[S14], even if that language tag is currently unknown. Using Bernard's
excellent example, this would imply to me at least that the 'conformance' of
the following can't be determined without more information:
ex:foo  skos:prefLabel 'A'; prefLabel 'B'@en

And that the following isn't redundant, but rather supplies that
information:
ex:foo  skos:prefLabel 'A'; prefLabel 'A'@en

I think this is a somewhat separate issue from the one that you raised with
the RDF folks:
_If_ the specification _requires_ a language tag in order to determine
conformance with [S14], does this:
ex:foo  skos:prefLabel 'A'
infer this:
ex:foo  skos:prefLabel 'A'@""

If that is the case, that would transform this:
ex:foo  skos:prefLabel 'A'; prefLabel 'B'@en
into this:
ex:foo  skos:prefLabel 'A'@""; prefLabel 'B'@en
which is conformant with [S14], and this:
 ex:foo  skos:prefLabel 'A'@""; prefLabel 'A'@en
which is not conformant, _unless_ you consider that
 ex:foo  skos:prefLabel 'A'@en
is a higher-value, more refined replacement for
 ex:foo  skos:prefLabel 'A'@""

Bernard's refinement of the rule would seem to be an application-specific
case, even though I think that his rule of interpreting an empty language tag
to mean 'all' or 'any' language rather than 'no language' is a highly useful
best practice. His rule has value in determining which labels to display or
which concepts to return from a search, but this is slightly different from
discussing conformance to [S14].
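
To make the distinction concrete, here is a minimal Python sketch of
Bernard's rule (quoted further down in this thread) and of the "no tag means
any language" display behaviour. It is illustrative only: the (text, lang)
tuple model, the function names, and the choice to prefer a tagged label over
an untagged one are assumptions, not anything taken from the SKOS
specification or from Mondeca ITM.

```python
from typing import Optional

# Assumed model: a label is (lexical form, language tag); None means "no tag".
Label = tuple[str, Optional[str]]


def violates_bernards_rule(pref_labels: list[Label]) -> bool:
    """True if an untagged prefLabel coexists with a *different* tagged value."""
    untagged = {text for text, lang in pref_labels if lang is None}
    tagged = {text for text, lang in pref_labels if lang is not None}
    # Only a concept that has at least one untagged label can violate the rule.
    return bool(untagged) and any(text not in untagged for text in tagged)


def pick_display_label(pref_labels: list[Label], user_lang: str) -> Optional[str]:
    """Pick a label to show: a match on user_lang first, else an untagged one."""
    for text, lang in pref_labels:
        if lang is not None and lang.lower() == user_lang.lower():
            return text
    # Untagged labels are treated as valid for any user language.
    for text, lang in pref_labels:
        if lang is None:
            return text
    return None


if __name__ == "__main__":
    # ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
    print(violates_bernards_rule([("A", None), ("B", "en")]))
    # ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
    print(violates_bernards_rule([("A", None), ("A", "en")]))
    # An untagged label is shown whatever the user language.
    print(pick_display_label([("A", None)], "sv"))
```

Both functions are display or application-level checks; neither one decides
conformance to [S14], which is the point of the paragraph above.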

I hope you get an answer from the rdf-wg, but I agree with you that what
constitutes 'acceptable' data, especially when aggregating data from
disparate systems, should be broadly defined, even if that is somewhat
different from what defines 'conformance'. Postel's robustness principle, "be
liberal in what you accept; be conservative in what you send", provides
useful guidance.

Jon

On Fri, Jun 24, 2011 at 7:07 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi Armando, Bernard,
>
> SKOS indeed encourages the use of language-tagged labels. This is why
> almost all examples in the doc have language tags, and probably the reason
> for which we now have to make S14 clearer--cf. our other discussion now.
>
> But we also have to remain simple, and compatible with a wide range of
> data. For many vocabularies, publishing language info is technically
> difficult, or even impossible. This is especially the case for vocabularies
> that have been aggregating labels originating from different languages, but
> with data structures that do not allow tracking language provenance (or
> make it difficult).
>
> Cheers,
>
> Antoine
>
>
>> Hi all,
>>
>> agree with Bernard.
>>
>> Even more, for how much it can seem restrictive (and possibly causing huge
>> panic for retrocompatibility with huge amount of existing data, but every
>> revolution has its heads chopped off…), I would think of a revision of SKOS
>> as **really** suggesting not to use (forbidding?) prefLabels with no
>> language tag. One of the SKOS objectives was to give a decent coverage of
>> the linguistic descriptions of concept schemes (and ontologies in general,
>> as prefLabel is now an AnnotationProperty [S10] thus admitting any resource
>> in its domain), and thus a prefLabel with no language tag makes no sense to
>> me. One could say that plainLiterals could be used with no langtag to
>> address specific codes related to no natural language, but there are better
>> options for that (i.e. skos:notation).
>>
>> In my experience, I’ve always had to make-do somehow with missing lang
>> tags, because usually those values still are explained in some language, so
>> you have to know it in advance, or guess it…so, lot of patches to any
>> software ever written for natural language querying over ontologies, to
>> account for the language assumed to be used for no-langtagged-literals.
>> Collapsing indexes for no-lang-tags with lang-tags of the same language etc…
>>
>> This is dirty work that has to be done when dealing with rdfs:label, but a
>> highly specified (and specific) property such as prefLabel could surely
>> live better without “no-lang-tagged” plainLiterals.
>>
>> Armando
>>
>> *From:* public-esw-thes-request@w3.org
>> [mailto:public-esw-thes-request@w3.org] *On Behalf Of* Bernard Vatant
>> *Sent:* Friday, June 24, 2011 11:12 AM
>> *To:* Antoine Isaac
>> *Cc:* public-esw-thes@w3.org
>> *Subject:* Re: skos:prefLabel without language tag
>>
>> Hello all
>>
>> Thinking further about it, beyond the formal issue we have the question of
>> the expected behaviour of applications when meeting labels w/o language
>> tags.
>>
>> In multilingual environments, the language tag is typically used to
>> present the concept to end users in their "user language". The unicity of
>> the prefLabel in the user language avoids clashes in the interface. Note
>> that some systems (e.g., Eurovoc and other OPOCE vocabularies) even require
>> that all concepts have a prefLabel in all supported user languages (e.g., EU
>> official languages), including default value rules (such as take the English
>> label if no label is available in Slovenian or Swedish).
>>
>> In our (Mondeca ITM) system, a label (aka "name") has also a mandatory and
>> unique language tag, but one possible value is "no language". The behaviour
>> of the system regarding this tag is that such names are displayed whatever
>> the user language choice. Of course if one wants unicity of the displayed
>> name, it implies that if there is a "no language" name, there is no (other)
>> name tagged with a language.
>>
>> Translated in SKOS, this rule would look like :
>>
>> *If a Concept has a prefLabel value with no language tag, it cannot have a
>> different prefLabel value with a language tag.*
>>
>> IOW the following is not conformant
>> ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>>
>> The following is conformant but somehow redundant
>> ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
>>
>> Bernard
>>
>> 2011/6/23 Antoine Isaac <aisaac@few.vu.nl>
>>
>>
>> On 6/23/11 8:40 PM, Alan Ruttenberg wrote:
>>
>> On Thu, Jun 23, 2011 at 1:52 PM, Houghton,Andrew <houghtoa@oclc.org> wrote:
>>
>> Given these two situations:
>>
>>
>>
>> <skos:prefLabel>Dog</skos:prefLabel>
>>
>> <skos:prefLabel xml:lang="">Dog</skos:prefLabel>
>>
>> Does the inclusion of *both* prefLabel in a SKOS concept result in
>> breaking
>> the rule S14 that no two prefLabel should have the same lexical value for
>> the same language tag?
>>
>>
>> My read is that S14 is not applicable. In both cases the lexical value
>> is the same - a plain literal without language tag. The RDFXML doesn't
>> state that the language tag is "". It is syntax for the absence of a
>> language tag. These two are different in the value space - without a
>> language tag it is a string, with a language tag it is a pair of
>> strings. The set of plain literals without language tags is *not* the
>> set of pairs (string , "").
>>
>> Since the rule as stated applies to literals *with* language tags
>> (they can't be the same unless they are there), S14 would not seem to
>> be applicable.
>>
>> That said, this looks like a hole in the spec. It was probably the
>> intention to also include the case that no two prefLabel without
>> language tag have the same lexical value.
>>
>> -Alan
>>
>> Yes, it certainly was.
>>
>> I have to admit I don't know if there is a hole. It may seem reasonable
>> that there exist some syntactic matching between literals having an empty
>> tag and literals having no tag, as Simon reports.
>>
>>
>>
>> I think section 6.12 of the rdf syntax spec does result in the defaulting
>> of language to at least "" in production 7.2.16- there doesn't seem to be
>> another literal production that passes the language feature. I must admit
>> that I am not certain how general this assumption is- there are other specs
>> that seem to distinguish between <s> and <s,l>, but I think only <s> \equiv
>> <s,""> is consistent?
>>
>> Simon
>>
>> However, this may be specific to one syntax.
>> The RDF abstract syntax and other specs do not mention that sort of
>> thing. In particular, the way the identity conditions are spelled out at
>> [1,2] seems to argue against amalgamating absence of tag with presence of
>> any tag (including an empty one).
>>
>> Anyway, it could be that the simplest thing to do is to publish an erratum
>> to clarify the original intent, rather than go into a discussion that is
>> difficult, and would perhaps just be against a moving target, as RDF is
>> currently being worked on... I'll forward the issue.
>>
>> Cheers,
>>
>> Antoine
>>
>> [1] http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>> [2] http://www.w3.org/TR/rdf-plain-literal/#The_Comparison_of_rdf:PlainLiteral_Data_Values
>>
>>
>>
>>
>> --
>> Bernard Vatant
>> Senior Consultant
>> Vocabulary & Data Integration
>> Tel: +33 (0) 971 488 459
>> Mail: bernard.vatant@mondeca.com
>>
>> ----------------------------------------------------
>> Mondeca
>> 3, cité Nollez 75018 Paris France
>> Web: http://www.mondeca.com
>> Blog: http://mondeca.wordpress.com
>> ----------------------------------------------------
>>
>>
>
>


-- 
Jon

I check email just a couple of times daily; to reach me sooner, click here:
http://awayfind.com/jonphipps
Received on Monday, 27 June 2011 12:05:20 GMT
