# Re: skos:prefLabel without language tag

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Mon, 27 Jun 2011 18:12:57 +0200
Message-ID: <4E08AC09.9000608@few.vu.nl>

Hi,

Yes, the issue seems to be really tricky, between what happens at the syntax level, and what happens at the model level. Personally I'll wait till I receive some feedback from the RDF group--my email is still blocked btw.

Now, the issue of whether we should recommend or require language tags, as Jon, Bernard and Armando suggest (in different fashions), remains. Again my take is to go for a softer, best-practice kind of encouragement: in particular, sending messages related to the notion of "conformance" seems quite strong [1].
If instead the available SKOS reference tooling (just thinking of PoolParty's or Mondeca's tools) included a warning when tag-less labels are found, this would help send an appropriate message, I think! Hmm, maybe you've already implemented it... And maybe there's something in the Metadata Registry that entices the user to assign a language tag as well?
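To make the idea of a soft warning concrete, here is a minimal sketch of what such a check could look like. It is not taken from PoolParty, Mondeca or the Metadata Registry: the tuple layout and the function name `untagged_label_warnings` are assumptions made up for this illustration.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical in-memory shape (an assumption for this sketch):
# (concept IRI, lexical form, language tag or None when the tag is absent).
PrefLabel = Tuple[str, str, Optional[str]]


def untagged_label_warnings(labels: Iterable[PrefLabel]) -> List[str]:
    """Return one soft warning per skos:prefLabel that has no language tag."""
    warnings = []
    for concept, lexical, lang in labels:
        if lang is None:
            warnings.append(f"{concept}: prefLabel {lexical!r} has no language tag")
    return warnings


labels = [
    ("ex:foo", "A", None),   # tag-less label: reported, but not rejected
    ("ex:foo", "B", "en"),   # tagged label: nothing to report
]
for message in untagged_label_warnings(labels):
    print("warning:", message)
```

The data is still accepted as-is; the check only reports, which matches the "encourage rather than require" position.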

One may think of adding some sentences to the specification document, making soft recommendations. But changing published W3C recs is nearly impossible, and errata are usually used only for "bugs" in the specs, like the potential one for S14. Plus, the docs are already making some effort in that direction: the SKOS Primer has only one example of a tag-less literal, and it's in a very specific context (notations)...

If we really want to centralize best practice, then anyone is free to use the SKOS wiki and create a dedicated page, gathering all these softer "warnings" a reasoner could issue when ingesting SKOS data. If it gets decent consensus (and stability), we could even easily port it to http://www.w3.org/2004/02/skos/, under a "best practices" heading: that site is not a formal W3C recommendation document.

Cheers,

Antoine

[1] http://www.w3.org/TR/2009/REC-skos-reference-20090818/#L434

> On Mon, Jun 27, 2011 at 8:04 AM, Jon Phipps <jphipps@madcreek.com> wrote:
>
>     Hi Antoine,
>
>     +1, I think, sort of, maybe. :-)
>     It depends a bit on what you're saying.
>
>     If we take the Open World assumption of the RDF data model into consideration, then it would seem reasonable to state to a _reasoner_ that a skos:prefLabel _must_ have a language tag, particularly given the intent of [S14], even if that language tag is currently unknown. Using Bernard's excellent example, this would imply to me at least that the 'conformance' of the following can't be determined without more information:
>     ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>
>     And that the following isn't redundant, but rather supplies that information:
>     ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
>
>
> Unfortunately this isn't the case. There is no syntax for partially specifying these data values. So the model theory has these as two labels.
>
>
>     I think this is a somewhat separate issue from the one that you raised with the RDF folks:
>     _If_ the specification _requires_ a language tag in order to determine conformance with [S14], does this:
>     ex:foo skos:prefLabel 'A'
>     infer this:
>     ex:foo skos:prefLabel 'A'@""
>
>
> As I pointed out, the latter isn't valid, as the language tag needs to be one specified in BCP47.
>
> <foo xml:lang="">bar</foo> does not mean the value of the foo element is 'bar'@"". It means the value of the foo element is 'bar' (without language tag).
>
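The point about `xml:lang=""` can be illustrated with a small sketch using Python's standard `xml.etree.ElementTree`. This is not an RDF/XML parser: the helper name `literal_from_element` is hypothetical, and the sketch ignores the fact that real RDF/XML inherits `xml:lang` from enclosing elements. It only shows that an empty `xml:lang` is read as "no language tag" rather than as a tag whose value is the empty string.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def literal_from_element(xml_text: str) -> Tuple[str, Optional[str]]:
    """Map one element to (lexical form, language tag or None).

    Simplifying assumption: only the element's own xml:lang is consulted.
    """
    elem = ET.fromstring(xml_text)
    lang = elem.get(XML_LANG)
    # An absent attribute and an empty attribute both mean "no language tag".
    return (elem.text or "", lang if lang else None)


print(literal_from_element('<foo xml:lang="">bar</foo>'))    # ('bar', None)
print(literal_from_element('<foo>bar</foo>'))                # ('bar', None)
print(literal_from_element('<foo xml:lang="en">bar</foo>'))  # ('bar', 'en')
```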
> Realize that parsing the syntax does not produce what you might think is the obvious translation. Perhaps recognizing that the parseType attribute performs a non-obvious transformation will help emphasize that care should be taken in understanding the difference between what you see in a particular concrete syntax versus what is read into the model.
>
> -Alan
>
>
>     If that is the case, that would transform this:
>     ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>     into this:
>     ex:foo skos:prefLabel 'A'@""; prefLabel 'B'@en
>     which is conformant with [S14], and this:
>     ex:foo skos:prefLabel 'A'@""; prefLabel 'A'@en
>     which is not conformant, _unless_ you consider that
>     ex:foo skos:prefLabel 'A'@en
>     is a higher-value, more refined replacement for
>     ex:foo skos:prefLabel 'A'@""
>
>     Bernard's refinement of the rule would seem to be an application-specific case, even though I think that rule of interpreting an empty language tag to mean 'all' or 'any' language rather than 'no language' is highly useful best practice. His rule has value in determining which labels to display or which concepts to return from a search, but this is slightly different than discussing conformance to [S14].
>
>     I hope you get an answer from the rdf-wg, but I agree with you that what constitutes 'acceptable' data, especially when aggregating data from disparate systems, should be broadly defined even if that is somewhat different from what defines 'conformance'. Postel's robustness principle, "be liberal in what you accept; be conservative in what you send", provides useful guidance.
>
>     Jon
>
>     On Fri, Jun 24, 2011 at 7:07 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>
>         Hi Armando, Bernard,
>
>         SKOS indeed encourages the use of language-tagged labels. This is why almost all examples in the doc have language tags, and probably the reason for which we now have to make S14 clearer--cf. our other discussion now.
>
>         But we also have to remain simple, and compatible with a wide range of data. For many vocabularies, publishing language info is technically difficult, or even impossible. This is especially the case for vocabularies that have been aggregating labels originating from different languages, but with data structures that do not allow tracking language provenance (or make it difficult).
>
>         Cheers,
>
>         Antoine
>
>
>             Hi all,
>
>             agree with Bernard.
>
>             Even more, however restrictive it may seem (and possibly causing huge panic over backward compatibility with the huge amount of existing data, but every revolution has its heads chopped off…), I would think of a revision of SKOS as **really** suggesting not to use (forbidding?) prefLabels with no language tag. One of the SKOS objectives was to give decent coverage of the linguistic descriptions of concept schemes (and ontologies in general, as prefLabel is now an AnnotationProperty [S10], thus admitting any resource in its domain), and thus a prefLabel with no language tag makes no sense to me. One could say that plain literals with no language tag could be used for specific codes that belong to no natural language, but there are better options for that (i.e. skos:notation).
>
>             In my experience, I’ve always had to make do somehow with missing language tags, because those values are usually still expressed in some language, so you have to know it in advance, or guess it… so, lots of patches to any software ever written for natural language querying over ontologies, to account for the language assumed for literals without a language tag. Collapsing indexes for untagged literals with those for tagged literals of the same language, etc…
>
>             This is dirty work that has to be done when dealing with rdfs:label, but a highly specified (and specific) property such as prefLabel could surely live better without “no-lang-tagged” plain literals.
>
>             Armando
>
>             *From:* public-esw-thes-request@w3.org [mailto:public-esw-thes-request@w3.org] *On Behalf Of* Bernard Vatant
>             *Sent:* Friday, June 24, 2011 11:12 AM
>             *To:* Antoine Isaac
>             *Cc:* public-esw-thes@w3.org
>             *Subject:* Re: skos:prefLabel without language tag
>
>             Hello all
>
>             Thinking further about it, beyond the formal issue we have the question of the expected behaviour of applications when meeting labels w/o language tags.
>
>             In multilingual environments, the language tag is typically used to present the concept to end users in their "user language". The uniqueness of the prefLabel in the user language avoids clashes in the interface. Note that some systems (e.g., Eurovoc and other OPOCE vocabularies) even require that all concepts have a prefLabel in all supported user languages (e.g., EU official languages), including default value rules (such as taking the English label if no label is available in Slovenian or Swedish).
>
>             In our (Mondeca ITM) system, a label (aka "name") also has a mandatory and unique language tag, but one possible value is "no language". The behaviour of the system regarding this tag is that such names are displayed whatever the user language choice. Of course if one wants uniqueness of the displayed name, it implies that if there is a "no language" name, there is no (other) name tagged with a language.
>
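The display behaviour described above can be sketched roughly as follows. This is not Mondeca ITM or Eurovoc code: the function name `pick_display_label`, the dictionary layout (language tag, or None for "no language", mapped to the label) and the lookup order are all assumptions chosen for illustration.

```python
from typing import Dict, Optional


def pick_display_label(
    labels: Dict[Optional[str], str],
    user_lang: str,
    fallback_lang: str = "en",
) -> Optional[str]:
    """Choose the label to show for one concept.

    Assumed order: the user's language, then a "no language" label
    (key None), then a default language such as English.
    """
    for key in (user_lang, None, fallback_lang):
        if key in labels:
            return labels[key]
    return None


print(pick_display_label({"en": "Dog", "fr": "Chien"}, "sl"))  # falls back to English
print(pick_display_label({None: "ISO 639"}, "sv"))             # "no language" label shown to everyone
```

Under Bernard's uniqueness constraint a concept would never carry both a "no language" name and a tagged one, so the relative order of the first two lookups only matters for data that breaks that constraint.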
>             Translated in SKOS, this rule would look like:
>
>             *If a Concept has a prefLabel value with no language tag, it cannot have a different prefLabel value with a language tag.*
>
>             IOW the following is not conformant
>             ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>
>             The following is conformant but somehow redundant
>             ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
>
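Bernard's proposed rule is easy to state as a check. The sketch below is only an illustration under assumptions: prefLabels are given as hypothetical (concept, lexical form, language tag or None) tuples, and `concepts_breaking_rule` is a made-up name, not part of any SKOS tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical shape: (concept IRI, lexical form, language tag or None).
PrefLabel = Tuple[str, str, Optional[str]]


def concepts_breaking_rule(labels: Iterable[PrefLabel]) -> List[str]:
    """List concepts that have an untagged prefLabel together with a
    *different* lexical value carrying a language tag."""
    untagged: Dict[str, Set[str]] = defaultdict(set)
    tagged: Dict[str, Set[str]] = defaultdict(set)
    for concept, lexical, lang in labels:
        (untagged if lang is None else tagged)[concept].add(lexical)
    return sorted(
        concept
        for concept, plain in untagged.items()
        # Tagged values that repeat an untagged value are allowed (redundant).
        if tagged.get(concept, set()) - plain
    )


print(concepts_breaking_rule([("ex:foo", "A", None), ("ex:foo", "B", "en")]))  # ['ex:foo']
print(concepts_breaking_rule([("ex:foo", "A", None), ("ex:foo", "A", "en")]))  # []
```

The two calls mirror the two examples above: 'A' with 'B'@en is flagged, 'A' with 'A'@en is accepted as redundant.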
>             Bernard
>
>             2011/6/23 Antoine Isaac <aisaac@few.vu.nl>
>
>
>             On 6/23/11 8:40 PM, Alan Ruttenberg wrote:
>
>             On Thu, Jun 23, 2011 at 1:52 PM, Houghton, Andrew <houghtoa@oclc.org> wrote:
>
>             Given these two situations:
>
>
>
>             <skos:prefLabel>Dog</skos:prefLabel>
>
>             <skos:prefLabel xml:lang="">Dog</skos:prefLabel>
>
>             Does the inclusion of *both* prefLabel in a SKOS concept result in breaking
>             the rule S14 that no two prefLabel should have the same lexical value for
>             the same language tag?
>
>
>             My read is that S14 is not applicable. In both cases the lexical value
>             is the same - a plain literal without language tag. The RDFXML doesn't
>             state that the language tag is "". It is syntax for the absence of a
>             language tag. These two are different in the value space - without a
>             language tag it is a string, with a language tag it is a pair of
>             strings. The set of plain literals without language tags is *not* the
>             set of pairs (string , "").
>
>             Since the rule as stated applies to literals *with* language tags
>             (they can't be the same unless they are there), S14 would not seem to
>             be applicable.
>
>             That said, this looks like a hole in the spec. It was probably the
>             intention to also include the case that no two prefLabel without
>             language tag have the same lexical value.
>
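The two readings of S14 discussed here can be compared with a small sketch. It assumes a hypothetical in-memory representation in which each prefLabel is a (concept, lexical form, language tag) tuple and a missing tag is None, so that 'Dog' and 'Dog'@en remain distinct values. S14 is taken as "no more than one prefLabel per language tag"; the function name and the `include_untagged` flag are inventions for this example.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# Hypothetical shape: (concept IRI, lexical form, language tag or None).
PrefLabel = Tuple[str, str, Optional[str]]
Conflict = Tuple[str, Optional[str]]


def s14_conflicts(
    labels: Iterable[PrefLabel], include_untagged: bool = False
) -> List[Conflict]:
    """Return (concept, language tag) pairs that have more than one prefLabel.

    include_untagged=False: literal reading, untagged labels are outside S14.
    include_untagged=True: presumed intent, untagged labels form one group.
    """
    counts: Counter = Counter()
    # set() collapses repeated statements of the very same literal.
    for concept, _lexical, lang in set(labels):
        if lang is None and not include_untagged:
            continue
        counts[(concept, lang)] += 1
    return sorted(
        (key for key, n in counts.items() if n > 1),
        key=lambda key: (key[0], key[1] or ""),
    )


# Andrew's case: both elements yield the same untagged literal, so no conflict.
same = [("ex:dog", "Dog", None), ("ex:dog", "Dog", None)]
print(s14_conflicts(same, include_untagged=True))   # []

# Two different untagged labels: only caught under the "intent" reading.
two = [("ex:dog", "Dog", None), ("ex:dog", "Hound", None)]
print(s14_conflicts(two))                            # []
print(s14_conflicts(two, include_untagged=True))     # [('ex:dog', None)]
```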
>             -Alan
>
>             Yes, it certainly was.
>
>             I have to admit I don't know if there is a hole. It may seem reasonable that there exist some syntactic matching between literals having an empty tag and literals having no tag, as Simon reports.
>
>
>
>             I think section 6.12 of the rdf syntax spec does result in the defaulting of language to at least "" in production 7.2.16- there doesn't seem to be another literal production that passes the language feature. I must admit that I am not certain how general this assumption is- there are other specs that seem to distinguish between <s> and <s,l>, but I think only <s> \equiv <s,""> is consistent?
>
>             Simon
>
>             However, this may be specific to one syntax.
>             The RDF abstract syntax and other specs do not mention that sort of thing. In particular, the way the identity conditions are spelled out at [1,2] seems to argue against amalgamating absence of tag with presence of any tag (including an empty one).
>
>             Anyway, it could be that the simplest thing to do is to publish an erratum to clarify the original intent, rather than go into a discussion that is difficult, and would perhaps just be against a moving target, as RDF is currently being worked on... I'll forward the issue.
>
>             Cheers,
>
>             Antoine
>
>             [1] http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>             [2] http://www.w3.org/TR/rdf-plain-literal/#The_Comparison_of_rdf:PlainLiteral_Data_Values
>
>
>
>
>             --
>             Bernard Vatant
>             Senior Consultant
>             Vocabulary & Data Integration
>             Tel: +33 (0) 971 488 459
>             Mail: bernard.vatant@mondeca.com
>
>             ----------------------------------------------------
>             Mondeca
>             3, cité Nollez 75018 Paris France
>             Web: http://www.mondeca.com
>             Blog: http://mondeca.wordpress.com
>             ----------------------------------------------------
>
>
>
>
>
>
>     --
>     Jon
>
>     I check email just a couple of times daily; to reach me sooner, click here: http://awayfind.com/jonphipps
>
>

Received on Monday, 27 June 2011 16:09:53 GMT
