- From: Jon Phipps <jphipps@madcreek.com>
- Date: Tue, 28 Jun 2011 06:58:20 -0500
- To: Antoine Isaac <aisaac@few.vu.nl>
- Cc: public-esw-thes@w3.org
- Message-ID: <BANLkTimTLz23Vn0abp0ANbLcrNW-2ZhaJw@mail.gmail.com>
Hi Antoine,

The Metadata Registry requires a language attribute on labels. What we haven't decided is what we should do when aggregating data from other 'systems' that don't take a preemptive approach to conformance.

I recognize the difference between:
  ex:foo skos:prefLabel 'color';
  ex:foo skos:prefLabel 'color'@en;
  ex:foo skos:prefLabel 'color'@en-US;
...but would strongly prefer that a system recognize that these represent successive refinements of a single label for the purpose of testing conformance, although I wonder how often such conformance tests are likely to be useful when evaluating data in the aggregate.

OTOH, we do have to wonder what we would return if a REST request asked for a representation of a concept in language 'en' from an aggregated store. But I think that's another discussion entirely.

I think a dedicated page on the SKOS wiki might be very useful in the light of this discussion, perhaps once there's some response from rdf-wg?

Jon

On Mon, Jun 27, 2011 at 11:12 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
> Hi,
>
> Yes, the issue seems to be really tricky, between what happens at the syntax level, and what happens at the model level. Personally I'll wait till I receive some feedback from the RDF group--my email is still blocked btw.
>
> Now, the issue of whether we should recommend or require language tags, as Jon, Bernard and Armando suggest (in different fashions), remains. Again my take is to go for softer, best practice-kind of encouragement: in particular, sending messages related to the notion of "conformance" seems quite strong [1].
> If instead available SKOS reference tooling (just thinking of PoolParty's or Mondeca's tools) included a warning when tag-less labels are found, this would help send an appropriate message, I think! Hmm, maybe you've already implemented it... And maybe there's something in the Metadata Registry that entices the user to assign a language tag as well?
>
> One may think of adding some sentences to the specification document, making soft recommendations. But changing published W3C recs is nearly impossible, and errata are usually used only for "bugs" in the specs, like the potential one for S14. Plus, the docs are already making some effort in that direction: the SKOS Primer has only one example of a tag-less literal, and it's in a very specific context (notations)...
>
> If we really want to centralize best practice, then anyone is free to use the SKOS wiki and create a dedicated page, gathering all these softer "warnings" a reasoner could issue when ingesting SKOS data. If it gets decent consensus (and stability), we could even easily port it to http://www.w3.org/2004/02/skos/, under a "best practices" heading: that site is not a formal W3C recommendation document.
>
> Cheers,
>
> Antoine
>
> [1] http://www.w3.org/TR/2009/REC-skos-reference-20090818/#L434
>
>> On Mon, Jun 27, 2011 at 8:04 AM, Jon Phipps <jphipps@madcreek.com> wrote:
>>
>> Hi Antoine,
>>
>> +1, I think, sort of, maybe. :-)
>> It depends a bit on what you're saying.
>>
>> If we take the Open World assumption of the RDF data model into consideration, then it would seem reasonable to state to a _reasoner_ that a skos:prefLabel _must_ have a language tag, particularly given the intent of [S14], even if that language tag is currently unknown. Using Bernard's excellent example, this would imply, to me at least, that the 'conformance' of the following can't be determined without more information:
>>   ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>>
>> And that the following isn't redundant, but rather supplies that information:
>>   ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
>>
>> Unfortunately this isn't the case. There is no syntax for partially specifying these data values.
>> So the model theory has these as two labels.
>>
>> I think this is a somewhat separate issue from the one that you raised with the RDF folks:
>> _If_ the specification _requires_ a language tag in order to determine conformance with [S14], does this:
>>   ex:foo skos:prefLabel 'A'
>> imply this:
>>   ex:foo skos:prefLabel 'A'@""
>>
>> As I pointed out, the latter isn't valid, as the language tag needs to be one specified in BCP47.
>>
>> <foo xml:lang="">bar</foo> does not mean the value of the foo element is 'bar'@"". It means the value of the foo element is 'bar' (without language tag).
>>
>> Realize that the parsing of the syntax does not translate into what you think would be the obvious translation. Perhaps recognizing that the parsetype attribute does a non-obvious transformation will help emphasize that care should be taken in understanding the difference between what you see in a particular concrete syntax and what is read into the model.
>>
>> -Alan
>>
>> If that is the case, that would transform this:
>>   ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>> into this:
>>   ex:foo skos:prefLabel 'A'@""; prefLabel 'B'@en
>> which is conformant with [S14], and this:
>>   ex:foo skos:prefLabel 'A'@""; prefLabel 'A'@en
>> which is not conformant, _unless_ you consider that
>>   ex:foo skos:prefLabel 'A'@en
>> is a higher-value, more refined replacement for
>>   ex:foo skos:prefLabel 'A'@""
>>
>> Bernard's refinement of the rule would seem to be an application-specific case, even though I think that the rule of interpreting an empty language tag to mean 'all' or 'any' language rather than 'no language' is a highly useful best practice. His rule has value in determining which labels to display or which concepts to return from a search, but this is slightly different from discussing conformance to [S14].
>>
>> I hope you get an answer from the rdf-wg, but I agree with you that what constitutes 'acceptable' data, especially when aggregating data from disparate systems, should be broadly defined even if that is somewhat different from what defines 'conformance'. Postel's robustness principle, "be liberal in what you accept; be conservative in what you send", provides useful guidance.
>>
>> Jon
>>
>> On Fri, Jun 24, 2011 at 7:07 AM, Antoine Isaac <aisaac@few.vu.nl> wrote:
>>
>> Hi Armando, Bernard,
>>
>> SKOS indeed encourages the use of language-tagged labels. This is why almost all examples in the doc have language tags, and probably the reason for which we now have to make S14 clearer--cf. our other discussion now.
>>
>> But we also have to remain simple, and compatible with a wide range of data. For many vocabularies, publishing language info is technically difficult, or even impossible. This is especially the case for vocabularies that have been aggregating labels originating from different languages, but with data structures that do not allow tracking language provenance (or make it difficult).
>>
>> Cheers,
>>
>> Antoine
>>
>> Hi all,
>>
>> agree with Bernard.
>>
>> Even more, however restrictive it may seem (and possibly causing huge panic over backward compatibility with the huge amount of existing data, but every revolution has its heads chopped off…), I would think of a revision of SKOS as **really** suggesting not to use (forbidding?) prefLabels with no language tag. One of the SKOS objectives was to give a decent coverage of the linguistic descriptions of concept schemes (and ontologies in general, as prefLabel is now an AnnotationProperty [S10] thus admitting any resource in its domain), and thus a prefLabel with no language tag makes no sense to me.
>> One could say that plainLiterals could be used with no langtag to address specific codes related to no natural language, but there are better options for that (i.e. skos:notation).
>>
>> In my experience, I’ve always had to make do somehow with missing lang tags, because usually those values are still expressed in some language, so you have to know it in advance, or guess it… so, a lot of patches to any software ever written for natural language querying over ontologies, to account for the language assumed to be used for no-langtagged literals. Collapsing indexes for no-lang-tags with lang-tags of the same language etc…
>>
>> This is dirty work that has to be done when dealing with rdfs:label, but a highly specified (and specific) property such as prefLabel could surely live better without “no-lang-tagged” plainLiterals.
>>
>> Armando
>>
>> *From:* public-esw-thes-request@w3.org [mailto:public-esw-thes-request@w3.org] *On Behalf Of* Bernard Vatant
>> *Sent:* Friday, June 24, 2011 11:12 AM
>> *To:* Antoine Isaac
>> *Cc:* public-esw-thes@w3.org
>> *Subject:* Re: skos:prefLabel without language tag
>>
>> Hello all
>>
>> Thinking further about it, beyond the formal issue we have the question of the expected behaviour of applications when meeting labels w/o language tags.
>>
>> In multilingual environments, the language tag is typically used to present the concept to end users in their "user language". The uniqueness of the prefLabel in the user language avoids clashes in the interface.
>> Note that some systems (e.g., Eurovoc and other OPOCE vocabularies) even require that all concepts have a prefLabel in all supported user languages (e.g., EU official languages), including default value rules (such as take the English label if no label is available in Slovenian or Swedish).
>>
>> In our (Mondeca ITM) system, a label (aka "name") also has a mandatory and unique language tag, but one possible value is "no language". The behaviour of the system regarding this tag is that such names are displayed whatever the user language choice. Of course if one wants uniqueness of the displayed name, it implies that if there is a "no language" name, there is no (other) name tagged with a language.
>>
>> Translated into SKOS, this rule would look like:
>>
>> *If a Concept has a prefLabel value with no language tag, it cannot have a different prefLabel value with a language tag.*
>>
>> IOW the following is not conformant
>>   ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
>>
>> The following is conformant but somehow redundant
>>   ex:foo skos:prefLabel 'A'; prefLabel 'A'@en
>>
>> Bernard
>>
>> 2011/6/23 Antoine Isaac <aisaac@few.vu.nl>
>>
>> On 6/23/11 8:40 PM, Alan Ruttenberg wrote:
>>
>> On Thu, Jun 23, 2011 at 1:52 PM, Houghton,Andrew <houghtoa@oclc.org> wrote:
>>
>> Given these two situations:
>>
>>   <skos:prefLabel>Dog</skos:prefLabel>
>>
>>   <skos:prefLabel xml:lang="">Dog</skos:prefLabel>
>>
>> Does the inclusion of *both* prefLabel in a SKOS concept result in breaking the rule S14 that no two prefLabel should have the same lexical value for the same language tag?
>>
>> My read is that S14 is not applicable. In both cases the lexical value is the same - a plain literal without language tag.
>> The RDF/XML doesn't state that the language tag is "". It is syntax for the absence of a language tag. These two are different in the value space - without a language tag it is a string, with a language tag it is a pair of strings. The set of plain literals without language tags is *not* the set of pairs (string, "").
>>
>> Since the rule as stated applies to literals *with* language tags (they can't be the same unless they are there), S14 would not seem to be applicable.
>>
>> That said, this looks like a hole in the spec. It was probably the intention to also include the case that no two prefLabel without language tag have the same lexical value.
>>
>> -Alan
>>
>> Yes, it certainly was.
>>
>> I have to admit I don't know if there is a hole. It may seem reasonable that there exists some syntactic matching between literals having an empty tag and literals having no tag, as Simon reports.
>>
>> I think section 6.12 of the rdf syntax spec does result in the defaulting of language to at least "" in production 7.2.16 - there doesn't seem to be another literal production that passes the language feature. I must admit that I am not certain how general this assumption is - there are other specs that seem to distinguish between <s> and <s,l>, but I think only <s> \equiv <s,""> is consistent?
>>
>> Simon
>>
>> However, this may be specific to one syntax.
>> The RDF abstract syntax and other specs do not mention that sort of thing. In particular, the way the identity conditions are spelled out at [1,2] seems to argue against amalgamating absence of tag with presence of any tag (including an empty one).
>>
>> Anyway, it could be that the simplest thing to do is to publish an erratum to clarify the original intent, rather than go into a discussion that is difficult, and would perhaps just be against a moving target, as RDF is currently being worked on... I'll forward the issue.
>>
>> Cheers,
>>
>> Antoine
>>
>> [1] http://www.w3.org/TR/rdf-concepts/#section-Literal-Equality
>> [2] http://www.w3.org/TR/rdf-plain-literal/#The_Comparison_of_rdf:PlainLiteral_Data_Values
>>
>> --
>> Bernard Vatant
>> Senior Consultant
>> Vocabulary & Data Integration
>> Tel: +33 (0) 971 488 459
>> Mail: bernard.vatant@mondeca.com
>> ----------------------------------------------------
>> Mondeca
>> 3, cité Nollez 75018 Paris France
>> Web: http://www.mondeca.com
>> Blog: http://mondeca.wordpress.com
>> ----------------------------------------------------
>>
>> --
>> Jon
>>
>> I check email just a couple of times daily; to reach me sooner, click here: http://awayfind.com/jonphipps

--
Jon
I check email just a couple of times daily; to reach me sooner, click here: http://awayfind.com/jonphipps
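
For anyone who wants to experiment with the two readings discussed in this thread, the following is a minimal Python sketch, not a reference implementation. It models a label as a plain tuple rather than parsing RDF, and the names `Label`, `s14_violations` and `bernard_violations` are hypothetical, introduced only for illustration. The first check follows Alan's strict reading, in which [S14] only concerns literals that carry a language tag, so an untagged literal is never compared with a tagged one. The second check is Bernard's proposed rule, which is an application-level convention rather than part of SKOS.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Label(NamedTuple):
    """One skos:prefLabel statement (hypothetical helper type)."""
    concept: str
    text: str
    lang: Optional[str] = None  # None = plain literal with no language tag


def s14_violations(labels: Iterable[Label]) -> List[Tuple[str, str]]:
    """(concept, tag) pairs with more than one prefLabel for that tag.

    Assumption: untagged literals are outside the rule (strict reading),
    and tags are compared case-insensitively.
    """
    by_lang = defaultdict(set)
    for label in labels:
        if label.lang is None:
            continue  # absence of a tag is not the same as an empty tag
        by_lang[(label.concept, label.lang.lower())].add(label.text)
    return sorted(key for key, texts in by_lang.items() if len(texts) > 1)


def bernard_violations(labels: Iterable[Label]) -> List[str]:
    """Concepts with an untagged prefLabel and a *different* tagged one."""
    untagged = defaultdict(set)
    tagged = defaultdict(set)
    for label in labels:
        target = untagged if label.lang is None else tagged
        target[label.concept].add(label.text)
    return sorted(
        concept
        for concept, plain in untagged.items()
        if any(t != u for u in plain for t in tagged.get(concept, ()))
    )


if __name__ == "__main__":
    # ex:foo skos:prefLabel 'A'; prefLabel 'B'@en
    mixed = [Label("ex:foo", "A"), Label("ex:foo", "B", "en")]
    print(s14_violations(mixed), bernard_violations(mixed))
```

With the two examples from the thread, 'A' plus 'B'@en passes the strict [S14] check but fails Bernard's rule, while 'A' plus 'A'@en passes both. A softer variant could return these findings as warnings instead of treating them as conformance failures.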
Received on Tuesday, 28 June 2011 12:04:32 UTC