- From: Antoine Isaac <aisaac@few.vu.nl>
- Date: Wed, 04 Mar 2009 17:59:40 +0000
- To: "Phillips, Addison" <addison@amazon.com>
- CC: Alistair Miles <alistair.miles@zoo.ox.ac.uk>, "Ralph R. Swick" <swick@w3.org>, Richard Ishida <ishida@w3.org>, "public-swd-wg@w3.org" <public-swd-wg@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, 'Felix Sasaki' <fsasaki@w3.org>
Hi Addison,

Thanks for the explanation, which makes a bit clearer what I had understood from [1]:
"Matching different language tags is important for a number of applications. According to BCP 47 'en' can be said to match 'en-GB'."

If I understand well, there are applications that could do this filtering, and if they use data which was not intended for filtering (that is, data including language tag variation, because their original context of application was concerned with that), then there could be trouble.

But maybe this is not so much trouble in fact: that kind of matching does not amount to producing new RDF data (in your example, a new triple ex:walkingPath skos:prefLabel "sidewalk"@en .), does it? If the data stays the same, and if as you say it is technically valid, then there is no possible inconsistency with what the SKOS model specifies.

Best,

Antoine

[1] http://www.w3.org/International/articles/language-tags/

> Hello Alistair,
>
> Thanks for the note back.
>
> I'm aware of the SPARQL function: I helped the WG craft the text about it. The query function might turn out to be a problem and I may not have given the right feedback in my last email. Let me explain.
>
> My concern is that, if you have a triple like:
>
> ex:walkingPath rdf:type skos:Concept;
>     skos:prefLabel "sidewalk"@en-US;
>     skos:prefLabel "pavement"@en
>
> ... then SKOS rightly asserts that "en" and "en-US" are different languages exclusive of one another. This implies that one must include a separate prefLabel for every possible language tag variation one wishes to support. This is not generally the intention when applying language tags.
>
> So my example doesn't say whether the label for "en" covers a user who speaks "en-GB" or "en-AU" or "en-NZ" (for example). Those are all different languages not specified.
> Typically, a request for the label from the SKOS description of an ontology will contain the user's fully qualified language preference--that is, they are specifying the MOST information that they care to provide about their language. The matching scheme in RFC 4647 for that is called "lookup" and it falls back (a request for "en-GB" in my example would find "pavement", labeled as "en"). That is, a SKOS file contains what we I18N folks would call a "resource bundle" or "message catalog".
>
> In any case, SKOS is technically correct, but I think my advice would be to add some note clarifying that a natural language label defined in SKOS should be considered to apply to any request not masked by some other label. It is possible but very difficult to construct using SPARQL langMatches, whose purpose is actually different.
>
> So I guess I'd request notes in the Reference and Primer clarifying that, although (for example) "en" and "en-US" are considered to be different, one may consider a shorter language tag that is a "prefix" (by language tag standards) to match a longer "language range" in a request. That is, you don't need to supply "en-AU" if it is not different from "en".
>
> Regards,
>
> Addison
>
> Addison Phillips
> Globalization Architect -- Lab126
>
> Internationalization is not a feature.
> It is an architecture.
>
>
>> -----Original Message-----
>> From: Alistair Miles [mailto:alistair.miles@zoo.ox.ac.uk]
>> Sent: Wednesday, March 04, 2009 4:27 AM
>> To: Phillips, Addison
>> Cc: Ralph R. Swick; Antoine Isaac; Richard Ishida; public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
>> Subject: Re: Request for feedback on SKOS Last Call Working Draft
>>
>> Dear Addison,
>>
>> Thanks for this. Just to make sure I'm completely clear, are you suggesting we add a note to the SKOS Reference and/or SKOS Primer regarding the basic filtering scheme defined in RFC4647? What exactly would you suggest we say about it?
>>
>> I note that the SPARQL query language defines a function langMatches [1] which is supposed to implement the RFC4647 filtering scheme.
>>
>> Kind regards,
>>
>> Alistair
>>
>> [1] http://www.w3.org/TR/rdf-sparql-query/#func-langMatches
>>
>> On Tue, Mar 03, 2009 at 08:25:50AM -0800, Phillips, Addison wrote:
>>> Hmm... I hadn't been paying attention to this thread, until just now. The following exchange about language tags disturbs me somewhat. One of the parts of IETF BCP 47 (the language tagging RFCs) describes language tag matching (RFC 4647). Unsurprisingly, there is more than one form of matching. For the sort you are describing below, the typical matching scheme is called "filtering" and the value supplied as the "range" (that is, in the triple) matches tags that are equal-to-or-longer-than the supplied value. That is, "en-GB" (en-UK is invalid) does not match "en" and neither does "en-US".
>>>
>>> Section 5.6.5 in the SKOS last call document is not wrong; it just doesn't recognize one of the language tag matching schemes as described in BCP 47. Each different language tag is taken to be a different token. The problem that this might entail is that language tags are not always predictable. There exists a range of variation in a user's choice of subtags that one might wish to match without having prior knowledge of the full range of variation in the tags present in a document.
>>>
>>> My suggestion would be to reference filtering in RFC 4647 as at least a permitted implementation choice. A triple like this:
>>>
>>> ex:color skos:prefLabel "colour"@en ;
>>>     skos:prefLabel "color"@en-US.
>>>
>>> ... would make all English tagged prefLabels spelled as "colour" save for US English tagged ones. Falling back from en-?? to en strikes me as a bad idea, by contrast, unless done explicitly by the user.
>>> Consider a more complex tag that conveys a lot of information: "zh-cmn-Hant-TW" (Chinese, Mandarin, traditional script, Taiwan). You don't really want it to match just any Chinese tag (or why use the big complicated one).
>>>
>>> Regards,
>>>
>>> Addison Phillips
>>> Globalization Architect -- Lab126
>>> Editor -- IETF BCP 47
>>>
>>> Internationalization is not a feature.
>>> It is an architecture.
>>>
>>>
>>>> -----Original Message-----
>>>> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of Ralph R. Swick
>>>> Sent: Tuesday, March 03, 2009 6:29 AM
>>>> To: Antoine Isaac
>>>> Cc: Alistair Miles; Richard Ishida; public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
>>>> Subject: Re: Request for feedback on SKOS Last Call Working Draft
>>>>
>>>> At 02:22 PM 2/26/2009 +0100, Antoine Isaac wrote:
>>>>> if an application does matching of en-UK and en-GB to en, then the following RDF triples:
>>>>>
>>>>> ex:color skos:prefLabel "color"@en-US ;
>>>>>     skos:prefLabel "colour"@en-GB.
>>>>>
>>>>> entail:
>>>>>
>>>>> ex:color skos:prefLabel "color"@en ;
>>>>>     skos:prefLabel "colour"@en.
>>>>
>>>> I believe you're making an application-specific choice here. Where in the SKOS data model (spec) is this entailment endorsed? I could imagine an application that may find it convenient to implement language searching by acting as if your example were endorsed but it doesn't feel appropriate to me in general to state such an entailment.
>>>>
>>>>> This is incompatible with the SKOS specifications for prefLabel [2].
>>>>
>>>> Which is one of the reasons it's an inappropriate entailment :)
>>>>
>>>>> [2] http://www.w3.org/2006/07/SWD/SKOS/reference/20081001/#L1567
>>
>> --
>> Alistair Miles
>> Senior Computing Officer
>> Image Bioinformatics Research Group
>> Department of Zoology
>> The Tinbergen Building
>> University of Oxford
>> South Parks Road
>> Oxford
>> OX1 3PS
>> United Kingdom
>> Web: http://purl.org/net/aliman
>> Email: alistair.miles@zoo.ox.ac.uk
>> Tel: +44 (0)1865 281993
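
The two RFC 4647 matching schemes contrasted in this thread, "filtering" and "lookup", can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not part of SKOS, SPARQL, or any library: the function names basic_filter and lookup and the labels dictionary are hypothetical, only basic (not extended) filtering is shown, and the lookup is simplified (no wildcard handling in the range).

```python
def basic_filter(language_range, tags):
    """Basic filtering (RFC 4647): keep every tag that equals the range
    or that starts with the range followed by a hyphen."""
    rng = language_range.lower()
    if rng == "*":
        # The wildcard range matches every tag.
        return list(tags)
    return [
        tag for tag in tags
        if tag.lower() == rng or tag.lower().startswith(rng + "-")
    ]


def lookup(language_range, labels, default=None):
    """Lookup (RFC 4647, simplified): truncate the requested range from
    the end, one subtag at a time, until some tag matches exactly.

    `labels` is a hypothetical mapping from language tag to label.
    """
    by_tag = {tag.lower(): label for tag, label in labels.items()}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_tag:
            return by_tag[candidate]
        subtags.pop()
        # A trailing single-character subtag (such as "x") is removed
        # together with the subtag that followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


# Labels from the ex:walkingPath example in this thread.
labels = {"en-US": "sidewalk", "en": "pavement"}

print(lookup("en-GB", labels))               # falls back to the "en" label
print(lookup("en-US", labels))               # exact match on "en-US"
print(basic_filter("en", list(labels)))      # both tags fall within range "en"
print(basic_filter("en-GB", list(labels)))   # neither tag falls within "en-GB"
```

With these labels, a lookup request for "en-GB" falls back to the label tagged "en", whereas filtering with the range "en-GB" selects neither label, which is the difference Addison describes above.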
Received on Wednesday, 4 March 2009 18:00:17 UTC