RE: Request for feedback on SKOS Last Call Working Draft

From: Phillips, Addison <addison@amazon.com>
Date: Tue, 3 Mar 2009 08:25:50 -0800
To: "Ralph R. Swick" <swick@w3.org>, Antoine Isaac <aisaac@few.vu.nl>
CC: Alistair Miles <alistair.miles@zoo.ox.ac.uk>, Richard Ishida <ishida@w3.org>, "public-swd-wg@w3.org" <public-swd-wg@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, "'Felix Sasaki'" <fsasaki@w3.org>
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA019E1619B8@EX-SEA5-D.ant.amazon.com>
Hmm... I hadn't been paying attention to this thread until just now. The following exchange about language tags disturbs me somewhat. One part of IETF BCP 47 (the language tagging RFCs) describes language tag matching (RFC 4647). Unsurprisingly, there is more than one form of matching. For the sort you are describing below, the typical matching scheme is called "filtering": the value supplied as the "range" (that is, in the triple) matches tags that are equal to or longer than the supplied value. That is, a range of "en-GB" ("en-UK" is invalid) does not match the tag "en", and neither does a range of "en-US".
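As a rough illustration of the filtering described above, here is a minimal Python sketch of basic filtering in the style of RFC 4647, section 3.3.1. The function name is made up for this example, and the sketch covers only basic ranges, not extended ones:

```python
def basic_filter_matches(language_range: str, tag: str) -> bool:
    """Return True if `tag` is matched by `language_range`.

    Sketch of basic filtering (RFC 4647, section 3.3.1): a range matches
    a tag that is equal to it, or that starts with it followed by "-".
    Comparison is case-insensitive; "*" matches any tag.
    """
    rng = language_range.lower()
    t = tag.lower()
    if rng == "*":
        return True
    # Requiring the "-" keeps "en" from matching an unrelated tag like "eng".
    return t == rng or t.startswith(rng + "-")
```

With this sketch, `basic_filter_matches("en", "en-GB")` is true, while `basic_filter_matches("en-GB", "en")` is false: the range must be equal to or shorter than the tag it matches.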

Section 5.6.5 in the SKOS last call document is not wrong; it just doesn't recognize one of the language tag matching schemes described in BCP 47. Each different language tag is taken to be a different token. The problem this might entail is that language tags are not always predictable. There exists a range of variation in a user's choice of subtags that one might wish to match without having prior knowledge of the full range of variation in the tags present in a document.

My suggestion would be to reference filtering in RFC 4647 as at least a permitted implementation choice. Triples like these:

ex:color skos:prefLabel "colour"@en ;
   skos:prefLabel "color"@en-US.

... would make all English-tagged prefLabels be spelled "colour", save for US-English-tagged ones. Falling back from en-?? to en strikes me as a bad idea, by contrast, unless done explicitly by the user. Consider a more complex tag that conveys a lot of information: "zh-cmn-Hant-TW" (Chinese, Mandarin, traditional script, Taiwan). You don't really want it to match just any Chinese tag (otherwise, why use the big complicated one?).
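One way to picture this suggestion is the Python sketch below. It treats each label's language tag in the triples as a filtering range and picks a label for a requested tag. The names `basic_filter_matches` and `pick_label` are hypothetical, and preferring the longest matching range is an assumption made here so that "en-US" wins over "en" for US English; neither SKOS nor RFC 4647 prescribes this selection rule.

```python
from typing import Dict, Optional


def basic_filter_matches(language_range: str, tag: str) -> bool:
    """Basic filtering sketch: the range equals the tag or is a "-"-bounded prefix."""
    rng = language_range.lower()
    t = tag.lower()
    return rng == "*" or t == rng or t.startswith(rng + "-")


def pick_label(labels: Dict[str, str], requested_tag: str) -> Optional[str]:
    """Pick the label whose tag, read as a range, matches `requested_tag`.

    `labels` maps the language tag used in each prefLabel triple to its text.
    When several ranges match, the longest (most specific) one is chosen;
    that tie-breaking rule is an assumption of this sketch.
    """
    matching = [rng for rng in labels if basic_filter_matches(rng, requested_tag)]
    if not matching:
        return None  # no implicit fallback from a longer tag to a shorter one
    return labels[max(matching, key=len)]


# The two prefLabels from the example triples above.
labels = {"en": "colour", "en-US": "color"}
```

Here `pick_label(labels, "en-GB")` gives "colour" and `pick_label(labels, "en-US")` gives "color". A label tagged "zh-cmn-Hant-TW" is not returned for a request of plain "zh", since the longer tag is never silently shortened.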


Addison Phillips
Globalization Architect -- Lab126
Editor -- IETF BCP 47

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of Ralph R. Swick
> Sent: Tuesday, March 03, 2009 6:29 AM
> To: Antoine Isaac
> Cc: Alistair Miles; Richard Ishida; public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
> Subject: Re: Request for feedback on SKOS Last Call Working Draft
> At 02:22 PM 2/26/2009 +0100, Antoine Isaac wrote:
> >if an application does matching of en-UK and en-GB to en, then the following RDF triples:
> >
> >ex:color skos:prefLabel "color"@en-US ;
> >   skos:prefLabel "colour"@en-GB.
> >
> >entail:
> >
> >ex:color skos:prefLabel "color"@en ;
> >   skos:prefLabel "colour"@en.
> I believe you're making an application-specific choice here.
> Where in the SKOS data model (spec) is this entailment
> endorsed?  I could imagine an application that may find it
> convenient to implement language searching by acting as
> if your example were endorsed but it doesn't feel appropriate
> to me in general to state such an entailment.
> >This is incompatible with the SKOS specifications for prefLabel [2].
> Which is one of the reasons it's an inappropriate entailment :)
> >[2] http://www.w3.org/2006/07/SWD/SKOS/reference/20081001/#L1567


Received on Tuesday, 3 March 2009 16:26:33 UTC