Re: Request for feedback on SKOS Last Call Working Draft

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Fri, 06 Mar 2009 13:55:55 +0100
Message-ID: <49B11D5B.4040003@few.vu.nl>
To: "Phillips, Addison" <addison@amazon.com>
CC: Alistair Miles <alistair.miles@zoo.ox.ac.uk>, "Ralph R. Swick" <swick@w3.org>, Richard Ishida <ishida@w3.org>, "public-swd-wg@w3.org" <public-swd-wg@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, 'Felix Sasaki' <fsasaki@w3.org>

Hi Addison,

It makes sense!

Antoine

> Hi Antoine,
> 
> Yes, as I said the SKOS model is technically correct, accurate, and complete. The issue is what users and implementations do with it. I think the main concern I have is that SKOS Reference makes quite clear that you can have multiple labels with related-but-not-identical language tags. It is just that, having gone out of its way to say that 'en' != 'en-US', it doesn't further clarify that the presence of an 'en' tag is allowed to imply a match with e.g. 'en-AU' or 'en-NZ', if the latter are not provided as distinct labels.
> 
> Does that make sense?
> 
> Addison
> 
> Addison Phillips
> Globalization Architect -- Lab126
> 
> Internationalization is not a feature.
> It is an architecture.
> 
> 
>> -----Original Message-----
>> From: Antoine Isaac [mailto:aisaac@few.vu.nl]
>> Sent: Wednesday, March 04, 2009 10:00 AM
>> To: Phillips, Addison
>> Cc: Alistair Miles; Ralph R. Swick; Richard Ishida;
>> public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
>> Subject: Re: Request for feedback on SKOS Last Call Working Draft
>>
>> Hi Addison,
>>
>> Thanks for the explanation, which makes a bit clearer what I had
>> understood from [1]:
>> "Matching different language tags is important for a number of
>> applications. According to BCP 47 'en' can be said to match
>> 'en-GB'."
>>
>> If I understand correctly, there are applications that could do this
>> filtering, and if they use data which was not intended for
>> filtering (that is, data including language tag variation, because
>> their original context of application was concerned with that),
>> then there could be trouble.
>>
>> But maybe this is not so much trouble in fact: that kind of
>> matching does not amount to producing new RDF data (in your example,
>> a new triple ex:walkingPath skos:prefLabel "sidewalk"@en. ), does
>> it?
>> If the data stays the same, and if as you say it is technically
>> valid, then there is no possible inconsistency with what the SKOS
>> model specifies.
>>
>> Best,
>>
>> Antoine
>>
>> [1] http://www.w3.org/International/articles/language-tags/
>>
>>
>>> Hello Alistair,
>>>
>>> Thanks for the note back.
>>>
>>> I'm aware of the SPARQL function: I helped the WG craft the text
>>> about it. The query function might turn out to be a problem and I
>>> may not have given the right feedback in my last email. Let me
>>> explain.
>>>
>>> My concern is that, if you have triples like:
>>>
>>> ex:walkingPath rdf:type skos:Concept;
>>>   skos:prefLabel "sidewalk"@en-US;
>>>   skos:prefLabel "pavement"@en .
>>>
>>> ... then SKOS rightly asserts that "en" and "en-US" are different
>>> languages exclusive of one another. This implies that one must
>>> include a separate prefLabel for every possible language tag
>>> variation one wishes to support. This is not generally the
>>> intention when applying language tags.
>>>
>>> So my example doesn't say whether the label for "en" covers a
>>> user who speaks "en-GB" or "en-AU" or "en-NZ" (for example). Those
>>> are all different languages not specified. Typically, a request for
>>> the label from the SKOS description of an ontology will contain the
>>> user's fully qualified language preference--that is, they are
>>> specifying the MOST information that they care to provide about
>>> their language. The matching scheme in RFC 4647 for that is called
>>> "lookup" and it falls back (a request for "en-GB" in my example
>>> would find "pavement", labeled as "en"). That is, a SKOS file
>>> contains what we I18N folks would call a "resource bundle" or
>>> "message catalog".
>>>
>>> In any case, SKOS is technically correct, but I think my advice
>>> would be to add some note clarifying that a natural language label
>>> defined in SKOS should be considered to apply to any request not
>>> masked by some other label. This fallback is possible but very
>>> difficult to construct using SPARQL langMatches, whose purpose is
>>> actually different.
>>>
>>> So I guess I'd request notes in the Reference and Primer
>>> clarifying that, although (for example) "en" and "en-US" are
>>> considered to be different, one may consider a shorter language tag
>>> that is a "prefix" (by language tag standards) to match a longer
>>> "language range" in a request. That is, you don't need to supply
>>> "en-AU" if it is not different from "en".
>>>
>>> Regards,
>>>
>>> Addison
>>>
>>> Addison Phillips
>>> Globalization Architect -- Lab126
>>>
>>> Internationalization is not a feature.
>>> It is an architecture.
>>>
>>>
>>>> -----Original Message-----
>>>> From: Alistair Miles [mailto:alistair.miles@zoo.ox.ac.uk]
>>>> Sent: Wednesday, March 04, 2009 4:27 AM
>>>> To: Phillips, Addison
>>>> Cc: Ralph R. Swick; Antoine Isaac; Richard Ishida;
>>>> public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
>>>> Subject: Re: Request for feedback on SKOS Last Call Working Draft
>>>>
>>>> Dear Addison,
>>>>
>>>> Thanks for this. Just to make sure I'm completely clear, are you
>>>> suggesting we add a note to the SKOS Reference and/or SKOS Primer
>>>> regarding the basic filtering scheme defined in RFC4647? What
>>>> exactly would you suggest we say about it?
>>>>
>>>> I note that the SPARQL query language defines a function
>>>> langMatches [1] which is supposed to implement the RFC4647
>>>> filtering scheme.
>>>>
>>>> Kind regards,
>>>>
>>>> Alistair
>>>>
>>>> [1] http://www.w3.org/TR/rdf-sparql-query/#func-langMatches
>>>>
>>>> On Tue, Mar 03, 2009 at 08:25:50AM -0800, Phillips, Addison wrote:
>>>>> Hmm... I hadn't been paying attention to this thread, until just
>>>>> now. The following exchange about language tags disturbs me
>>>>> somewhat. One of the parts of IETF BCP 47 (the language tagging
>>>>> RFCs) describes language tag matching (RFC 4647). Unsurprisingly,
>>>>> there is more than one form of matching. For the sort you are
>>>>> describing below, the typical matching scheme is called
>>>>> "filtering" and the value supplied as the "range" (that is, in
>>>>> the triple) matches tags that are equal-to-or-longer-than the
>>>>> supplied value. That is, "en-GB" (en-UK is invalid) does not
>>>>> match "en" and neither does "en-US".
>>>>>
>>>>> Section 5.6.5 in the SKOS last call document is not wrong; it
>>>>> just doesn't recognize one of the language tag matching schemes
>>>>> as described in BCP 47. Each different language tag is taken to
>>>>> be a different token. The problem that this might entail is that
>>>>> language tags are not always predictable. There exists a range of
>>>>> variation in a user's choice of subtags that one might wish to
>>>>> match without having prior knowledge of the full range of
>>>>> variation in the tags present in a document.
>>>>>
>>>>> My suggestion would be to reference filtering in RFC 4647 as at
>>>>> least a permitted implementation choice. A triple like this:
>>>>>
>>>>> ex:color skos:prefLabel "colour"@en ;
>>>>>    skos:prefLabel "color"@en-US.
>>>>>
>>>>> ... would make all English tagged prefLabels spelled as "colour"
>>>>> save for US English tagged ones. Falling back from en-?? to en
>>>>> strikes me as a bad idea, by contrast, unless done explicitly by
>>>>> the user. Consider a more complex tag that conveys a lot of
>>>>> information: "zh-cmn-Hant-TW" (Chinese, Mandarin, traditional
>>>>> script, Taiwan). You don't really want it to match just any
>>>>> Chinese tag (or why use the big complicated one).
>>>>>
>>>>> Regards,
>>>>>
>>>>> Addison Phillips
>>>>> Globalization Architect -- Lab126
>>>>> Editor -- IETF BCP 47
>>>>>
>>>>> Internationalization is not a feature.
>>>>> It is an architecture.
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: public-i18n-core-request@w3.org
>>>>>> [mailto:public-i18n-core-request@w3.org] On Behalf Of Ralph R. Swick
>>>>>> Sent: Tuesday, March 03, 2009 6:29 AM
>>>>>> To: Antoine Isaac
>>>>>> Cc: Alistair Miles; Richard Ishida; public-swd-wg@w3.org;
>>>>>> public-i18n-core@w3.org; 'Felix Sasaki'
>>>>>> Subject: Re: Request for feedback on SKOS Last Call Working Draft
>>>>>>
>>>>>> At 02:22 PM 2/26/2009 +0100, Antoine Isaac wrote:
>>>>>>> if an application does matching of en-UK and en-GB to en, then
>>>>>>> the following RDF triples:
>>>>>>> ex:color skos:prefLabel "color"@en-US ;
>>>>>>>   skos:prefLabel "colour"@en-GB.
>>>>>>>
>>>>>>> entail:
>>>>>>>
>>>>>>> ex:color skos:prefLabel "color"@en ;
>>>>>>>   skos:prefLabel "colour"@en.
>>>>>> I believe you're making an application-specific choice here.
>>>>>> Where in the SKOS data model (spec) is this entailment
>>>>>> endorsed?  I could imagine an application that may find it
>>>>>> convenient to implement language searching by acting as
>>>>>> if your example were endorsed but it doesn't feel appropriate
>>>>>> to me in general to state such an entailment.
>>>>>>
>>>>>>> This is incompatible with the SKOS specifications for
>>>>>>> prefLabel [2].
>>>>>>
>>>>>> Which is one of the reasons it's an inappropriate entailment :)
>>>>>>
>>>>>>> [2] http://www.w3.org/2006/07/SWD/SKOS/reference/20081001/#L1567
>>>> --
>>>> Alistair Miles
>>>> Senior Computing Officer
>>>> Image Bioinformatics Research Group
>>>> Department of Zoology
>>>> The Tinbergen Building
>>>> University of Oxford
>>>> South Parks Road
>>>> Oxford
>>>> OX1 3PS
>>>> United Kingdom
>>>> Web: http://purl.org/net/aliman
>>>> Email: alistair.miles@zoo.ox.ac.uk
>>>> Tel: +44 (0)1865 281993
> 
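
The thread above contrasts two RFC 4647 matching schemes, "filtering" and "lookup". As a rough illustration only, the Python sketch below applies both to the sidewalk/pavement labels from Addison's example. The function names `basic_filter` and `lookup` and the `labels` dictionary are hypothetical, chosen for this sketch; it assumes labels are held as a simple tag-to-text mapping and it leaves out parts of the RFC such as wildcard handling in lookup.

```python
def basic_filter(language_range, tags):
    """Basic filtering (RFC 4647, section 3.3.1), simplified.

    A tag matches when it equals the range, or when it starts with the
    range followed by "-". Comparison is case-insensitive.
    """
    rng = language_range.lower()
    if rng == "*":
        return list(tags)
    return [t for t in tags
            if t.lower() == rng or t.lower().startswith(rng + "-")]


def lookup(language_range, labels, default=None):
    """Lookup (RFC 4647, section 3.4), simplified.

    The requested range is truncated from the right, one subtag at a
    time, until it equals the tag of an available label.
    """
    by_tag = {tag.lower(): text for tag, text in labels.items()}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_tag:
            return by_tag[candidate]
        subtags.pop()
        # A single-character subtag (such as the "x" that introduces a
        # private-use sequence) is dropped along with what followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


# Hypothetical label store mirroring the ex:walkingPath example.
labels = {"en-US": "sidewalk", "en": "pavement"}

us_label = lookup("en-US", labels)   # exact match on "en-US"
gb_label = lookup("en-GB", labels)   # falls back to the "en" label
fr_label = lookup("fr-FR", labels)   # nothing matches, default returned

all_english = basic_filter("en", labels.keys())   # "en" covers both tags
only_gb = basic_filter("en-GB", labels.keys())    # neither tag matches
```

Under these assumptions, a request for "en-GB" finds the "en" label through lookup, while the filtering range "en-GB" selects neither label, which is the difference in behaviour the thread is discussing.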
Received on Friday, 6 March 2009 12:56:32 GMT
