
Re: [OK?] Re: [SPARQL] i18n comment: Modification in description of langMatches operator

From: Addison Phillips <addison@inter-locale.com>
Date: Wed, 25 Apr 2007 14:03:44 +0100
Message-ID: <462F51B0.6000300@inter-locale.com>
To: Eric Prud'hommeaux <eric@w3.org>
CC: Felix Sasaki <fsasaki@w3.org>, jjc@hpl.hp.com, public-rdf-dawg-comments@w3.org, public-i18n-core@w3.org

Eric Prud'hommeaux wrote:

>>>1. Language matching in RFC 4647 is defined in terms of "language 
>>>priority lists" made up of "language ranges". It may be useful to 
>>>incorporate this concept into SPARQL query. If necessary, you may 
>>>limit the list to a single range.
>
>That is the intention. Multiple ones may be expressed as multiple
>langMatches tests:
>
>  FILTER (langMatches(lang(?x), "en") || langMatches(lang(?x), "es"))
>

The problem I see with this is that existing matching implementations may 
already be written in terms of language priority lists. Also, note that the 
range can be an expression, taking its value, for example, from an HTTP 
Accept-Language header. Ideally I'd like to see a language priority list here.
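
To illustrate the point, here is a minimal Python sketch of matching a tag 
against a language priority list built from an Accept-Language value. The 
function names (basic_filter, parse_accept_language, matches_priority_list) 
are hypothetical and not part of SPARQL or RFC 4647, the header is assumed to 
be well formed, and the wildcard follows plain RFC 4647 basic filtering, so 
the SPARQL-specific treatment of the empty tag is not modeled here.

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """Basic filtering in the style of RFC 4647 Section 3.3.1 (case-insensitive)."""
    language_range, tag = language_range.lower(), tag.lower()
    if language_range == "*":
        # Plain RFC 4647 wildcard; empty-tag handling is out of scope for this sketch.
        return True
    # A range matches a tag that equals it or that extends it at a "-" boundary.
    return tag == language_range or tag.startswith(language_range + "-")


def parse_accept_language(header: str) -> list[str]:
    """Hypothetical helper: ranges from an Accept-Language value, highest q first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            if param.lower().startswith("q="):
                q = float(param[2:])
        if q > 0:  # q=0 means "not acceptable", so drop the range
            weighted.append((-q, position, parts[0]))
    # Sort by descending q, keeping the original order among equal weights.
    return [language_range for _, _, language_range in sorted(weighted)]


def matches_priority_list(priority_list: list[str], tag: str) -> bool:
    """True if any range in the priority list matches the tag."""
    return any(basic_filter(r, tag) for r in priority_list)


priority_list = parse_accept_language("es;q=0.8, en-US, de;q=0.5")
print(priority_list)
print(matches_priority_list(["en", "es"], "en-GB"))
```

With a list-valued second argument, the two-test FILTER quoted above would 
collapse into a single call, and a list taken from a request header could be 
passed through without rewriting the query.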

>>>2. The special range "*" usually matches all language tags, including 
>>>the empty tag. If it didn't, you would have the problem of not being 
>>>able to select contents with no tag except explicitly. That is, to 
>>>select everything, you'd need two queries: one for "*" and one for the 
>>>empty tag. (Obviously, omitting the langmatches statement has the same 
>>>effect, so your current text may be by design??)
>
>Yes, lang("abc") returns an empty string as giving type errors would
>make the language more cumbersome. The use case for looking for
>anything with a language tag drove langMatches("", "*") => false.

Okay, that makes sense. But it should be documented clearly, since it 
isn't quite what RFC 4647 says. Note that this also suggests something I 
should take back to the LTRU WG at the IETF (where RFC 4647 is maintained).

>>>3. You don't have a way of specifying the empty tag, or at least you 
>>>don't enumerate it. The empty tag only matches itself. That is:
>>>
>>>FILTER langMatches( lang(?title), "")
>>>
>>>only matches items with an xml:lang=""
>>>
>>>You should call this fact out.
>
>RDF literals with empty language tags are treated as literals
>with no language tag.
>  http://www.w3.org/TR/rdf-syntax-grammar/#section-literal-node
>so <rdf:Description><some:predicate xml:lang="">abc</...></...>
>exactly equals 
>   <rdf:Description><some:predicate            >abc</...></...>
>
Yes, but then is there no way to select *only* the items with no language 
tag? ("*" is available to find any non-empty value.)

I know that your two examples are equal. What I want is to select those 
items while excluding ones like:

  <rdf:Description><some:predicate xml:lang="de">foo</...></...>


>[[
>Returns true if language-range (second argument) matches language-tag
>(first argument) according to the Basic Filter matching scheme in
>Matching of Language Tags [RFC 4647] Section 3.3.1. language-range is
>a basic language range per RFC 4647 Section 2.1. The special range "*"
>matches any non-empty language-tag string.
>]]
>
>I am content with either of these configurations, though slightly
>prefer the one just uttered. If you are content with this wording,
>please respond with a Subject: prefixed by "[CLOSED]". If not, let's
>negotiate some more.
The wording is not the big issue for me. It's fine as long as it is 
technically correct: the rest is editorial, and I'm not much concerned about 
how you phrase it. As a nit, I would reverse the order of range and tag in 
the first sentence. Maybe the following (text in {{{}}} is optional per above):

---
  Returns true if the language-tag (first argument) matches the 
language-range {{{s in the language priority list}}} (second argument). 
The matching scheme is based on Basic Filtering from Matching of 
Language Tags [RFC 4647, Section 3.3.1], with some minor modifications. 
The special range "*" matches any non-empty language-tag string. Unlike 
in RFC 4647, it does not match the empty string. The empty range matches 
only items with an empty language-tag or lacking the language attribute 
altogether.
---
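
The behaviour described by the suggested text can be sketched in Python as 
follows. This is only an illustration under stated assumptions: lang_matches 
is a hypothetical name, a literal with no language attribute is represented 
by the empty string (as lang() does in the discussion above), and accepting 
a list as the second argument models the optional {{{}}} wording rather than 
anything already agreed.

```python
def lang_matches(tag: str, ranges) -> bool:
    """Sketch of the proposed langMatches semantics.

    `ranges` is a single language range or a priority list of ranges.
    """
    if isinstance(ranges, str):
        ranges = [ranges]
    tag = tag.lower()
    for language_range in ranges:
        language_range = language_range.lower()
        if language_range == "":
            # The empty range matches only an empty (or absent) language tag.
            if tag == "":
                return True
        elif language_range == "*":
            # Unlike RFC 4647 as described above, "*" does not match the empty tag.
            if tag != "":
                return True
        elif tag == language_range or tag.startswith(language_range + "-"):
            # Basic filtering: exact match, or prefix match at a "-" boundary.
            return True
    return False


# (value, language tag) pairs; "" stands for "no language tag".
literals = [("abc", ""), ("foo", "de"), ("colour", "en-GB")]

untagged = [value for value, tag in literals if lang_matches(tag, "")]
tagged = [value for value, tag in literals if lang_matches(tag, "*")]
print(untagged)  # ['abc']
print(tagged)    # ['foo', 'colour']
```

Under these rules the empty range is what selects only the untagged items, 
and "*" selects everything else.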



~Addison
Received on Wednesday, 25 April 2007 13:04:19 GMT
