Re: [OK?] Re: [SPARQL] i18n comment: Modification in description of langMatches operator

(Forwarded to the lists)

Addison Phillips wrote:
> Eric Prud'hommeaux wrote:
>
>>>> 1. Language matching in RFC 4647 is defined in terms of "language 
>>>> priority lists" made up of "language ranges". It may be useful to 
>>>> incorporate this concept into SPARQL query. If necessary, you may 
>>>> limit the list to a single range.
>>>>     
>>
>> That is the intention. Multiple ones may be expressed as multiple
>> langMatches tests:
>>
>>  FILTER (langMatches(lang(?x), "en") || langMatches(lang(?x), "es"))
>>
>>  
>>
>
> The problem I see with this is that implementations of matching may 
> already be in terms of language priority lists. Also, note that the 
> range can be an expression---taking its value, for example, from HTTP 
> Accept-Language. Ideally I'd like to see a language priority list here.
>
>>>> 2. The special range "*" usually matches all language tags, 
>>>> including the empty tag. If it didn't, you would have the problem 
>>>> of not being able to select contents with no tag except explicitly. 
>>>> That is, to select everything, you'd need two queries: one for "*" 
>>>> and one for the empty tag. (Obviously, omitting the langMatches 
>>>> statement has the same effect, so your current text may be by 
>>>> design??)
>>>>     
>>
>> Yes, lang("abc") returns an empty string as giving type errors would
>> make the language more cumbersome. The use case for looking for
>> anything with a language tag drove langMatches("", "*") => false.
>>  
>>
>
> Okay, that makes sense. But it should be documented clearly, since it 
> isn't quite RFC 4647. This suggests, please note, something that I 
> should take back to the LTRU WG at IETF (where 4647 is maintained).
>
>>  
>>
>>>> 3. You don't have a way of specifying the empty tag, or at least 
>>>> you don't enumerate it. The empty tag only matches itself. That is:
>>>>
>>>> FILTER langMatches( lang(?title), "")
>>>>
>>>> only matches items with an xml:lang=""
>>>>
>>>> You should call this fact out.
>>>>     
>>
>> RDF literals with empty language tags are treated as literals
>> with no language tag.
>>  http://www.w3.org/TR/rdf-syntax-grammar/#section-literal-node
>> so <rdf:Description><some:predicate xml:lang="">abc</...></...>
>> exactly equals   <rdf:Description><some:predicate            >abc</...></...>
>>
>>  
>>
> Yes, but you have no way to select *only* the items with no language 
> tag? ("*" is available to find any non-empty value).
>
> I know that your examples are equal: I want to select those distinct 
> from:
>
>  <rdf:Description><some:predicate xml:lang="de">foo</...></...>
>
>
>> [[
>> Returns true if language-range (second argument) matches language-tag
>> (first argument) according to the Basic Filter matching scheme in
>> Matching of Language Tags [RFC 4647] Section 3.3.1. language-range is
>> a basic language range per RFC 4647 Section 2.1. The special range "*"
>> matches any non-empty language-tag string.
>> ]]
>>
>> I am content with either of these configurations, though slightly
>> prefer the one just uttered. If you are content with this wording,
>> please respond with a Subject: prefixed by "[CLOSED]". If not, let's
>> negotiate some more.
>>  
>>
> The wording is not the big issue to me. It's fine as long as 
> technically correct: it's editorial and I'm not concerned about how 
> you phrase it so much. I would reverse the range and tag in the first 
> sentence (as a nit). Maybe the following (text in {{{}}} is optional 
> per above):
>
> ---
>  Returns true if the language-tag (first argument) matches the 
> language-range {{{s in the language priority list}}} (second 
> argument). The matching scheme is based on Basic Filtering from 
> Matching of Language Tags [RFC 4647, Section 3.3.1], with some minor 
> modifications. The special range "*" matches any non-empty 
> language-tag string. Unlike in RFC 4647, it does not match the empty 
> string. The empty range matches only items with an empty language-tag 
> or lacking the language attribute altogether.
> ---
>
>
>
> ~Addison
>
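
For archive readers, the matching rules in the proposed wording above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the SPARQL specification text or any implementation's code: the function names lang_matches and matches_any are hypothetical, the empty-range behaviour follows Addison's suggested sentence, and the list helper simply mirrors the "||" composition from Eric's FILTER example rather than a true RFC 4647 language priority list.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Sketch of langMatches(tag, range) as worded in this thread.

    Hypothetical helper, assuming Basic Filtering (RFC 4647,
    Section 3.3.1) plus the two deviations discussed above.
    """
    # Language tags and ranges compare case-insensitively.
    tag = tag.lower()
    lang_range = lang_range.lower()

    if lang_range == "*":
        # Deviation from RFC 4647: "*" matches any non-empty tag only.
        return tag != ""

    if tag == lang_range:
        # Exact match; this also covers the empty range against the
        # empty tag (assumption taken from the proposed wording).
        return True

    # Basic Filtering: the range must be a prefix of the tag that ends
    # on a subtag boundary, so "en" matches "en-US" but not "english".
    return lang_range != "" and tag.startswith(lang_range + "-")


def matches_any(tag: str, ranges: list[str]) -> bool:
    """Hypothetical helper: true if any range in the list matches,
    like chaining langMatches tests with "||"."""
    return any(lang_matches(tag, r) for r in ranges)
```

Under these assumptions, lang_matches("", "*") is false while lang_matches("", "") is true, which is the distinction the thread asks to have documented.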

Received on Wednesday, 25 April 2007 13:48:05 UTC