Language tags and value testing (was: Open world value tests)

Eric Prud'hommeaux wrote:
> On Thu, Aug 24, 2006 at 09:45:33PM +0100, Seaborne, Andy wrote:
>> """
>> ACTION AndyS:
>> Write some tests for value testing (unknown types and extensibility) to add
>> to 2006/JulSep0086
>> """
>>
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0086
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0104
>>
. . .


>> Tests open-eq-07 to open-eq-10 work by taking a list of all possible term
>> forms, forming the cross product and seeing which are value-equal and
>> value-not-equal.  This is done for data which contains the same compared
>> values and different but comparable values.  These tests are exhaustive and
>> include literals with lang tags - because lang tags are not case sensitive 
>> (nor is there a canonical form according to RFC3066) it seemed reasonable 
>> to be able to equate "xyz"@EN with "xyz"@en. In effect, each lang tag defines 
>> a separate value space - can't compare or test for equality across them, 
>> but you can with the same language.
>>
>> "abc"@en = "abc"@EN
>> "xyz"@en > "abc"@en
>> "xyz"@en > "abc"@EN
> 
> There is currently no language for case-insensitive language tags in
> SPARQL. My implementation failed these both because of
> case-sensitive language matching, and because they employed extra
> operators not currently in SPARQL.

Is it just a matter of expanding the table to include RDF plain literals with 
language tags? ORDER BY defers to "<" if it can.

I tried writing things out from the current operations alone:

Some things can be written:
   ( lang(?x) = lang(?y) ) && str(?x) > str(?y)
but that only works cleanly when the language tags are the same - different
tags give false rather than an error (an error seems more natural), and it's
verbose.
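
To make the "false, not error" behaviour concrete, here is a rough model of
that expression in Python rather than SPARQL. LangLiteral and
same_lang_greater are names made up for this illustration; the tag comparison
is assumed to be an exact, case-sensitive string match, as with "=" on the
strings returned by lang().

```python
from typing import NamedTuple


class LangLiteral(NamedTuple):
    """Hypothetical stand-in for an RDF plain literal with a language tag."""
    lexical: str  # what str(?x) would return
    lang: str     # what lang(?x) would return, case preserved


def same_lang_greater(x: LangLiteral, y: LangLiteral) -> bool:
    # Mirrors ( lang(?x) = lang(?y) ) && str(?x) > str(?y).
    # Assumption: the tags are compared as exact, case-sensitive strings.
    return x.lang == y.lang and x.lexical > y.lexical


print(same_lang_greater(LangLiteral("xyz", "en"), LangLiteral("abc", "en")))  # True
print(same_lang_greater(LangLiteral("xyz", "en"), LangLiteral("abc", "EN")))  # False
print(same_lang_greater(LangLiteral("xyz", "en"), LangLiteral("abc", "fr")))  # False, not an error
```

The second call is the case the tests trip over: "en" and "EN" are treated as
different tags, so the comparison quietly comes out false.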

langMatches isn't symmetric but I think:

   langMatches(lang(?x),lang(?y)) &&
   langMatches(lang(?y),lang(?x)) &&
   str(?x) > str(?y)

attempts to handle the case-sensitivity issue because a language tag is a 
special case of a language range.  It becomes more verbose though - ugh.  A 
regex would be another option.
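
The two-way langMatches idea can be sketched in Python as follows. This is
only a model, not SPARQL: lang_matches approximates basic language-range
filtering (case-insensitive, where a range matches a tag that is equal to it
or extends it with "-subtag"), and same_language and lang_greater are names
invented here. Treating an empty tag as matching nothing is an assumption of
the sketch.

```python
def lang_matches(tag: str, lang_range: str) -> bool:
    """Approximate basic filtering: is `tag` within `lang_range`?"""
    if not tag:
        # Assumption: a literal without a language tag matches no range.
        return False
    if lang_range == "*":
        return True
    tag, lang_range = tag.lower(), lang_range.lower()
    # Equal ignoring case, or the tag extends the range with a subtag.
    return tag == lang_range or tag.startswith(lang_range + "-")


def same_language(a: str, b: str) -> bool:
    # Matching in both directions rules out the prefix case, leaving
    # case-insensitive equality of the two tags.
    return lang_matches(a, b) and lang_matches(b, a)


def lang_greater(x: tuple, y: tuple) -> bool:
    # x and y are (lexical form, language tag) pairs.
    return same_language(x[1], y[1]) and x[0] > y[0]


print(lang_matches("en-GB", "en"))                 # True
print(lang_matches("en", "en-GB"))                 # False: not symmetric
print(same_language("en", "EN"))                   # True
print(same_language("en-GB", "en"))                # False
print(lang_greater(("xyz", "en"), ("abc", "EN")))  # True
```

One direction alone would let "en-GB" count as "en"; requiring both
directions is what narrows it down to the same tag modulo case.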

"11.3.1 Operator Extensibility" could explicitly cover this - I can accept 
that language tag handling is an extension if there is text that states that. 
So far we have really been thinking of extension by datatypes.

 Andy

Received on Saturday, 21 October 2006 16:53:42 UTC