- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Mon, 23 Oct 2006 14:58:14 +0200
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Sat, Oct 21, 2006 at 05:53:29PM +0100, Seaborne, Andy wrote:
>
>
>
> Eric Prud'hommeaux wrote:
> >On Thu, Aug 24, 2006 at 09:45:33PM +0100, Seaborne, Andy wrote:
> >>"""
> >>ACTION AndyS:
> >>Write some tests for value testing (unknown types and extensibility) to
> >>add to
> >>2006/JulSep0086
> >>"""
> >>
> >>http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0086
> >>http://lists.w3.org/Archives/Public/public-rdf-dawg/2006AprJun/0104
> >>
> . . .
> >
> >>Tests open-eq-07 to open-eq-10 work by taking a list of all possible term
> >>forms, forming the cross product and seeing which are value-equal and
> >>value-not-equal. This is done for data which contains the same compared
> >>values and different but comparable values. These tests are exhaustive and
> >>include literals with lang tags - because lang tags are not case
> >>sensitive (nor is there a canonical form according to RFC3066) it seemed
> >>reasonable to be able to equate "xyz"@EN with "xyz"@en. In effect, each lang
> >>tag defines a separate value space - can't compare or test for equality
> >>across them, but you can with the same language.
> >>
> >>"abc"@en = "abc"@EN
> >>"xyz"@en > "abc"@en
> >>"xyz"@en > "abc"@EN

This creates the interesting conundrum that something is simultaneously
equivalent and greaterThan:
  "abc"@en = "abc"@EN ⇒ TRUE
  "abc"@en > "abc"@EN ⇒ TRUE
(and "abc"@EN < "abc"@en ⇒ TRUE)
I would favor < over =, but I guess that depends on your use cases.

> >There is no current language for case-insensitive language tags in
> >SPARQL presently. My implementation failed these both because of
> >case-sensitive language matching, and because they employed extra
> >operators not currently in SPARQL.
>
> Is it just a matter of expanding the table to include RDF plain literals
> with language tags? ORDER BY defers to "<" if it can.

I think "abc"@en > "abc"@EN is fully expressible with our current
functions:
  (STR(?a) != STR(?b) && STR(?a) > STR(?b)) ||
  (STR(?a) = STR(?b) && LANG(?a) > LANG(?b))  # isn't "a" > "A" weird?

If the above analysis is correct, one could add a shortcut syntax for
it in the operator mapping table (note: simple literal > simple literal
is currently in the table):
[[
  ┃A > B│simple literal│simple literal│op:numeric-equal(fn:compare(A, B), 1)│xsd:boolean┃
+ ┃A > B│plain literal │plain literal │logical-or(
    logical-and(fn:not(op:numeric-equal(fn:compare(str(A), str(B)), 0)),
                op:numeric-equal(fn:compare(str(A), str(B)), 1)),
    logical-and(op:numeric-equal(fn:compare(str(A), str(B)), 0),
                op:numeric-equal(fn:compare(lang(A), lang(B)), 1)))│xsd:boolean┃
]]

or one could add functions for each of < > <= >= a la:
[[
+ ┃A > B│plain literal │plain literal │RDFplainLiteral-greaterThan(A, B)│xsd:boolean┃

RDFplainLiteral-greaterThan

  xsd:boolean RDFplainLiteral-greaterThan (plain literal lit1, plain literal lit2)

  If the lexical values of lit1 and lit2 are identical,
  RDFplainLiteral-greaterThan returns TRUE or FALSE depending on whether
  LANG(lit1) > LANG(lit2). If the lexical values are not identical,
  RDFplainLiteral-greaterThan returns TRUE or FALSE depending on whether
  STR(lit1) > STR(lit2).
]]

These specifications assume that you want this sort order:
  "abb"
  "abc"
  "abc"@EN
  "abc"@En
  "abc"@eN
  "abc"@en
  "abc"@en-fr # zis iss how we speak here
  "abd"

> I tried writing things out from the current operations alone:
>
> Some things can be written:
>   ( lang(?x) = lang(?y) ) && str(?x) > str(?y)
> but that only works cleanly for the same language tag - different would
> cause false, not error which seems more natural and it's verbose.
>
> langMatches isn't symmetric but I think:
>
>   langMatches(lang(?x),lang(?y)) &&
>   langMatches(lang(?y),lang(?x)) &&
>   str(?x) > str(?y)
>
> attempts to handle the case-sensitivity issue because a language tag is a
> special case of a language range.
> It becomes more verbose though - ugh.
> Or a regex.

REGEX(LANG(?x), LANG(?y), 'i')

> "11.3.1 Operator Extensibility" could explicitly cover this - I can accept
> that language tag handling is an extension if there is text that states
> that. So far we have really been thinking of extension by datatypes.

[[
Extended SPARQL implementations may support additional associations
between operators and operator functions; this amounts to adding rows
to the table above. No additional operator support may yield a result
that replaces any result other than a type error in an unextended
implementation.
]]

I think I've convinced myself that it's extendable this way. You are
adding rows that replace the type errors you would get in an
unextended implementation. These rules just make sure that you don't
lose dawg:monotonicity over DAWG-specified parts of the language.
Ideally, people won't step on each other's truth values too much, but
I don't think we can say much about that.
-- 
-eric

home-office: +1.617.395.1213 (usually 900-2300 CET)
             +33.1.45.35.62.14
cell:        +33.6.73.84.87.26

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
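
For illustration only, here is a minimal Python sketch of the plain-literal
ordering discussed in this message: compare the lexical forms first, and fall
back to the language tags only when the lexical forms are identical. The names
PlainLiteral, plain_literal_greater_than and lang_insensitive_equal are
hypothetical and appear in no SPARQL draft. The sketch assumes code point
string comparison, which is how Python compares str values, and it models a
simple literal as a plain literal with an empty language tag.

```python
from typing import NamedTuple


class PlainLiteral(NamedTuple):
    """Hypothetical stand-in for an RDF plain literal."""
    lexical: str     # what STR() would return
    lang: str = ""   # what LANG() would return; "" for a simple literal


def plain_literal_greater_than(a: PlainLiteral, b: PlainLiteral) -> bool:
    """Sketch of the proposed RDFplainLiteral-greaterThan (assumed semantics)."""
    if a.lexical != b.lexical:
        # Lexical forms differ: they alone decide the result.
        return a.lexical > b.lexical
    # Lexical forms are identical: the language tags break the tie.
    return a.lang > b.lang


def lang_insensitive_equal(a: PlainLiteral, b: PlainLiteral) -> bool:
    """Equality that ignores language tag case, as the open-eq tests assume."""
    return a.lexical == b.lexical and a.lang.lower() == b.lang.lower()


literals = [
    PlainLiteral("abd"),
    PlainLiteral("abc", "en-fr"),
    PlainLiteral("abc", "en"),
    PlainLiteral("abc", "EN"),
    PlainLiteral("abc"),
    PlainLiteral("abb"),
]

# Sorting on (lexical, lang) is consistent with plain_literal_greater_than.
ordered = sorted(literals, key=lambda lit: (lit.lexical, lit.lang))
```

With these two functions, "abc"@en and "abc"@EN compare as equal under
lang_insensitive_equal and as greater-than under plain_literal_greater_than,
which is the conundrum noted at the top of this message.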
Received on Monday, 23 October 2006 12:57:21 UTC