- From: Andy Seaborne <andy@apache.org>
- Date: Fri, 28 Jul 2023 16:22:16 +0100
- To: public-rdf-star-wg@w3.org
On 28/07/2023 15:06, Pierre-Antoine Champin wrote:
>
> On 27/07/2023 13:15, Andy Seaborne wrote:
>> On 27/07/2023 10:37, Pierre-Antoine Champin wrote:
>>>
>>> On 21/07/2023 21:59, Peter F. Patel-Schneider wrote:
>>>> As far as I can tell,
>>>>
>>>> :a :h "x"@EN {| :accordingTo :e |} .
>>>>
>>>> does not entail
>>>>
>>>> :a :h "x"@en {| :accordingTo :e |} .
>>>>
>>>> in the community group semantics, even if the underlying semantics
>>>> is the RDFS semantics.
>>
>> This is covered in D-entailment.
>>
>> https://www.w3.org/TR/rdf12-semantics/#D_interpretations
>>
>> so it is similar to the case of "5"^^xsd:integer and "05"^^xsd:integer.
>
> Since literals in quoted triples are opaque in the CG report,
> D-entailment does not "fix", as illustrated in Example 38:
>
> https://www.w3.org/2021/12/rdf-star.html#ref-opacity-annotation

Right - it is the same situation as the numeric one previously mentioned
in the CG. I'm not claiming it fixes it (it doesn't!).

>> The difficulty I have is why deal with language tags one way and XSD
>> numbers another way. RDF Concepts, which mentions "Core types"
>> xsd:decimal and xsd:integer.

...

> Note that this does not prevent implementation to preserve the original
> case (e.g. "en-US") to respect users preferences.

When retrieving from storage you don't know if the data is in the
results, or if it's going to be tested, or both. Examples below.

>>> Gregg's PR #48 on rdf12-concepts fixes this [1] by making the
>>> conversion to lower case part of the comparison for term equality.
>>
>> A change here is one that will affect existing stored data. But if we
>> could solve this once and for all, that would be good.
>
> Will it, though? I ran the following SPARQL queries on a number of
> implementations :
>
> SELECT (sameTerm("a"@en, "a"@EN) as ?test) {}
>
> and all of them (Jena, RDFlib (python), Ruby RDF, GraphDB, Oxigraph,
> Comunica) returned true, except one (Virtuoso).

There's a SPARQL test for that: strlang03

Jena isn't consistent in all possible cases. That query doesn't touch
storage, which may make a difference.

Other tests (also not storage) are:

    SELECT (count(DISTINCT *) AS ?C) { VALUES ?x {"a"@EN "a"@en } }

    SELECT ?lang_x ?lang_y {
       VALUES (?x ?y) {("a"@EN "a"@en)}
       BIND(LANG(?x) AS ?lang_x)
       BIND(LANG(?y) AS ?lang_y)
       ## maybe tests on ?lang_x and ?lang_y
    }

And loading data:

    <x:s1> <x:p1> "abc"@en .
    <x:s2> <x:p1> "abc"@EN .

    SELECT ?s { ?s ?p ?z . FILTER(sameTerm(?z, "abc"@en)) }

    # Passed into the FILTER:
    SELECT ?s { ?s ?p ?z . VALUES ?C {"abc"@en} FILTER(sameTerm(?z, ?C)) }

    SELECT ?s { ?s ?p ?z . FILTER( ?z = "abc"@en ) }

Jena returns 1 row for the first two because it is working on exact
presentation in storage (it is a bug in the optimizer) and 2 rows for
the third (because "=" includes "sameTerm").

I would like this all to go away :-) but there are lots of deployments
nowadays and so any change can have an impact. The charter has something
to say here (as did the RDF 1.1 and SPARQL 1.1 charters) [*]

>> We could produce a "best practice" note/document/...
>> We could conduct a community survey.
> +1

For both XSD and language tag (XSD cases being more common), I've been
recommending canonicalizing on input to get consistent and explainable
behaviour.

     Andy

[*] "Compatibility means deliberately repeating other people's mistakes."
https://quotepark.com/quotes/2112031-david-wheeler-computer-scientist-compatibility-means-deliberately-repeating-other-p/
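
A minimal sketch of what "canonicalizing on input" could look like, for
illustration only. It is not taken from Jena or any other implementation:
the function names are hypothetical, literals are modelled as plain
(lexical form, language tag, datatype IRI) tuples, and only xsd:integer is
handled on the datatype side. Language tags are lower-cased and integer
lexical forms lose any "+" sign and leading zeros, so "abc"@EN and "abc"@en
(or "05" and "5") are stored as the same term.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Optional sign followed by one or more ASCII digits.
_INTEGER = re.compile(r"([+-]?)([0-9]+)")


def canonical_lang_tag(tag: str) -> str:
    """Lower-case a language tag (tags compare case-insensitively)."""
    return tag.lower()


def canonical_integer(lexical: str) -> str:
    """Return the lexical form with no '+', no leading zeros and no '-0'."""
    match = _INTEGER.fullmatch(lexical.strip())
    if match is None:
        raise ValueError(f"not an xsd:integer lexical form: {lexical!r}")
    sign, digits = match.groups()
    digits = digits.lstrip("0") or "0"
    if sign == "+" or digits == "0":
        sign = ""
    return sign + digits


def canonical_literal(lexical, lang=None, datatype=None):
    """Hypothetical input hook: canonicalize a literal before storing it."""
    if lang is not None:
        return (lexical, canonical_lang_tag(lang), None)
    if datatype == XSD_INTEGER:
        return (canonical_integer(lexical), None, datatype)
    # Other datatypes are passed through unchanged in this sketch.
    return (lexical, None, datatype)
```

With terms canonicalized at load time, an exact-match lookup in storage and
a value-based "=" test see the same term, at the cost of not preserving the
original spelling (e.g. "en-US" is stored as "en-us").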
Received on Friday, 28 July 2023 15:22:24 UTC