- From: Andy Seaborne <andy@apache.org>
- Date: Fri, 28 Jul 2023 16:22:16 +0100
- To: public-rdf-star-wg@w3.org
On 28/07/2023 15:06, Pierre-Antoine Champin wrote:
>
> On 27/07/2023 13:15, Andy Seaborne wrote:
>> On 27/07/2023 10:37, Pierre-Antoine Champin wrote:
>>>
>>> On 21/07/2023 21:59, Peter F. Patel-Schneider wrote:
>>>> As far as I can tell,
>>>>
>>>> :a :h "x"@EN {| :accordingTo :e |} .
>>>>
>>>> does not entail
>>>>
>>>> :a :h "x"@en {| :accordingTo :e |} .
>>>>
>>>> in the community group semantics, even if the underlying semantics
>>>> is the RDFS semantics.
>>
>> This is covered in D-entailment.
>>
>> https://www.w3.org/TR/rdf12-semantics/#D_interpretations
>>
>> so it is similar to the case of "5"^^xsd:integer and "05"^^xsd:integer.
>
> Since literals in quoted triples are opaque in the CG report,
> D-entailment does not "fix" it, as illustrated in Example 38:
>
> https://www.w3.org/2021/12/rdf-star.html#ref-opacity-annotation
Right - it is the same situation as the numeric one previously mentioned
in the CG. I'm not claiming it fixes it (it doesn't!).
>> The difficulty I have is why deal with language tags one way and XSD
>> numbers another way. RDF Concepts, which mentions "Core types"
>> xsd:decimal and xsd:integer.
...
> Note that this does not prevent implementations from preserving the
> original case (e.g. "en-US") to respect users' preferences.
When retrieving from storage you don't know whether the data will appear
in the results, be tested, or both. Examples below.
>>> Gregg's PR #48 on rdf12-concepts fixes this [1] by making the
>>> conversion to lower case part of the comparison for term equality.
>>
>> A change here is one that will affect existing stored data. But if we
>> could solve this once and for all, that would be good.
>
> Will it, though? I ran the following SPARQL queries on a number of
> implementations :
>
> SELECT (sameTerm("a"@en, "a"@EN) as ?test) {}
>
> and all of them (Jena, RDFlib (python), Ruby RDF, GraphDB, Oxigraph,
> Comunica) returned true, except one (Virtuoso).
There's a SPARQL test for that: strlang03
Jena isn't consistent in all possible cases.
That query doesn't touch storage which may make a difference.
Other tests (also not storage) are:
SELECT (count(DISTINCT *) AS ?C) { VALUES ?x {"a"@EN "a"@en } }
SELECT ?lang_x ?lang_y {
VALUES (?x ?y) {("a"@EN "a"@en)}
BIND(LANG(?x) AS ?lang_x)
BIND(LANG(?y) AS ?lang_y)
## maybe tests on ?lang_x and ?lang_y
}
And loading data:
<x:s1> <x:p1> "abc"@en .
<x:s2> <x:p1> "abc"@EN .
SELECT ?s { ?s ?p ?z . FILTER(sameTerm(?z, "abc"@en)) }
# Passed into the FILTER:
SELECT ?s {
?s ?p ?z .
VALUES ?C {"abc"@en}
FILTER(sameTerm(?z, ?C))
}
SELECT ?s { ?s ?p ?z . FILTER( ?z = "abc"@en ) }
Jena returns 1 row for each of the first two queries because it is
working on the exact presentation in storage (it is a bug in the
optimizer), and 2 rows for the last query (because "=" includes "sameTerm").
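The difference between the two outcomes can be illustrated with a toy model. This is only a sketch and not Jena's code: literals are assumed to be plain (lexical form, language tag) pairs, and the names exact_match and term_equal are hypothetical. It compares the two loaded literals against "abc"@en once as stored and once with the tags folded to lower case.

```python
# Toy model only: a literal is a (lexical form, language tag) pair.
# The subjects mirror the two triples loaded above.
stored = [
    ("x:s1", ("abc", "en")),
    ("x:s2", ("abc", "EN")),
]


def exact_match(a, b):
    """Compare literals exactly as written, language tag case included."""
    return a == b


def term_equal(a, b):
    """Compare literals with the language tags folded to lower case."""
    return a[0] == b[0] and a[1].lower() == b[1].lower()


probe = ("abc", "en")

# Matching on the stored presentation finds one subject.
rows_exact = [s for s, z in stored if exact_match(z, probe)]

# Matching with case-folded tags finds both subjects.
rows_folded = [s for s, z in stored if term_equal(z, probe)]

print(rows_exact)   # ['x:s1']
print(rows_folded)  # ['x:s1', 'x:s2']
```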
I would like this all to go away :-) but there are lots of deployments
nowadays and so any change can have an impact.
The charter has something to say here (as did the RDF 1.1 and SPARQL 1.1
charters) [*]
>> We could produce a "best practice" note/document/...
>> We could conduct a community survey.
> +1
For both XSD and language tag (XSD cases being more common), I've been
recommending canonicalizing on input to get consistent and explainable
behaviour.
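A minimal sketch of what canonicalizing on input might look like, assuming language tags are folded to lower case (as in the PR discussed above) and xsd:integer lexical forms are reduced to their canonical form. The function names are hypothetical and only xsd:integer is handled; a real loader would cover more datatypes.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def canonical_lang(tag):
    """Fold a language tag to lower case, e.g. "en-US" -> "en-us"."""
    return tag.lower()


def canonical_integer(lexical):
    """Return the canonical xsd:integer form: no "+", no leading zeros."""
    m = re.fullmatch(r"([+-]?)0*([0-9]+)", lexical)
    if m is None:
        # Not a valid xsd:integer lexical form: leave it untouched.
        return lexical
    sign, digits = m.groups()
    if sign == "+" or digits == "0":
        sign = ""
    return sign + digits


def canonicalize(lexical, lang=None, datatype=None):
    """Canonicalize one literal before it is written to storage."""
    if lang is not None:
        return (lexical, canonical_lang(lang), datatype)
    if datatype == XSD_INTEGER:
        return (canonical_integer(lexical), lang, datatype)
    return (lexical, lang, datatype)


print(canonicalize("abc", lang="EN"))
print(canonicalize("05", datatype=XSD_INTEGER))
```

With this applied at load time, "abc"@EN and "abc"@en are stored as the same term, as are "5" and "05" typed as xsd:integer, so later lookups do not depend on how the data was originally written.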
Andy
[*]
"Compatibility means deliberately repeating other people's mistakes."
https://quotepark.com/quotes/2112031-david-wheeler-computer-scientist-compatibility-means-deliberately-repeating-other-p/
Received on Friday, 28 July 2023 15:22:24 UTC