- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Mon, 14 Sep 2009 09:40:08 +0100
- To: Ivan Herman <ivan@w3.org>
- Cc: "Seaborne, Andy" <andy.seaborne@hp.com>, Axel Polleres <axel.polleres@deri.org>, W3C SPARQL Working Group <public-rdf-dawg@w3.org>
On 14 Sep 2009, at 05:58, Ivan Herman wrote:

> Andy,
>
> Here is a concrete example. Say our data is:
>
> <rdf:RDF xmlns:rdf="..." xmlns:ex="...">
> <rdf:Description rdf:about="">
> <ex:p rdf:parseType="Literal">
> <ex:bla1 a="something" q="and" b="something else" />
> </ex:p>
> </rdf:Description>
> </rdf:RDF>
>
> My question is: what is the result of
>
> PREFIX ex: <...>
> ASK WHERE {
> ?a ex:p
> "<ex:bla1 q="and"
> b="something else" a="something"/>^^rdf:Literal .
> }
>
> My feeling is that the answer should be 'true', regardless of the
> fact that the two literals are different in the order of the
> attributes and the usage of white spaces.

Since comparisons are normally in "term" space, i.e., lexical space, my feeling is different.

> The RDF/XML spec explicitly says that, in the case above, the XML
> part is transformed into the 'correct' lexical form when creating
> the abstract RDF triple (which is defined in the term of
> canonicalized XML).

That seems to be a bug in RDF/XML, frankly. The lexical space of XMLLiteral is *not* the canonicalized form and I don't see why the parse phase should say anything about it. (Do systems generally adhere to this part of the spec?) No other datatype, to my knowledge, *requires* canonicalization (though XML Schema 1.1 provides for a canonicalization for all of them, I believe).

http://www.w3.org/TR/2003/WD-rdf-concepts-20030123/#dfn-rdf-XMLLiteral

"""The lexical space contains all pairs ( string, lang ) where lang is any language identifier [RFC-3066] in lowercase, and string is well-balanced, self-contained XML element content [XML], for which the XML document corresponding to the pair is a well-formed XML document [XML] that also conforms to XML Namespaces [XML-NS]."""

But even if you buy that coming from RDF/XML you'll end up with canonicalized lexical forms, not every source must do that. AFAICT, SPARQL is silent on canonicalization...XMLLiteral is just another datatyped literal. So those would definitely not match.
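To make the two notions of equality concrete, here is a rough sketch in Python. It is only an illustration, not something either spec mandates. The namespace URI and both function names are made up for the example (the message above elides the real namespace as "..."), and Python's canonicalize implements C14N 2.0, which stands in here for the exclusive canonicalization that the RDF specs refer to.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical namespace URI; the original example elides it as "...".
EX = "http://example.org/ns#"

# The same element written twice: attribute order and whitespace differ.
from_data = f'<ex:bla1 xmlns:ex="{EX}" a="something" q="and" b="something else" />'
from_query = f'<ex:bla1 xmlns:ex="{EX}" q="and"\n   b="something else" a="something"/>'


def equal_in_term_space(x: str, y: str) -> bool:
    """Compare the lexical forms exactly as written (plain string equality)."""
    return x == y


def equal_under_canonicalization(x: str, y: str) -> bool:
    """Compare after canonicalizing both forms (illustrative helper name)."""
    return canonicalize(x) == canonicalize(y)


if __name__ == "__main__":
    print(equal_in_term_space(from_data, from_query))
    print(equal_under_canonicalization(from_data, from_query))
```

Under plain term comparison the two forms differ, while the canonicalizing comparison treats them as the same, which is why it matters which of the two a query engine is taken to perform.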
> Does the SPARQL spec says the same?
>
> Note that this is _not_ the case as if we replaced the two literals
> with, say, 1.0 and 1.00 declaring both to be floats. The way XML
> Literal is currently defined is such that the lexical form (not the
> value space!) is the canonical XML version.

This is false. See above. If it were true, then semantically the first graph would have a not-well-formed literal, thus, semantically, would not be an instance of rdfs:Literal.

> Ie, by referring to the fact that the comparison of literal should
> be done in the value space does not cover the XML Literal case.

? Er...you mean that the comparison should be done in the lexical space cuts no ice? But surely it does :)

How about errata on RDF/XML syntax and RDF concepts to change, in the former case, the parsing to simply check for well-formedness (with namespaces) and, in the latter, to make the value and the lexical spaces identical? We can then add functions such as "equalUnderCanonicalization" which would apply to *any* datatype.

I know your default reply ("it's impossible") but is it really? Does anyone really think these things *aren't* bugs (at the very least, the generality of the lexical form is in tension with the strictness of the parsing spec)? Plus, it *removes* code.

Cheers,
Bijan.
Received on Monday, 14 September 2009 08:40:50 UTC