- From: Birte Glimm <birte.glimm@uni-ulm.de>
- Date: Fri, 27 Mar 2015 15:02:53 +0100
- To: Markus Kroetzsch <markus.kroetzsch@tu-dresden.de>
- Cc: Andy Seaborne <andy@apache.org>, SPARQL <public-sparql-dev@w3.org>
Nice summary to which I can agree.

Cheers,
Birte

On 27 March 2015 at 14:53, Markus Kroetzsch <markus.kroetzsch@tu-dresden.de> wrote:
> Hi Andy, hi Birte,
>
> Thanks for the swift replies. I will carefully try to consolidate your
> answers to a consistent view (which I hope you will agree with). You said
> two important things:
>
>> The D-Entailment Regime explicitly refers to XSD 1.1.
>
> That's true, and this is a normative reference in the normative part of the
> specification [1]. Therefore, it seems clear that SPARQL 1.1 implementations
> that support datatype semantics should accept year "0000" and understand it
> as 1 BC.
>
>> SPARQL refers to XSD Schema 1.0
>
> This is also true, and again the reference is normative [2]. It seems from
> the sentence where this reference is used that the IRI xsd:dateTime refers
> to the datatype of XSD 1.0, but Andy offered an alternative reading:
> "because XSD 1.1 does not change the URI for datatypes or functions, it's
> sort of an "upgrade in place"." In any case, this section only refers to the
> meaning of literals when used as operands in FILTER functions/operators.
>
>
> Some things are immediately clear from these observations. In particular, no
> SPARQL 1.1 processor should ever reject the year "0000" in the input data or
> BGP. Either the processor uses simple entailment (then "0000" is just a
> string) or the processor supports D-entailment (then "0000" must be
> interpreted as per XSD 1.1). This is reassuring.
>
> In the case of FILTERs, there seems to be some leeway for interpretations.
> In either case, there is no contradiction with D-entailment, since
> D-entailment is only about BGPs while the XSD 1.0 reference is only used in
> a section about FILTERs.
>
> I would be in favour of adopting the view that Andy has proposed, namely
> that the meaning of IRIs has been "upgraded in place" when XSD 1.1 became a
> standard.
> If this interpretation is not used, one would get a very weird
> conforming behaviour, where the following two queries would have the same
> answers:
>
> SELECT * WHERE {
>   ?S ?P "0000-01-01T00:00:00Z"^^xsd:dateTime .
> }
>
> SELECT * WHERE {
>   ?S ?P ?X FILTER( ?X = xsd:dateTime("-0001-01-01T00:00:00Z") )
> }
>
> Note that this would be true even for processors that do not support
> D-entailment, as long as they support the xsd:dateTime FILTER at all.
> Clearly, this is not something we would want, so that the only sane
> interpretation of xsd:dateTime in SPARQL 1.1 would be to use XSD 1.1.
> Fortunately, this is also the interpretation that current RDF and OWL
> standards require, so that the meaning of year "0000" in the input file is
> the same as in the query. I hope others agree.
>
> Note there is a non-technical dimension to all of this: if a SPARQL endpoint
> or LOD service returns data on the web, consuming applications must know
> what it means (not to calculate durations or to check validity -- but
> already to correctly display the data to their users in a non-technical
> syntax). The point of view that XSD literals are "just strings" may work for
> a DBMS implementer, but as a user of the technology you have to decide how
> to encode and query your content, i.e., you must know how "1 BCE" is
> represented. People who are building applications based on RDF and SPARQL
> therefore must make this decision, and I can only see them going with the
> most recent XSD, RDF, and OWL standards -- it's great to know that SPARQL
> 1.1 agrees with those, even if one has to do some interpretation to
> recognize this ;-)
>
> Cheers,
>
> Markus
>
>
> [1] http://www.w3.org/TR/sparql11-entailment/#DEntRegime
> [2] http://www.w3.org/TR/sparql11-query/#operandDataTypes
>
>
>
> On 27.03.2015 12:54, Andy Seaborne wrote:
>>
>> The root change is in: ISO 8601:2000 Second Edition
>> where year "0000" went from illegal to 1 BCE.
>>
>> Yes - I can see that's a genuine problem for wikidata.
>>
>> Two answers: spec effect and implementation reality.
>>
>> 1/ Spec answer.
>>
>> For just plain retrieval of data, SPARQL returns RDF terms, not related
>> to their legality or value, so it's the form that is returned,
>> "-0001-02-03T12:11:10+00:00"^^xsd:dateTime.
>>
>> Or even
>> "0000-02-03T12:11:10+00:00"^^xsd:dateTime
>>
>> If used in a FILTER, the value then does matter.
>>
>> SPARQL only formally requires xsd:dateTime, not xsd:date, and even then
>> a limited subset of operations; comparison but not subtraction. Many
>> implementations include xsd:date as well.
>>
>> I can see two ways of observing the change:
>>
>> A/ If there are illegal lexical forms, the year "0000" was illegal and
>> became legal, and a FILTER may go from being an error to returning true
>> or false. This happens if the data has year "0000" or the FILTER
>> mentions it explicitly.
>>
>> A FILTER expression that evaluates to an error is effectively false
>> overall anyway.
>>
>> # Different days, year 0000
>> FILTER (
>>     "0000-02-03T12:11:10+00:00"^^xsd:dateTime >=
>>     "0000-02-02T12:11:10+00:00"^^xsd:dateTime )
>>
>> changes from filter error, do not return the row, to true.
>>
>> but comparison around the boundary is not changed. It is the mentioning
>> of 0000, explicitly or in the data, that is the problem.
>>
>> B/ As an extension, xsd:duration may be supported.
>>
>> # Across BCE/CE boundary:
>> BIND("-0001-02-03T12:11:10+00:00"^^xsd:dateTime AS ?d1)
>> BIND("0001-02-03T12:11:10+00:00"^^xsd:dateTime AS ?d2)
>> BIND(?d2 - ?d1 AS ?duration)
>>
>> SPARQL refers to XSD Schema 1.0 but the effect of extensions is
>> implementation-dependent. Functions are named by URI and because XSD 1.1
>> does not change the URI for datatypes or functions, it's sort of an
>> "upgrade in place".
>>
>> So specification-wise there is an impact; it's confused by the
>> change-in-place of XSD URIs.
>>
>> > What does "-0001-02-03"^^xsd:date mean?
>>
>> When that is the RDF term returned, it's up to the application.
>> When it's used in a FILTER, it's exposed to the change.
>> Extensions to the core spec are impacted.
>>
>> 2/ Implementation answer:
>>
>> Implementations may rely on a 3rd party library to do the parsing and
>> calculation and it will do whatever that library does.
>>
>> For example, Jena uses Apache Xerces for parsing and the Java runtime,
>> which provides XMLGregorianCalendar which is W3C XML Schema 1.0 (Java8
>> and Java9), for calculation of durations.
>>
>> Andy
>>
>> On 27/03/15 09:56, Markus Kroetzsch wrote:
>>>
>>> Dear all, especially former members of the SPARQL WG,
>>>
>>> As you might know, the Wikimedia Foundation is currently working on
>>> setting up an official public SPARQL service for Wikidata. This was done
>>> not to integrate with RDF or to add to the semantic web, but simply
>>> because it seems to be the best technology for the query problem at
>>> hand. I think this should be considered a success :-) You are also
>>> welcome to play around with the preliminary test SPARQL endpoint of
>>> Wikidata, see [0], and of course to comment on the wikidata-l list
>>> regarding nice SPARQL queries or other ideas.
>>>
>>> However, on the way to making this a reality as a fully integrated
>>> feature of Wikidata/Wikipedia, there are many issues to be solved. One
>>> that came up recently is about xsd:date(Time) in SPARQL 1.1. As you will
>>> know, XML Schema has changed the semantics of its date types in
>>> incompatible ways between XSD 1.0 and XSD 1.1:
>>>
>>> * XSD 1.1: "-0001-02-03"^^xsd:date means "3rd Feb 2 BCE" [1]
>>> * XSD 1.0: "-0001-02-03"^^xsd:date means "3rd Feb 1 BCE" [2]
>>>
>>> Needless to say that this is a big deal in applications like Wikidata,
>>> where you have a lot of historical dates. The obvious question now is:
>>> What does "-0001-02-03"^^xsd:date mean when used in SPARQL? RDF? OWL?
>>> Here is what I have found so far:
>>>
>>> * RDF 1.0: year 1 BCE
>>> * OWL 1: year 1 BCE
>>> * SPARQL 1.0: year 1 BCE
>>> (all as expected)
>>>
>>> * RDF 1.1: year 2 BCE [3]
>>> * OWL 2: year 2 BCE [4]
>>> * SPARQL 1.1: ???
>>>
>>> It is interesting to note that the semantic changes in XSD, RDF and OWL
>>> each are breaking changes, which change the meaning of existing
>>> documents (where the document itself may not contain any hint as to
>>> whether it was created before or after the change).
>>>
>>> I am not sure what is the case for SPARQL 1.1. It seems very much
>>> preferable if SPARQL would follow the other W3C standards in this
>>> matter, but I did not find out yet what was the intention of the SPARQL
>>> WG. All comments are welcome, but in the end we are looking for a
>>> normative answer here.
>>>
>>> Best regards,
>>>
>>> Markus
>>>
>>>
>>> [0]
>>> https://www.mail-archive.com/wikidata-l@lists.wikimedia.org/msg05601.html
>>> (gives you the Wikidata endpoint URL, but more importantly also example
>>> queries for our current RDF translation, which we are currently revising
>>> in several places)
>>> [1] http://www.w3.org/TR/xmlschema11-2/#dateTime
>>> [2] http://www.w3.org/TR/xmlschema-2/#dateTime
>>> [3] http://www.w3.org/TR/rdf11-concepts/#section-Datatypes
>>> [4] http://www.w3.org/TR/owl2-syntax/#Datatype_Maps
>>>
>>
>>
>>
>
> --
> Markus Kroetzsch
> Faculty of Computer Science
> Technische Universität Dresden
> +49 351 463 38486
> http://korrekt.org/
>

--
Jun. Prof. Dr. Birte Glimm            Tel.:  +49 731 50 24125
Inst. of Artificial Intelligence      Secr:  +49 731 50 24258
University of Ulm                     Fax:   +49 731 50 24188
D-89069 Ulm                           birte.glimm@uni-ulm.de
Germany
Received on Friday, 27 March 2015 14:03:25 UTC
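
The year mapping discussed in this thread can be written out as a small Python sketch. It is only an illustration of the rule as stated in the messages above (XSD 1.0: year "0000" is illegal and "-0001" is 1 BCE; XSD 1.1: "0000" is 1 BCE and "-0001" is 2 BCE). The function name describe_year and its version argument are hypothetical, chosen for this example, and are not part of any SPARQL, RDF, or XSD library.

```python
def describe_year(lexical_year: str, xsd_version: str) -> str:
    """Return an era label for the year field of an xsd:date/xsd:dateTime literal.

    Hypothetical helper for illustration only; it looks at the year field
    alone and ignores month, day, time, and timezone.
    """
    year = int(lexical_year)  # int() accepts forms such as "-0001" and "0000"

    if xsd_version == "1.0":
        # XSD 1.0: there is no year zero, so "0000" is not a legal lexical form
        if year == 0:
            raise ValueError('year "0000" is not legal in XSD 1.0')
        return f"{-year} BCE" if year < 0 else f"{year} CE"

    if xsd_version == "1.1":
        # XSD 1.1: "0000" denotes 1 BCE, "-0001" denotes 2 BCE, and so on
        return f"{1 - year} BCE" if year <= 0 else f"{year} CE"

    raise ValueError(f"unknown XSD version: {xsd_version!r}")


# The literal from the original question, read under both versions
print(describe_year("-0001", "1.0"))
print(describe_year("-0001", "1.1"))
```

Under this sketch the same lexical form "-0001" yields different years depending on the version, which is the breaking change the thread is concerned with.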