Re: Review of XSD Datatypes 1.1 Changes

Alex,

thank you! This is great. 

I guess it is worth making the changes in the concepts and the semantics documents asap, so that we do not forget these. 

Ivan

----
Ivan Herman
Tel:+31 641044153
http://www.ivan-herman.net



On 2 Feb 2012, at 23:08, Alex Hall <alexhall@revelytix.com> wrote:

> Per ACTION-136 - Review changes in W3C XML Schema Definition Language (XSD) -- http://www.w3.org/TR/2012/PR-xmlschema11-2-20120119/#changes
> 
> I've completed my review of the changes in XSD Datatypes 1.1. Rather than go through the exhaustive list of changes, I'll summarize the areas that I think are relevant to RDF:
> 
> 1. Datatype definitions, including definitions of lexical spaces, value spaces, L2V mappings, and canonical mappings, underwent a thorough revision. This is a good thing, because the new definitions are much more precisely stated and leave less room for confusion. In general, RDF defers to XSD for datatype definitions so I don't think any action on our part is required here in terms of the RDF specs. However, implementors of XSD datatype processing in RDF will want to review these changes so we might want to call their attention to them. I did verify that the short-form literal definitions in Turtle for boolean, double, decimal, and integer are still valid subsets of the respective lexical spaces in XSD 1.1.
> 
> 2. XSD 1.1 distinguishes between the identity of values and the (numeric) equality of values. As far as I can tell, RDF Semantics is defined strictly in terms of identities (I would appreciate confirmation of this from one of the editors). To avoid confusion, it might be worth noting this distinction in the section on datatype entailment and explicitly stating that datatype entailment deals with identity and not equality, if that is indeed our position. [For SPARQL, pattern matching deals with identity and the '=' operator deals with equality.]
> 
> 3. The float and double datatypes introduce positive and negative zero to the value space; these values are distinct but equal. Conversely, NaN is identical to but not equal to itself. This does have implications for RDF (and SPARQL). For instance, take the statements:
> 
> <s> <p> "+0"^^xsd:double .
> <s> <p> "-0"^^xsd:double .
> 
> These two statements are equivalent under XSD entailment using the definition of double from XSD 1.0 (because "+0" and "-0" both mapped to the value zero), but are distinct under XSD entailment using the definition from XSD 1.1.
> 
> But, given a graph with these statements, the SPARQL query: SELECT * { <s> <p> ?o FILTER ( ?o = "0"^^xsd:double ) } should return two rows.
> 
> Meanwhile, given the graph:
> 
> <s> <p> "NaN"^^xsd:double .
> 
> SELECT * { ?s <p> "NaN"^^xsd:double } should return one row.
> SELECT * { <s> <p> ?o FILTER ( ?o = "NaN"^^xsd:double ) } should return zero rows.
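
The identity/equality distinction in item 3 can be illustrated with a minimal Python sketch. It is an illustration only, not taken from the XSD or SPARQL specs. It assumes Python floats (IEEE 754 doubles) stand in for xsd:double values, and the helper names xsd_identical and xsd_equal are hypothetical.

```python
import math


def xsd_equal(x: float, y: float) -> bool:
    """Numeric equality, as used by the SPARQL '=' operator (assumed model)."""
    # IEEE 754 comparison: +0 == -0 is true, NaN == NaN is false.
    return x == y


def xsd_identical(x: float, y: float) -> bool:
    """Identity of values in the XSD 1.1 sense (assumed model)."""
    # XSD 1.1 has a single NaN value, which is identical to itself.
    if math.isnan(x) and math.isnan(y):
        return True
    # Otherwise the values must compare equal and carry the same sign,
    # which is what separates positive zero from negative zero.
    return x == y and math.copysign(1.0, x) == math.copysign(1.0, y)


pos_zero = float("+0")
neg_zero = float("-0")
nan = float("NaN")

print(xsd_equal(pos_zero, neg_zero), xsd_identical(pos_zero, neg_zero))
print(xsd_equal(nan, nan), xsd_identical(nan, nan))
```

Under this model, graph pattern matching corresponds to xsd_identical and the FILTER '=' test corresponds to xsd_equal, which is why the zero query returns two rows and the NaN FILTER query returns none.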
> 
> 4. The value spaces of the primitive datatypes are disjoint. This is not actually a change in XSD 1.1, but it is given more prominence (moved from Section 4, where it was buried in the definition of the equality facet, to Section 2, in the definition of the datatype system). So, strictly speaking, the graph { <s> <p> "1.0"^^xsd:decimal } does not XSD-entail the graph { <s> <p> "1.0"^^xsd:double } because decimal and double are different primitive types. This came as a surprise to me, even though I've spent some time poking around in the XSD specs, so I thought I'd call attention to it here. I had just presumed that the value denoted by both literals was simply the number 1.
> 
> 5. The definition of the xsd:duration datatype has been significantly revised. We should revisit the statement that "xsd:duration does not have a well-defined value space" and therefore should not be used in RDF. To begin with, I don't know what "well-defined" means in the context of this sentence. I do know that the confusion surrounding xsd:duration has to do with the fact that different months have different numbers of days, and the difficulty that arises when trying to compare a duration with a month component to one with (day/hour/minutes/seconds) components that total 28 days or more.
> 
> The duration definition in XSD 1.1 does have a clearly defined:
>    - lexical space, which is the same as that in 1.0
>    - value space, which is modeled as a [ months as xsd:integer, seconds as xsd:decimal ] tuple.
>    - identity condition: two durations are identical if and only if their months and seconds components are both identical.
>    - equality relation, which is the same as its identity relation.
>    - partial ordering.
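
The value space and partial ordering listed above can be sketched in Python. This is a rough sketch under stated assumptions, not a conforming implementation: a duration is a (months, seconds) tuple, seconds are plain numbers rather than xsd:decimal, and the ordering is decided by adding each duration to the four reference dateTimes listed in the XSD definition of the order relation on duration. The names add_duration and compare_durations are hypothetical.

```python
from datetime import datetime, timedelta

# Reference dateTimes used by XSD to order durations (all at 00:00:00Z).
REFERENCE_DATETIMES = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]


def add_duration(start, months, seconds):
    """Add a (months, seconds) duration to a dateTime whose day is 1."""
    total_months = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total_months, 12)
    shifted = start.replace(year=year, month=month_index + 1)
    return shifted + timedelta(seconds=seconds)


def compare_durations(a, b):
    """Return -1, 0 or 1, or None when the two durations are incomparable."""
    if a == b:
        # Identity: months and seconds components are both identical.
        return 0
    outcomes = set()
    for ref in REFERENCE_DATETIMES:
        x = add_duration(ref, *a)
        y = add_duration(ref, *b)
        outcomes.add((x > y) - (x < y))
    if outcomes == {-1}:
        return -1
    if outcomes == {1}:
        return 1
    # The reference points disagree, so the order is undefined.
    return None


DAY = 86400
one_month = (1, 0)
print(compare_durations(one_month, (0, 27 * DAY)))
print(compare_durations(one_month, (0, 30 * DAY)))
print(compare_durations(one_month, (0, 32 * DAY)))
```

One month compares as greater than 27 days and less than 32 days, but is incomparable with 30 days, which is the month-length difficulty described in item 5.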
> 
> Given these revisions, we should consider including xsd:duration in the list of RDF-compatible XSD types.
> 
> 6. We should add the following types, new in XSD 1.1, to the list of RDF-compatible XSD types:
>    - xsd:dateTimeStamp, derived from xsd:dateTime by requiring a timezone offset.
>    - xsd:dayTimeDuration, derived from xsd:duration by restricting the months component in the value space to be zero.
>    - xsd:yearMonthDuration, derived from xsd:duration by restricting the seconds component in the value space to be zero.
> 
> Regardless of what is decided for xsd:duration, we should include dayTimeDuration and yearMonthDuration since both of these types are totally ordered.
> 
> Regards,
> Alex
> 

Received on Friday, 3 February 2012 05:54:58 UTC