ACTION-135 - Review changes in XSD

per ACTION-135 - Review changes in W3C XML Schema Definition Language (XSD) -- http://www.w3.org/TR/2012/PR-xmlschema11-2-20120119/#changes

XML Schema 1.1 part 2 Appendix I act 3 psalm 2 has a list of high-level changes since the 1.0 Recommendation. I read them, summarized them below ("..." means it's a repeat from earlier in the list), and pulled out those which appear relevant to RDF.


The floats and doubles +0.0 and -0.0 are distinct (but equal for purposes of bounds checking).
Not an issue -- { <s> <p> "+0.0"^^xsd:float , "-0.0"^^xsd:float } is already 2 triples, and we don't invoke XML Schema bounds checking.
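
A quick sketch of the "distinct but equal" point, using Python floats (IEEE 754 doubles) as a stand-in. This is my own illustration, not an XSD implementation, and the variable names are made up:

```python
import math

pos_zero = float("+0.0")
neg_zero = float("-0.0")

# Numerically equal, which is all that bounds checking needs.
equal_for_bounds = (pos_zero == neg_zero)

# The sign still differs, so the two are distinguishable values.
sign_of_pos = math.copysign(1.0, pos_zero)
sign_of_neg = math.copysign(1.0, neg_zero)
distinct = (sign_of_pos != sign_of_neg)
```

Either way the two literals differ lexically, so the triple count above is unaffected.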


Primitive datatypes and facets are now extensible.
I don't think this is an issue as we don't reference the totality of datatypes with respect to conformance. However, I'm not certain that there's no impact on the semantics doc.

A leading sign, e.g. "+5", is now allowed on unsigned{Long,Int,Short,Byte}s.

dateTimes now allow time zone offsets.

1BCE is represented as the year 0, 2BCE as -1, etc.
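
The year numbering above amounts to a one-line mapping. A minimal sketch, assuming BCE years are counted from 1 (the function name is hypothetical, not from the spec):

```python
def bce_to_xsd11_year(bce_year: int) -> int:
    """Map a BCE year (1 means 1BCE) to the year number used by XSD 1.1."""
    if bce_year < 1:
        raise ValueError("BCE years start at 1")
    # 1BCE -> 0, 2BCE -> -1, 3BCE -> -2, ...
    return 1 - bce_year
```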

examples of new valid literals:
  "+5"^^xsd:unsignedInt
  "NaN"^^xsd:float
  "+INF"^^xsd:float
  "+INF"^^xsd:double
  "2012-01-01T00:00:00-05:00"^^xsd:dateTime
  "0000"^^xsd:gYear
So everyone who has these once-malformed literals kicking around can now joyously share them with the world.


Below is my accounting of each entry in the listed changes:

== I.1 Datatypes and Facets ==

    new datatype named anyAtomicType serves as the base type definition for all primitive atomic datatypes.

    The treatment of datatypes in Datatype System (§2) made more precise and explicit.

+0.0 and -0.0 distinct (but equal for purposes of bounds checking)
ericP: { <s> <p> "+0.0"^^xsd:float , "-0.0"^^xsd:float } is already 2 triples

    bounded is now always false for list datatypes

    + Units on length facet

<http://www.w3.org/2001/XMLSchema-datatypes> deprecated
ericP: we use <http://www.w3.org/2001/XMLSchema#>

    + assertions facet, e.g. <assertion test='$value ne 0'/>

    ~ clarification of when and how to collapse whitespace

primitive datatypes and facets are now extensible
ericP: we don't reference the totality of datatypes with respect to conformance.
ericP: impact on semantics doc?

== I.2 Numerical Datatypes ==

...+0.0 and -0.0 distinct

lexical spaces of unsignedLong, unsignedInt, unsignedShort, and unsignedByte allow leading '+'
ericP: { <s> <p> "+5"^^xsd:unsignedInt } now has a value.
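
A rough sketch of what changes for such a literal. The two patterns are my simplified stand-ins for the 1.0 and 1.1 lexical spaces (they only model the optional leading '+'), and the names are hypothetical; the spec's own grammar is authoritative:

```python
import re

# Hypothetical, simplified patterns; not the spec's regexps.
UNSIGNED_1_0 = re.compile(r"[0-9]+")      # digits only
UNSIGNED_1_1 = re.compile(r"\+?[0-9]+")   # optional leading '+'
UINT_MAX = 4294967295                     # maxInclusive of xsd:unsignedInt

def unsigned_int_value(lexical, pattern):
    """Return the integer value, or None if the literal is ill-formed."""
    if pattern.fullmatch(lexical) is None:
        return None
    value = int(lexical)
    return value if value <= UINT_MAX else None
```

Under these assumptions unsigned_int_value("+5", UNSIGNED_1_0) yields None, while unsigned_int_value("+5", UNSIGNED_1_1) yields 5.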

...+0.0 and -0.0 distinct
  "NaN"^^xsd:float != "NaN"^^xsd:float
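
For comparison, IEEE 754 arithmetic behaves the same way. A small Python illustration (Python floats, not XSD values; I'm assuming XSD equality on NaN mirrors this, as the line above says):

```python
nan_a = float("NaN")
nan_b = float("NaN")

# NaN compares unequal to every value, including another NaN and itself.
nan_equal = (nan_a == nan_b)
nan_self_equal = (nan_a == nan_a)
```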

+ "+INF"^^xsd:float and "+INF"^^xsd:double

== I.3 Date/time Datatypes ==

+ "2012-01-01T00:00:00-05:00"^^xsd:dateTime ("2012-01-01T00:00:00"^^xsd:dateTime still valid)

    + explicitTimezone facet (per request from OWL), used for ^^xsd:dateTimeStamp

    order defined for repeating datatypes, e.g. time, gDay. Only in Z do days run from 00:00:00Z to 24:00:00Z.

+ "0000"^^xsd:gYear (which == 1BCE)

    + dateTime and duration algorithms.
    ~ timeOnTimeline corrected.

    - leap-seconds from value space.

    s/time zone/time zone offset/

    lexical constraint regexps corrected.

    + regexps include "24:00:00"

=== from <http://www.w3.org/TR/2012/PR-xmlschema11-2-20120119/#dateTime> ===
~ clarified leap years

== I.4 Other changes == 

    + something about datatypes may depend on XML 1.1 and XML Namespaces 1.1???.

    ~ normative refs allow for evolution of ref'd spec, e.g. migration from XML 4th edition to XML 5th edition.

    ~ unicode ref now 5.1.0

    ~ other refs updated

~ the defined value space of duration was simplified from (years, months, days, hours, minutes, seconds) to (months, seconds).
ericP: the lexical space and the semantics remain the same.
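
To make the (months, seconds) value space concrete, here is a minimal sketch that reduces a duration literal to that pair. The regexp is my simplified approximation of the PnYnMnDTnHnMnS form, not the spec's, and the function name is hypothetical:

```python
import re

# Hypothetical simplified pattern for PnYnMnDTnHnMnS; not the spec's regexp.
_DURATION = re.compile(
    r"(-)?P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?"
    r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?"
)

def duration_value(lexical):
    """Reduce a duration literal to a (months, seconds) pair."""
    m = _DURATION.fullmatch(lexical)
    # Reject non-matches, a bare "P", and a trailing "T" with no time fields.
    if m is None or lexical.lstrip("-") == "P" or lexical.endswith("T"):
        raise ValueError("not a duration literal: %r" % lexical)
    sign = -1 if m.group(1) else 1
    years, months, days, hours, minutes = (
        int(g or 0) for g in m.group(2, 3, 4, 5, 6)
    )
    secs = float(m.group(7) or 0)
    total_months = sign * (12 * years + months)
    total_seconds = sign * (86400 * days + 3600 * hours + 60 * minutes + secs)
    return total_months, total_seconds
```

With this reduction "P1Y" and "P12M" come out as the same pair, which is the sense in which the simplified value space leaves the semantics alone.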

    + two new restrictions on duration: yearMonthDuration and dayTimeDuration, alignment with XPath durations.

    ~ Illustrative XML representations isolated in their own appendix

    ~ minor corrections in response to comments

    ~ schema parts 1 and 2 better aligned.

    ...other refs updated

    ~ clarified datatype-validity on type "language".

    + some new definitions for lexical and canonical primitive datatypes.

    ~ restrict NOTATION to validate literals

    ~ regexp notation corrected.

    ~ something about combining pattern and enumeration facets.

    + warning against using the whitespace facet for tokenizing natural-language data.

    - unions are no longer forbidden to be members of other unions (affecting transitive membership)

    ~ conformance distinguishes between implementation-defined and implementation-dependent
    + composition with host languages requirements defined.

    + processors must detect and report errors in schemas and schema documents.

    ~ scope of QName namespaces clarified.

    ~ clarified which lexical mappings define functions from value to lexical space.

    + clarified nature of equality and identity of lists.

    + +0.0 and -0.0 allowed in keys, keyrefs and enumerations.

    ~ clarified which datatypes may appear as list or union members.

    + empty unions allowed.

    + simple type and union derivations acyclic.

 ...~ minor edits


-- 
-ericP

Received on Wednesday, 1 February 2012 05:19:15 UTC