Re: ACTION-135 - Review changes in XSD

From: Alex Hall <alexhall@revelytix.com>
Date: Thu, 2 Feb 2012 13:43:07 -0500
Message-ID: <CAFq2bixvARZJ6kLyvF+LQZFXV7HP6QDZgwWDFMi2dZsivHCcCA@mail.gmail.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-wg@w3.org
On Wed, Feb 1, 2012 at 6:57 AM, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:

> I don't see the changes mentioning xsd:decimal.
>
> The XSD decimal canonical form says: [1]
>
> [[
> Specifically, for integers, the decimal point and fractional part are
> prohibited. For other values, the preceding optional "+" sign is
> prohibited.  The decimal point is required.  In all cases, leading and
> trailing zeroes are prohibited subject to the following:  there must be at
> least one digit to the right and to the left of the decimal point which may
> be a zero.
> ]]
>
> "For integers, the decimal point ... are prohibited."
>
> I take it that it means "integer value" here, not literal or type
> xsd:integer because it's in the decimal definition and because:
>
> [[
> The mapping from values to ·canonical representations· is given formally
> in ·decimalCanonicalMap·.
> ]]
>
> decimalCanonicalMap says:
>
> [[
>    If d is an integer, then return ·noDecimalPtCanonicalMap·(d).
>    Otherwise, return ·decimalPtCanonicalMap·(d).
> ]]
>
> so, I think, the recommendation is to have decimal-pointless
> integer-valued decimals.  This is good - same as the integer canonical form.
>
> This is a change from 1.0 where the text only has "The decimal point is
> required." [2] and integer canonical form was not decimal canonical form
> for the same value.
>
> This relates to Turtle short forms - we have already chosen to make "8." a
> decimal, not integer+DOT.
>

We have? I recall discussions on that topic but I don't remember how that
was resolved.

The grammar in the editor's draft [1] says:

[35]   <DECIMAL>   ::=   ([0-9])+ "." ([0-9])+ | "." ([0-9])+

A decimal is required to have at least one digit after the "." so "8." must
be integer+DOT, not decimal, under this grammar.
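
To make that concrete, here is a minimal sketch that transcribes rule [35] as quoted above into a Python regular expression and checks a few candidate tokens. The function name is_decimal_token is my own, and the sketch covers only this one production, not the rest of the Turtle grammar.

```python
import re

# Rule [35] as quoted from the editor's draft, transcribed into a Python regex:
#   ([0-9])+ "." ([0-9])+ | "." ([0-9])+
DECIMAL = re.compile(r"[0-9]+\.[0-9]+|\.[0-9]+")


def is_decimal_token(text: str) -> bool:
    """Return True if the whole string matches the DECIMAL production.

    Hypothetical helper for illustration; it is not part of any Turtle parser.
    """
    return DECIMAL.fullmatch(text) is not None


for candidate in ["8.0", ".5", "8.", "8"]:
    print(repr(candidate), is_decimal_token(candidate))
```

"8.0" and ".5" match, while "8." and "8" do not, so a tokenizer following this rule has to read "8." as an integer followed by a DOT.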


> We do not rely on decimal canonicalisation for Turtle but if we encourage
> value-based systems, then this might be relevant.
>

Are you suggesting we align the syntax for Turtle short forms with XSD
canonical forms?
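
For reference, the difference between the two canonical forms discussed above can be sketched roughly as follows. This is only my reading of the quoted text, not the spec's formal definition: the function names canonical_decimal_11 and canonical_decimal_10 are made up for illustration, and the input is assumed to be a finite decimal value.

```python
from decimal import Decimal


def canonical_decimal_11(d: Decimal) -> str:
    """XSD 1.1 reading: integer-valued decimals carry no decimal point."""
    if d == d.to_integral_value():
        # Integer value: no decimal point, no fractional part.
        return str(int(d))
    # Otherwise drop trailing zeroes and print in plain (non-exponent) notation.
    return format(d.normalize(), "f")


def canonical_decimal_10(d: Decimal) -> str:
    """XSD 1.0 reading: the decimal point is always required."""
    if d == d.to_integral_value():
        # Integer value: keep the point with a single zero after it.
        return str(int(d)) + ".0"
    return format(d.normalize(), "f")


for lexical in ["8.0", "8", "8.50"]:
    value = Decimal(lexical)
    print(lexical, canonical_decimal_11(value), canonical_decimal_10(value))
```

Under the 1.1 reading, "8.0"^^xsd:decimal canonicalises to "8", which written as a bare Turtle token would be read as an integer rather than a decimal. That mismatch is what the question above is getting at.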

-Alex

[1]
https://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#sec-grammar


>
>        Andy
>
> [1] 3.3.3.1
> http://www.w3.org/TR/2012/PR-xmlschema11-2-20120119/#decimal
>
> [2] 3.2.3.2
> http://www.w3.org/TR/xmlschema-2/#decimal
>
>
>
> On 01/02/12 05:18, Eric Prud'hommeaux wrote:
>
>> per ACTION-135 - Review changes in W3C XML Schema Definition Language
>> (XSD) -- http://www.w3.org/TR/2012/PR-xmlschema11-2-20120119/#changes
>>
>> XML Schema 1.1 part 2 Appendix I act 3 psalm 2 has a list of high-level
>> changes since the 1.0 Recommendation. I read them, summarized them below
>> ("..." means it's a repeat from earlier in the list), and pulled out those
>> which appear relevant to RDF.
>>
>>
>> The floats and doubles +0.0 and -0.0 are distinct (but equal for purposes
>> of bounds checking).
>> Not an issue -- {<s>  <p>  "+0.0"^^xsd:float , "-0.0"^^xsd:float } is
>> already 2 triples, and we don't invoke XML Schema bounds checking.
>>
>>
>> Primitive datatypes and facets are now extensible.
>> I don't think this is an issue as we don't reference the totality of
>> datatypes with respect to conformance. However, I'm not certain that
>> there's no impact on the semantics doc.
>>
>> A leading sign, e.g. "+5", is now allowed on unsigned{Long,Int,Short,Byte}s.
>>
>> dateTimes now allow time zone offsets.
>>
>> 1BCE is represented as the year 0, 2BCE as -1, etc.
>>
>> examples of new valid literals:
>>   "+5"^^xsd:unsignedInt
>>   "NaN"^^xsd:float
>>   "+INF"^^xsd:float
>>   "+INF"^^xsd:double
>>   "2012-01-01T00:00-05:00"^^xsd:dateTime
>>   "0"^^xsd:year
>> So everyone who has these once-malformed literals kicking around can
>> now joyously share them with the world.
>>
>>
>> Below is my accounting of each entry in the listed changes:
>>
>> == I.1 Datatypes and Facets ==
>>
>>     new datatype named anyAtomicType serves as the base type definition
>> for all primitive atomic datatypes.
>>
>>     The treatment of datatypes in Datatype System (§2) made more precise
>> and explicit.
>>
>> +0.0 and -0.0 distinct (but equal for purposes of bounds checking)
>> ericP: {<s>  <p>  "+0.0"^^xsd:float , "-0.0"^^xsd:float } is already 2
>> triples
>>
>>     bounded for list datatypes is now always false
>>
>>     + Units on length facet
>>
>> <http://www.w3.org/2001/XMLSchema-datatypes> deprecated
>> ericP: we use <http://www.w3.org/2001/XMLSchema#>
>>
>>     + assertions facet, e.g. <assertion test='$value ne 0'/>
>>
>>     ~ clarification of when and how to collapse whitespace
>>
>> primitive datatypes and facets are now extensible
>> ericP: we don't reference the totality of datatypes with respect to
>> conformance.
>> ericP: impact on semantics doc?
>>
>> == I.2 Numerical Datatypes ==
>>
>> ...+0.0 and -0.0 distinct
>>
>> lexical spaces of unsignedLong, unsignedInt, unsignedShort, and
>> unsignedByte allow leading '+'
>> ericP: {<s>  <p>  "+5"^^xsd:unsignedInt } now has a value.
>>
>> ...+0.0 and -0.0 distinct
>>   "NaN"^^xsd:float != "NaN"^^xsd:float
>>
>> + "+INF"^^xsd:float and "+INF"^^xsd:double
>>
>> == I.3 Date/time Datatypes ==
>>
>> + "2012-01-01T00:00-05:00"^^xsd:dateTime ("2012-01-01T00:00"^^xsd:dateTime
>> still valid)
>>
>>     + explicitTimezone facet (per request from OWL), used for
>> ^^xsd:dateTimeStamp
>>
>>     order defined for repeating datatypes, e.g. time, gDay. only in Z do
>> days run from 00:00:00Z to 24:00:00Z.
>>
>> + "0"^^xsd:year (which == 1BCE)
>>
>>     + dateTime and duration algorithms.
>>     ~ timeOnTimeline corrected.
>>
>>     - leap-seconds from value space.
>>
>>     s/time zone/time zone offset/
>>
>>     lexical constraint regexps corrected.
>>
>>     + regexps include "24:00:00"
>>
>> === from <http://www.w3.org/TR/2012/PR-xmlschema11-2-20120119/#dateTime> ===
>> ~ clarified leap years
>>
>> == I.4 Other changes ==
>>
>>     + something about datatypes may depend on XML 1.1 and XML Namespaces
>> 1.1???.
>>
>>     ~ normative refs allow for evolution of ref'd spec, e.g. migration
>> from XML 4th edition to XML 5th edition.
>>
>>     ~ unicode ref now 5.1.0
>>
>>     ~ other refs updated
>>
>> ~ the defined value space of duration was simplified from (years, months,
>> days, hours, minutes, seconds) to (months, seconds).
>> ericP: the lexical space and the semantics remain the same.
>>
>>     + two new restrictions on duration: yearMonthDuration and
>> dayTimeDuration, alignment with XPath durations.
>>
>>     ~ Illustrative XML representations isolated in their own appendix
>>
>>     ~ minor corrections in response to comments
>>
>>     ~ schema parts 1 and 2 better aligned.
>>
>>     ...other refs updated
>>
>>     ~ clarified datatype-validity on type "language".
>>
>>     + some new definitions for lexical and canonical primitive datatypes.
>>
>>     ~ restrict NOTATION to validate literals
>>
>>     ~ regexp notation corrected.
>>
>>     ~ something about combining pattern and enumeration facets.
>>
>>     + warning against using the whitespace facet for tokenizing
>> natural-language data.
>>
>>     - unions are no longer forbidden to be members of other unions
>> (affecting transitive membership)
>>
>>     ~ conformance distinguishes between implementation-defined and
>> implementation-dependent
>>     + composition with host languages requirements defined.
>>
>>     + processors must detect and report errors in schemas and schema
>> documents.
>>
>>     ~ scope of QName namespaces clarified.
>>
>>     ~ clarified which lexical mappings define functions from value to
>> lexical space.
>>
>>     + clarified nature of equality and identity of lists.
>>
>>     + +0.0 and -0.0 allowed in keys, keyrefs and enumerations.
>>
>>     ~ clarified which datatypes may appear as list or union members.
>>
>>     + empty unions allowed.
>>
>>     + simple type and union derivations acyclic.
>>
>>  ...~ minor edits
>>
>>
>>
>
Received on Thursday, 2 February 2012 18:44:06 UTC