Lexical and canonical representations of dateTime, et al.

Hello,

     I raised the following questions in a note on xmlschema-dev in 
March.[1]  Questions 2) and 4) in that note were addressed in the "XML 
Schema: Datatypes" Recommendation, but I don't believe that 1) and 3) were 
addressed, so I've copied them here so they won't be forgotten.

     Comments from Ashok Malhotra are prefixed by "AM>>" and responses 
from me are prefixed by "HZ>>".



     Sections 3.2.7.1 and 3.2.7.2 of the Datatypes Recommendation define 
the lexical and canonical representations of the dateTime datatype, 
respectively.  Section 3.2.7.1 states, in part, that:

Additional digits can be used to increase the precision of fractional 
seconds if desired i.e the format ss.ss... with any number of digits after 
the decimal point is supported. To accommodate year values greater than 
9999 additional digits can be added to the left of this representation. 


1) Unlike the definition of decimal (3.2.3), this definition doesn't
specify the minimum number of additional year digits nor the minimum
number of additional digits in the fractional portion of the seconds
that needs to be supported by a processor.  Does a processor really
need to be prepared to handle an arbitrary number of digits?
Obviously this can have a significant effect on an implementation.
AM>> There have been a lot of different requirements for this.
AM>> Scientists want very accurate fractional second values.
AM>> Use a decimal number to represent the seconds part.

HZ>> I don't object to supporting very accurate fractional numbers of
HZ>> seconds; my only question is whether a processor needs to be
HZ>> prepared to support an *arbitrary* number of digits.  The
HZ>> definition of decimal (3.2.3) permits a minimally-conforming processor
HZ>> to support as few as 18 digits, but there is no similar "out" for
HZ>> a processor with respect to the number of digits in the
HZ>> fractional portion of the seconds, nor in the number of digits in
HZ>> the year.
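
To make the concern in 1) concrete, here is a minimal Python sketch of
what a processor would face if it capped the number of digits it stores.
The regular expression, the function names and the 18-digit cap are all
assumptions made for illustration: the cap is borrowed from the decimal
definition, and the Recommendation states no such limit for dateTime.
Time zone offsets other than "Z" and range checks are left out.

```python
import re
from decimal import Decimal

# Hypothetical, simplified pattern: optional '-', a year of four or more
# digits, and an optional fraction of any length after the seconds.
DATETIME_RE = re.compile(
    r"(-?\d{4,})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?Z?"
)

def digit_counts(lexical):
    """Return (year digits, fractional-second digits) of a lexical value."""
    m = DATETIME_RE.fullmatch(lexical)
    if m is None:
        raise ValueError("not a recognised dateTime: %r" % lexical)
    year, fraction = m.group(1), m.group(7) or ""
    return len(year.lstrip("-")), len(fraction)

def seconds_value(lexical):
    """Return the seconds field as an exact Decimal, keeping every digit."""
    m = DATETIME_RE.fullmatch(lexical)
    if m is None:
        raise ValueError("not a recognised dateTime: %r" % lexical)
    whole, fraction = m.group(6), m.group(7)
    return Decimal(whole if fraction is None else whole + "." + fraction)

def within_limit(lexical, max_digits=18):
    """True if both digit counts fit an assumed max_digits cap."""
    year_digits, fraction_digits = digit_counts(lexical)
    return year_digits <= max_digits and fraction_digits <= max_digits
```

A value with 21 fractional digits, such as
2001-11-02T11:56:18.123456789012345678901Z, matches the lexical form
quoted above but fails within_limit() under the assumed cap, which is
exactly the case the Recommendation leaves open.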

3) ISO 8601 specifies that 24:00:00 of one day is the same as 00:00:00
of the following day.  Which is the permitted form in the canonical
representations of the various types?
AM>> Both are acceptable.

HZ>> The definition of canonical lexical representation requires there
HZ>> to be a one-to-one mapping between the canonical lexical space
HZ>> and the value space.  Because 2001-03-21T24:00:00Z maps to the
HZ>> same value as 2001-03-22T00:00:00Z, I don't believe they can both
HZ>> be permitted to be canonical lexical values.
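
The one-to-one argument in 3) can be sketched in a few lines of Python.
The function names are hypothetical, only UTC values of the form
YYYY-MM-DDThh:mm:ssZ are handled, and picking the 00:00:00 form as the
canonical one is my assumption for the sake of the example, not
something the Recommendation states.

```python
from datetime import datetime, timedelta, timezone

def to_value(lexical):
    """Map a UTC 'YYYY-MM-DDThh:mm:ssZ' string to a point in time,
    treating 24:00:00 as 00:00:00 of the following day (per ISO 8601)."""
    date_part, time_part = lexical.rstrip("Z").split("T")
    year, month, day = (int(x) for x in date_part.split("-"))
    hour, minute, second = (int(x) for x in time_part.split(":"))
    rollover = (hour, minute, second) == (24, 0, 0)
    if rollover:
        hour = 0  # datetime() rejects hour 24, so add the day afterwards
    value = datetime(year, month, day, hour, minute, second,
                     tzinfo=timezone.utc)
    return value + timedelta(days=1) if rollover else value

def to_canonical(value):
    """Serialise a value using the (assumed) 00:00:00 canonical form."""
    return value.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Both 2001-03-21T24:00:00Z and 2001-03-22T00:00:00Z yield the same value
from to_value(), so a serialiser such as to_canonical() can emit only
one of them; the other form cannot also be canonical.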

Thanks,

Henry
[1] http://lists.w3.org/Archives/Public/xmlschema-dev/2001Mar/0111.html
------------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 969-6044;  Phone (905) 413-6044
mailto:zongaro@ca.ibm.com

Received on Friday, 2 November 2001 11:56:18 UTC