Lexical and canonical representations of dateTime, et al.

Hello,

     I have some questions and comments about the lexical representation
of dateTime in the latest Schema Datatypes PR.

     Section 3.2.7.1 of the PR (http://www.w3.org/TR/xmlschema-2/#dateTime)
states that

   This lexical representation is the [ISO 8601] extended format
   CCYY-MM-DDThh:mm:ss where "CC" represents the century, "YY" the year,
   "MM" the month and "DD" the day, preceded by an optional leading "-"
   sign to indicate a negative number. If the sign is omitted, "+" is
   assumed. The letter "T" is the date/time separator and "hh", "mm",
   "ss" represent hour, minute and second respectively. Additional
   digits can be used to increase the precision of fractional seconds
   if desired i.e the format ss.ss... with any number of digits after
   the decimal point is supported. To accommodate year values greater
   than 9999 additional digits can be added to the left of this
   representation. The year 0000 is prohibited.

   This representation may be immediately followed by a "Z" to indicate
   Coordinated Universal Time (UTC) or, to indicate the time zone,
   i.e. the difference between the local time and Coordinated Universal
   Time, immediately followed by a sign, + or -, followed by the
   difference from UTC represented as hh:mm.

1) Unlike the definition of number (3.2.3), this definition doesn't specify
the minimum number of additional year digits, or the minimum number of
additional digits in the fractional portion of the seconds, that a
processor must support.  Does a processor really need to be prepared to
handle an arbitrary number of digits?  That requirement would have a
significant effect on an implementation.
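
To illustrate the implementation cost, here is a minimal Python sketch.
The function names are hypothetical and nothing here comes from the PR; it
only shows that a fixed-width representation of the seconds field silently
drops digits that an arbitrary-precision one keeps.

```python
from decimal import Decimal

def seconds_exact(ss):
    """Keep every fractional digit of an 'ss.sss...' field (arbitrary precision)."""
    return Decimal(ss)

def seconds_as_float(ss):
    """Fixed-width storage: digits beyond double precision are lost."""
    return float(ss)

field = "00.1234567890123456789012345"
exact = seconds_exact(field)
approx = seconds_as_float(field)
# Round-tripping the float through its shortest repr does not recover the input.
lossy = Decimal(repr(approx)) != exact
```

A processor that must accept any number of digits is pushed toward the
first form (and toward unbounded integers for the year), which is why a
stated minimum would help.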

2) Is the ":mm" portion of the timezone required in the lexical
representation?  For example, is 2001-03-19T10:20:00-05 a permissible
lexical representation?  The second paragraph quoted above seems to imply
that it is required, but some of the examples show only the hours portion
of the difference from UTC when ":mm" is ":00".  If the ":mm" can be
omitted, is it required in the canonical representation, or must it be
omitted from the canonical representation when ":mm" is ":00"?
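
To make the two readings concrete, here is a rough Python sketch.  Both
patterns are my own assumptions, built only from the two paragraphs quoted
above; they are not a normative grammar, and they do not check field
ranges or the prohibition on year 0000.

```python
import re

# Shared prefix: optional "-", four or more year digits, then -MM-DDThh:mm:ss
# with optional fractional seconds.
_BASE = r"-?[0-9]{4,}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?"

# Strict reading: a time zone, if present, is "Z" or a sign followed by hh:mm.
STRICT = re.compile(_BASE + r"(Z|[+-][0-9]{2}:[0-9]{2})?")

# Lenient reading: the ":mm" part of the time zone may be omitted.
LENIENT = re.compile(_BASE + r"(Z|[+-][0-9]{2}(:[0-9]{2})?)?")

def matches(pattern, text):
    """Return True when the whole string fits the given pattern."""
    return pattern.fullmatch(text) is not None
```

Under these assumptions, 2001-03-19T10:20:00-05 is rejected by STRICT and
accepted by LENIENT, while 2001-03-19T10:20:00-05:00 is accepted by both.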

3) ISO 8601 specifies that 24:00:00 of one day is the same as 00:00:00 of
the following day.  Which is the permitted form in the canonical
representations of the various types?
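
As a sketch of one possible answer, the following Python helper
(hypothetical name, my own assumption) treats 00:00:00 of the following
day as the canonical form.  It handles only four-digit positive years,
because it relies on Python's date type, and it fails for 9999-12-31.

```python
from datetime import date, timedelta

def normalize_midnight(day, time):
    """Map (CCYY-MM-DD, 24:00:00) to 00:00:00 on the following day.

    Any other time value is returned unchanged.
    """
    if time != "24:00:00":
        return day, time
    next_day = date.fromisoformat(day) + timedelta(days=1)
    return next_day.isoformat(), "00:00:00"
```

Whether the canonical representation should pick this form, or 24:00:00,
is exactly what I am asking.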

4) Are leading zero digits in a year permitted in the lexical
representation beyond the four required digits?  For example,
0012001-03-19T10:20:00.  I didn't notice any restriction against that.  If
that would be permitted, should it be restricted in the canonical
representation?  Sorry if this is described in the revision of ISO 8601; I
don't have a copy.
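
If the canonical representation were to forbid the extra zeros, the rule
might look like this Python sketch.  The function name and the rule itself
are assumptions on my part, not something stated in the PR, and the
prohibited year 0000 is not rejected here.

```python
def canonical_year(year):
    """Drop leading zeros beyond the four required year digits.

    The optional leading "-" sign is preserved.
    """
    sign = "-" if year.startswith("-") else ""
    digits = year[1:] if sign else year
    # Remove all leading zeros, then pad back to the four-digit minimum.
    return sign + digits.lstrip("0").rjust(4, "0")
```

With this rule, the year 0012001 from the example above becomes 12001,
and 0002001 becomes 2001.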

Thanks,

Henry
------------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 778-6044;  Phone (416) 448-6044
mailto:zongaro@ca.ibm.com

Received on Tuesday, 20 March 2001 11:58:23 UTC