Re: XML Schema Datatypes comments

Paul:
Sorry to take so long to respond to your comments on the datatypes spec.
As you will see, we have adopted most of your suggestions.
Please see my comments prefixed by AM>> in your note below.

All the best, Ashok


Paul Cotton@IBMCA
03/09/2000 07:54 PM

To:   www-xml-schema-comments@w3.org
cc:    (bcc: Ashok Malhotra/Watson/IBM)
From: Paul Cotton/Toronto/IBM@IBMCA
Subject:  XML Schema Datatypes comments


I have reviewed the XML Schema Datatypes specification and discussed my
comments with Paul Biron (at XTech 2000).  I have provided Paul with my
editorial nits by providing him with a hard copy of my marked up document.

The following are my non-editorial comments:

1. Section 2.4.1.2 Order
This section states "In such cases each datatype will define a different
order relation on the value space".  I do not understand why this must be
done.  Certainly at worst it should say "may define".  Better even would be
to delete the sentence entirely.
AM>>  Sentence deleted.

2. Section 2.4.2.5 enumeration
This section states "No order or any other relationship is implied ...".
This seems to imply that enumerations are not ordered.  I think this
sentence needs to be reworded to imply that "No further ordering is
implied" since certainly the ordering of the underlying data type must be
inherited.  If not then XML Query will have no means of ordering
enumerations.
AM>> Section has been reworded to reflect the intent above.

3. Section 3.2.1 string
This section states "The ordered property of string is the Unicode
character number sequence." The string data type is the only primitive
datatype that makes an explicit statement about how the ordering relation
(not property) is defined.  I expect the ordering information is missing
from other primitive datatype sections.
AM>> We've added order relations for the other primitive datatypes.

4. Section 3.2.1 string
This section states "The ordered property of string is the Unicode
character number sequence."  I wonder why the definition of the string
datatype does not permit a user to define the "collation" to be used?
"Unicode character number sequence" is only one "collation" and is not very
useful.  In addition the specification does not explain why this
"collation" is needed.

XML Query will need to support different collations for the string data
type.  It would be preferable if the collation was defined as part of the
<data type> not as part of the query <predicate>s.  I would recommend you
consider a solution such as one adopted by SQL to permit the type definer
to simply name the collation to be used.  No exact definition of the actual
collation needs to be provided since there are several other sources for
this information.

AM>> Collation is needed to enable max/min on strings.  The WG discussed
AM>> user-defined collations but decided not to do anything about this in
AM>> V1.  My personal viewpoint is that, except for the min/max case,
AM>> schema never concerns itself with the relation between two strings,
AM>> so this is not a schema problem.  Others disagree with this position
AM>> but, regardless, we will not add anything in V1.
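
To make the collation point concrete, here is a minimal Python sketch (not
taken from the spec) showing that min/max on strings depends on the ordering
chosen.  The word list and the case-insensitive key are assumptions picked
for illustration; Python's default string comparison happens to follow
Unicode code point order.

```python
# Illustrative only: the word list and the case-insensitive key are
# assumptions, not anything defined by the Datatypes spec.
words = ["apple", "Zebra", "banana", "Apple"]

# Ordering by Unicode character number (code point): Python's default.
by_code_point = sorted(words)

# One alternative "collation": compare case-insensitively, breaking ties
# by code point so the result is deterministic.
by_caseless = sorted(words, key=lambda s: (s.casefold(), s))

print(by_code_point)  # ['Apple', 'Zebra', 'apple', 'banana']
print(by_caseless)    # ['Apple', 'apple', 'banana', 'Zebra']

# The maximum differs between the two orderings.
max_code_point = max(words)
max_caseless = max(words, key=str.casefold)
print(max_code_point, max_caseless)  # banana Zebra
```

The same value space yields a different maximum under each ordering, which
is why a named collation would matter to a query language.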


5. Section 3.2.5 decimal
The Note in this section asks "Our design discussions did not reveal
convincing evidence of undue burden because of arbitrary precision decimal
numbers in this design, but we welcome further input from implementors".

I believe that you may want to consider the impact on implementors of a
query language based on this data type that must implement <predicate>s and
arithmetic operators for an "arbitrary precision decimal number".  I believe
we will find this to be too expensive and that implementations will in fact
constrain the precision of this data type.  If the XML Schema specification
does not do this then interoperability will be heavily constrained.

I do not accept the argument that XML Schema needs an arbitrarily precise
decimal datatype just to be able to model the length of names in XML which
are in turn unconstrained in length.

I suggest that the document be modified to state that the maximum precision
for decimal numbers should be an "implementation-defined number not less
than X" where X can be agreed upon by implementors as a practical lower
limit for this amount.   "Implementation-defined" means that a conforming
implementation must state in its conformance statement what the value is.
AM>> There is now a note in 3.2.5 asking for feedback on this issue.
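
The interoperability concern can be sketched in Python with the standard
decimal module.  The limit of 18 significant digits below is purely an
assumption standing in for the unspecified "X"; the operands are made up.
The sketch shows the same addition giving different answers under a generous
and a constrained precision.

```python
from decimal import Context, Decimal

a = Decimal("12345678901234567890.123456789")
b = Decimal("0.000000001")

# Generous precision: enough digits to hold the exact sum.
exact = Context(prec=50).add(a, b)

# Hypothetical implementation limit of 18 significant digits (the "X" in
# the proposal is not specified; 18 is an assumption for illustration).
limited = Context(prec=18).add(a, b)

print(exact)             # 12345678901234567890.123456790
print(limited)           # 1.23456789012345679E+19
print(exact == limited)  # False
```

Two processors with different limits would disagree on such a result, which
is the case for having each implementation state its limit.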

6. Section 3.3.9 integer
The definition of the lexical representation of the integer datatype does
not correctly reflect that non-significant leading and trailing zeroes
should not be used.  Non-significant zeroes are leading zeroes to the left
of the decimal point or trailing zeroes to the right of the decimal point.
I suggest using this concept in the descriptive material.
AM>> This is a good addition to integer.  For decimal, trailing zeroes
AM>> sometimes provide information.
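
A minimal Python sketch of the idea, assuming an integer lexical form of an
optional sign followed by ASCII digits (the pattern and the function name
canonical_integer are hypothetical, not the spec's definition).  It strips
non-significant leading zeroes, and the last lines show why trailing zeroes
on a decimal are a different matter.

```python
import re
from decimal import Decimal

# Assumed lexical pattern for illustration: optional sign, then digits.
_INTEGER = re.compile(r"[+-]?[0-9]+")


def canonical_integer(lexical: str) -> str:
    """Return the lexical form without non-significant leading zeroes."""
    if not _INTEGER.fullmatch(lexical):
        raise ValueError(f"not an integer literal: {lexical!r}")
    # int() discards leading zeroes and a redundant '+'; str() writes the
    # value back in its shortest form.
    return str(int(lexical))


print(canonical_integer("+007"))  # 7
print(canonical_integer("-0042"))  # -42
print(canonical_integer("000"))   # 0

# For decimal, "1.50" and "1.5" denote the same value, but the trailing
# zero survives in the representation and may carry information.
print(Decimal("1.50") == Decimal("1.5"))  # True
print(str(Decimal("1.50")))               # 1.50
```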

7. Section 3.3.22 date
There is no specific definition in this specification of the value ranges
of the CC, YY, MM, and DD parts of a date.  Although this is probably
defined in ISO 8601 it would be preferable if this information was included
directly in this specification.  This comment also applies to the time
datatype.
AM>> Appendix D now contains this information.
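
As a rough illustration of what such range definitions imply, the Python
sketch below checks the parts of a date.  It assumes the extended form
CCYY-MM-DD, Gregorian month lengths, and ranges of 00-99 for CC and YY,
01-12 for MM, and 01 up to the month length for DD.  These are assumptions
for the example, not text copied from Appendix D or ISO 8601, and parse_date
is a hypothetical name.

```python
import re

# Assumed lexical form: CCYY-MM-DD with ASCII digits only.
_DATE = re.compile(r"([0-9]{2})([0-9]{2})-([0-9]{2})-([0-9]{2})")


def parse_date(lexical: str) -> tuple[int, int, int, int]:
    """Return (CC, YY, MM, DD) or raise ValueError if a part is out of range."""
    match = _DATE.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not in CCYY-MM-DD form: {lexical!r}")
    cc, yy, mm, dd = (int(part) for part in match.groups())
    year = cc * 100 + yy
    if not 1 <= mm <= 12:
        raise ValueError(f"month out of range: {mm}")
    # Gregorian leap-year rule decides the length of February.
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    month_lengths = [31, 29 if leap else 28, 31, 30, 31, 30,
                     31, 31, 30, 31, 30, 31]
    if not 1 <= dd <= month_lengths[mm - 1]:
        raise ValueError(f"day out of range for month {mm}: {dd}")
    return cc, yy, mm, dd


print(parse_date("2000-02-29"))  # (20, 0, 2, 29)
```

Under these assumptions "1900-02-29" and "2000-13-01" are both rejected with
a ValueError.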

/paulc

Paul Cotton, DB2 Language Architecture & Standards
IBM Canada Ltd, 17 Eleanor Drive, Nepean, Ontario K2E 6A3
Phone: (613) 225-5445   Fax:  (613) 226-6913
email: cotton@ca.ibm.com

Received on Friday, 14 April 2000 14:26:25 UTC