25 Feb Datatype comments

Here are some comments on XML Schema datatypes based on
a quick initial read.  Many of these comments are reiterations
of comments on previous versions.

Section 2.4.2.2: minlength and maxlength

These facet names are all lowercase, whereas minExclusive
and the related facets use mixed case.

Section 2.4.2.13:

The period facet should be a timeDuration.

Section 2.5.1.2:

I think that the best answer to the example
would be a list of length 1.
The answer 3 implies breaking at line feeds before
assigning to list elements; the answer 18 implies
breaking at each space.  The answer 1 implies breaking
at the end of the production of the datatype.

Breaking at each space before checking type validity
would not allow you to support a list of quoted
strings that might contain spaces.

"String 1" "String 2"

If 18 is the answer to the example, then 5 would
be the answer to this.  Suppose the list were derived
from quotedString (something like)

<simpleType name="quotedString">
   <pattern value='".*"'/>
</simpleType>

<simpleType name="quotedStrings" base="quotedString" derivedBy="list"/>

Then we would get the desired answer only if the end of the production
was the end of the first element.
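
As a rough illustration of the difference, here is a minimal Python sketch.
It is not taken from the draft: the helper names are hypothetical, and the
stricter item pattern '"[^"]*"' is an assumption added so that one item
cannot run on into the next.

```python
import re

QUOTED_STRING = re.compile(r'".*"')     # pattern from the quotedString type above
SHORTEST_ITEM = re.compile(r'"[^"]*"')  # assumed stricter form: no quote inside an item

def split_at_spaces(value):
    """Break at every space first; validity is checked afterwards."""
    return value.split(" ")

def split_at_productions(value):
    """End each item where the shortest match of the item type ends."""
    return SHORTEST_ITEM.findall(value)

value = '"String 1" "String 2"'

# No space-delimited token is a valid quotedString on its own.
tokens_valid = [bool(QUOTED_STRING.fullmatch(t)) for t in split_at_spaces(value)]

# Read greedily, the original pattern accepts the whole value as ONE item.
whole_is_one_item = bool(QUOTED_STRING.fullmatch(value))

# Ending each item at the end of its production recovers the two strings.
items = split_at_productions(value)
print(tokens_valid, whole_is_one_item, items)
```

Breaking at spaces leaves no valid items at all, while the pattern as
written also matches the entire value as a single quotedString; only the
shortest-production reading yields the two intended elements.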

I'm not sure of the interpretation of an enumeration for a list.

Section 3.2.3:

This starts the whole numeric rant again.  The current draft (like the
17 Dec draft, but unlike the earlier ones) does not provide an acceptable
method to communicate numerical values with greater range or more precision
than IEEE double.  If you wanted to express IEEE long double values, you
would either have to use double and risk truncation, or use decimal
and possibly have to use several thousand characters and lose infinity and
NaN.
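
The trade-off can be sketched in Python, whose float is an IEEE double.
This is only an illustration under two assumptions: 1.19e4932 is used as an
approximate upper bound for an 80-bit long double, and the decimal lexical
form is taken to have no exponent.

```python
from decimal import Decimal

# Approximate largest finite 80-bit long double; the exact digits do not
# matter for the illustration.
big = "1.19e4932"

as_double = float(big)     # out of range for an IEEE double
as_decimal = Decimal(big)  # arbitrary precision keeps the value

# Without an exponent, a decimal lexical form must spell out every digit.
plain_digits = format(as_decimal, "f")

# A value with more significant digits than a double can hold.
precise = "1.0000000000000000001"
lost_in_double = float(precise) == 1.0
kept_in_decimal = Decimal(precise) != 1

print(as_double, len(plain_digits), lost_in_double, kept_in_decimal)
```

The double overflows to infinity and drops the low-order digits, while the
exponent-free decimal form needs 4933 characters for the large value.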

I strongly suggest that something like the previous "real"
datatype be reintroduced (possibly called numeric, since that term is used
in the text) as the ultimate base type for all numeric types.

double and float could derive from numeric and add the IEEE ranges
and rounding behavior (numeric, being arbitrary precision, would
have no rounding effects).

decimal would derive from "numeric" by restricting
with pattern="[0-9]*\.[0-9]*" (eliminating INF and NaN).

integer would derive from decimal by restricting with pattern="[0-9]*".
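
A quick way to see what the two patterns accept is the Python sketch
below. It assumes a pattern must match the entire lexical form, and the
accepts helper is a hypothetical name used only for this illustration.

```python
import re

DECIMAL = re.compile(r"[0-9]*\.[0-9]*")  # pattern suggested for decimal
INTEGER = re.compile(r"[0-9]*")          # pattern suggested for integer

def accepts(pattern, lexical):
    """Assumption: the whole lexical form has to match the pattern."""
    return pattern.fullmatch(lexical) is not None

samples = ["3.14", "42", "INF", "NaN", "-1.5"]
for s in samples:
    print(s, accepts(DECIMAL, s), accepts(INTEGER, s))
```

Both patterns reject INF and NaN as intended.  Taken literally they also
reject a leading sign, and "42" fails the decimal pattern, so a real
definition would presumably need an optional sign and an optional fraction.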

Section 3.2.3.1: Lexical Representation

It would simplify things for Java if Infinity were represented
by "Infinity" and "-Infinity" at least in addition to INF and -INF.



Section 3.2.7.1:

P0Y1347M0D is not allowed.  The disallowance of trailing zero terms
complicates applications that write durations, with no perceivable
simplification for applications that consume durations.
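
The extra work on the writing side can be sketched as follows.  Both
function names are hypothetical, only the year, month and day terms are
handled, and the choice of 0D for an all-zero duration is an assumption.

```python
def write_all_terms(years, months, days):
    """Writer that is allowed to emit zero-valued terms."""
    return f"P{years}Y{months}M{days}D"

def write_nonzero_terms(years, months, days):
    """Writer that must drop every zero-valued term."""
    terms = ((years, "Y"), (months, "M"), (days, "D"))
    parts = [f"{n}{unit}" for n, unit in terms if n]
    # Assumption: an all-zero duration still needs one term, written as 0D.
    return "P" + ("".join(parts) or "0D")

print(write_all_terms(0, 1347, 0))      # P0Y1347M0D
print(write_nonzero_terms(0, 1347, 0))  # P1347M
```

A consumer has to cope with missing terms in either case, so the
restriction only adds a filtering step and a special case to the writer.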

Section 3.2.8:

A recurringInstant should recur with a specific time duration (say P24H),
not a time instant (2000-02-25T13:00:00Z).

There seems to be a strong desire to have a common base type for
date and time, which I believe is unnecessary.  However, if you must
have one, then the proper base type is a recurringPeriod that has period
and duration facets.  If the duration facet is xsi:null, it is an instant in
time.  If the period facet is xsi:null, then it does not repeat.

time is a recurringPeriod with a timeInstant value of that time on
an arbitrary date, a period of P24H and a null duration.

2000 is a recurringPeriod with a timeInstant value of 2000-01-01T00:00:00,
a duration of P1Y and a null period.

---04-15 is a recurringPeriod with a timeInstant value of 15 April
of an arbitrary year, a duration of P1D and a period of P1Y.

date should derive from recurringPeriod with a fixed duration of
P1D and a null period.  That is, a date is a specific day in history.
If omitted forms are desired, then "recurringPeriod" may be used.
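
The proposed model can be written down as a small Python sketch.  The class
and field names are hypothetical, None stands in for xsi:null, values are
kept as lexical strings, and the sample time and date are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RecurringPeriod:
    start: str               # the time instant value, kept as a lexical string
    duration: Optional[str]  # None (xsi:null) means an instant in time
    period: Optional[str]    # None (xsi:null) means it does not repeat

    @property
    def is_instant(self) -> bool:
        return self.duration is None

    @property
    def repeats(self) -> bool:
        return self.period is not None

# The four cases from the text; arbitrary parts are shown as placeholders.
time_of_day = RecurringPeriod("<any date>T13:00:00", duration=None, period="P24H")
year_2000 = RecurringPeriod("2000-01-01T00:00:00", duration="P1Y", period=None)
april_15 = RecurringPeriod("<any year>-04-15", duration="P1D", period="P1Y")
a_date = RecurringPeriod("2000-02-25", duration="P1D", period=None)

for name, rp in [("time", time_of_day), ("2000", year_2000),
                 ("---04-15", april_15), ("date", a_date)]:
    print(name, "instant" if rp.is_instant else rp.duration,
          "repeats" if rp.repeats else "once")
```

Only the time case is an instant, and only the time and ---04-15 cases
repeat; a date is a one-day period that occurs once.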

Section 3.3.22:

date is a time period, not a timeInstant.

Section 5.2.5:

An enumeration facet like:

<enumeration open="0|1|true|false">
	<annotation/>
	<literal value="yellow">
		<annotation/>
	</literal>
</enumeration>

would allow open enumerations (enumerations that define
some common values but do not constrain the type) and
would be consistent with the other facets that
can only appear once in a type definition (which would
allow the use of the <all> compositor to enforce it).
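
The intended behavior of the open attribute can be sketched in a few lines
of Python.  The function names are hypothetical and the sketch ignores any
other facets of the type.

```python
def is_valid(value, literals, open_enumeration=False):
    """Closed enumerations accept only listed literals; open ones accept any value."""
    return open_enumeration or value in literals

def is_known(value, literals):
    """Even when open, a consumer can still tell listed values from others."""
    return value in literals

colors = {"yellow"}
print(is_valid("teal", colors), is_valid("teal", colors, open_enumeration=True))
print(is_known("yellow", colors), is_known("teal", colors))
```

An open enumeration therefore documents the common values without
rejecting anything else that the base type allows.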


Appendix A:

The type definitions for "date" and "time" both have
period facets with a value of "00000T2400".  That value
is not a valid time duration (which would have been
P24H).  The period facet is defined in the text
as being a repeat interval (which is correct for time), but
a date doesn't repeat every 24 hours; it lasts 24 hours.

I think that there is a very strong case, just to
support the schema for schemas, for the <or>, <nor>, <and>
and <conform> facets that I proposed after the
last working draft.  They would allow you to say, for example,
that a data type is either one of the enumerated values
##other, ##targetNamespace, ... or a URI.
I'll reiterate in another message.

Received on Tuesday, 29 February 2000 10:11:29 UTC