Absence of "whiteSpace" facet from jamieson_william@jpmorgan.com on 2000-11-16 (www-xml-schema-comments@w3.org from October to December 2000)

From: <jamieson_william@jpmorgan.com>
Date: Thu, 16 Nov 2000 16:59:50 +0000
To: www-xml-schema-comments@w3.org
Message-ID: <OF35FBCE08.25260FF2-ON80256999.0044D1E9@uk.jpmorgan.com>

Hi,

Firstly, congratulations on the quality of the (proposed) XSD
specification.

I have a question about the "whiteSpace" facet for which I cannot see an
obvious answer in the documents ...

My interpretation of this facet is that rather than being a constraining
facet (i.e. whether an initial value can or cannot contain white space) it
defines how an initial value will be normalised before validation (e.g. of
length, enumerations etc.) is applied. In other words it directs the parser
to ignore, collapse or observe the whitespace when validating the content.
In effect it saves having to translate the document (in order to massage
whitespace) prior to validation.
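
As a rough illustration of the normalise-then-validate reading above, here
is a minimal Python sketch. It uses the three values that XML Schema Part 2
defines for the facet (preserve, replace, collapse). The function name
normalize_whitespace is hypothetical, and the mode is a required argument
because the sketch deliberately assumes no default.

```python
import re


def normalize_whitespace(value: str, mode: str) -> str:
    """Normalise a lexical value before other facets are checked.

    Hypothetical helper; `mode` mirrors the whiteSpace facet values.
    """
    if mode == "preserve":
        # Leave the value exactly as it appeared in the document.
        return value
    if mode == "replace":
        # Turn each tab, line feed and carriage return into a space.
        return re.sub(r"[\t\n\r]", " ", value)
    if mode == "collapse":
        # Apply "replace", squeeze runs of spaces, then trim both ends.
        replaced = re.sub(r"[\t\n\r]", " ", value)
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace mode: {mode!r}")


raw = "  a\t b\n"
# A length check would then see the normalised value, not the raw one.
print(len(raw), len(normalize_whitespace(raw, "collapse")))
```

Under this reading a length facet of 3 would accept the raw value above
with "collapse" but reject it with "preserve".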

On the assumption that this interpretation is correct (living
dangerously here) what is the impact if no "whiteSpace" facet appears in
the simpleType definition? Can I assume that "preserve" is the default
behaviour?

I hope you find the attached comments/opinions useful; please respond if
you think I can add any value by expanding on them.

Regards,
William

PS: regarding the priority feedback requests ...

1. From the Structures 3.3 ... We will not use xsi:null in any of our
schemas (used for message validation). From a simple performance viewpoint
it is preferable that an element not be present rather than be present but
empty.

2. From Structures 4.3.3 ... From our perspective fixed ordering (the
"sequence" group) of elements in an XML document's content (as opposed to
its presentation) is anathema. Unfortunately the restrictions that seem
to apply to the "all" group (i.e. must be the sole top level child, must
contain only individual elements) seem to largely negate its value. We
would like to declare a complex type's content model such that it contains
zero or one instances of simple elements a, b and c in any order and zero or
more instances of a complexElement. Schedules (mentioned in 6 below) are a
good example - a schedule contains a set of parameters (0 or 1 instances of
simple elements) such as the start date, end date, frequency, boundary
conditions and then optionally zero or more instances of the actual
scheduled events. In this example we want the schedule's parameters to be
enveloped in an "all" group but the 0..n scheduled events to be outside
this group.
While I am on the subject ... I think that it is a reasonable constraint
that if a complex element contains zero or one instances of elements a and
b and 10 instances of c then elements a, b and all instances of c (en masse)
could appear in any order but elements a and b could not be interspersed
amongst the 10 c's (i.e. if multiple instances of c are permitted then they
must be sequential).
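
The content model described in this item can be sketched as a small check
over a list of child element names. This is only an illustration of the
requested behaviour, not anything the Structures draft provides; the
function name matches_proposed_model and the element names in the usage
lines are hypothetical.

```python
from collections import Counter


def matches_proposed_model(children, optional_once, repeated):
    """Check child names against the model requested above.

    - every name in `optional_once` may occur zero or one times, anywhere;
    - `repeated` may occur any number of times, but only as one unbroken run.
    """
    # Reject any child that the model does not mention at all.
    if any(n not in optional_once and n != repeated for n in children):
        return False
    counts = Counter(children)
    # Each simple element may appear at most once.
    if any(counts[name] > 1 for name in optional_once):
        return False
    # All instances of the repeated element must be adjacent.
    positions = [i for i, n in enumerate(children) if n == repeated]
    if positions and positions[-1] - positions[0] + 1 != len(positions):
        return False
    return True


params = {"startDate", "endDate", "frequency"}
print(matches_proposed_model(
    ["frequency", "event", "event", "startDate"], params, "event"))
print(matches_proposed_model(
    ["event", "startDate", "event"], params, "event"))
```

The first call is accepted because the two events are adjacent; the second
is rejected because a parameter sits between them.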

3. From the Data Types document ... "Ed. Note: (PVB) Do we want to make the
restriction that there has to be more than one type in a union? It was in
the proposal, but I don't think it should be an error if only one appears."
I agree with the editor's view.

4. From Data Types, section 3.2.5 ... a minimum of 18 digit capacity for a
decimal is OK.

5. From Data Types, section 3.2.6 ... ordering of timeDuration is
meaningless and should not be implemented. Time has a natural order so it
is valid to do order-related operations such as compare them, find the
displacement/difference between 2 of them etc. A timeDuration such as a
Month, 2Months&5Days etc. is a container of smaller units of time (e.g.
days), it may be valid to say that one container is bigger than another but
this is not the same thing as order.
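
One way to see the difficulty with ordering durations is to apply two of
them to different start dates. The sketch below compares "one month" with
"30 days"; add_months is a hypothetical helper (the Python standard library
has no month arithmetic), and clamping to the last day of a short month is
an assumption of the sketch.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


for start in (date(2000, 2, 1), date(2000, 3, 1)):
    one_month = add_months(start, 1)
    thirty_days = start + timedelta(days=30)
    # Which duration is "bigger" depends on where it is anchored.
    print(start, one_month, thirty_days, one_month < thirty_days)
```

From 1 February 2000 the month ends before the 30 days do; from 1 March
2000 it ends after them, so neither duration is consistently the larger.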

6. From Data Types, section 3.2.7 ... Feedback re recurringDuration et al.
The following may provide you with some insight into this issue from the
perspective of the Banking industry (which is replete with such things).
Events such as rate observations, rate reset, cashflow refixes, option
exercises etc. occur on a scheduled basis. These schedules are defined as
being recurring intervals (i.e. timeDurations in xsd parlance), a start
condition (usually a start date and what to do if that date is a holiday)
and roll-over (or boundary) behaviour that applies between the recurring
intervals. The recurring intervals can be the usual integer number of
calendar years, months, days and combinations thereof but they can also
be more obscure (e.g. lunar months, IMM months,
3rdWednesdayOfTheMonth-3rdWednesdayOfTheMonth etc.). There are also various
flavours of
a "year", a calendar year, a 365day year, a 366day year, a 360day year
depending on locale, market etc. (all clearly outside the ISO8601 remit).
The boundary conditions can be complex, examples include what to do if the
end of the interval is a holiday, if it is both a holiday and month-end,
what to do when end is specified as the nth day of the month but the month
has <n days, whether to start the next interval immediately, offset it or
adjust its start date relative to the original start date of the schedule.
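
A few of the rules mentioned in this item can be sketched in Python to show
why they fall outside simple duration arithmetic. All three function names
are hypothetical, the holiday set is supplied by the caller, and "roll
forward to the next business day" is just one assumed convention among the
several that markets use.

```python
import calendar
from datetime import date, timedelta


def third_wednesday(year: int, month: int) -> date:
    """Date of the third Wednesday in the given month."""
    first = date(year, month, 1)
    # weekday(): Monday is 0, so Wednesday is 2.
    offset = (2 - first.weekday()) % 7
    return first + timedelta(days=offset + 14)


def nth_day_clamped(year: int, month: int, n: int) -> date:
    """The nth day of the month, or the last day if the month is shorter."""
    return date(year, month, min(n, calendar.monthrange(year, month)[1]))


def roll_forward(day: date, holidays: set) -> date:
    """Move a date forward until it is neither a weekend nor a holiday."""
    while day.weekday() >= 5 or day in holidays:
        day += timedelta(days=1)
    return day


print(third_wednesday(2000, 12))
print(nth_day_clamped(2001, 2, 31))
print(roll_forward(date(2000, 12, 30), {date(2001, 1, 1)}))
```

Each rule depends on a calendar and a holiday list, which is why a schedule
of this kind is hard to express as a recurring duration alone.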

Received on Thursday, 16 November 2000 12:17:35 UTC