Re: Normalization (whiteSpace)

Elena Litani wrote:

> Hi,
>
> I am reading Datatypes [2.4.2.6 whiteSpace]:
>
> "For all atomic datatypes other than string (and types derived by
> restriction from it) the value of whiteSpace is collapse and cannot be
> changed by a schema author; for string the value of whiteSpace is
> preserve; for any type derived by restriction from string the value of
> whiteSpace can be any of the three legal values".
>
> I believe the whiteSpace value for CDATA should be "replace" [cannot be
> changed by a schema author]. For TOKEN (and datatypes derived from it)
> the value of whiteSpace should be collapse [cannot be changed by a
> schema author].

The schema for Datatypes agrees with you, Elena, on the facet <whiteSpace
value="replace"/>, but not with respect to fixed="true". The same goes for
TOKEN. But now that you mention it, why is it "replace" for CDATA? Doesn't
XML expect all characters to be preserved? I.e., shouldn't it be:
<whiteSpace value="preserve" fixed="true"/>

> Am I right?
> If so, why section 2.4.2.6 mentions nothing about that?

It does.  "for any type derived by restriction from string the value of
whiteSpace can be any of the three legal values".  Both TOKEN and CDATA are
derived by restriction from string and indeed do have one of the three
legal values.
Many atomic datatypes are derived by restriction from string, so I'm
curious why you've singled out only TOKEN and CDATA.
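
For reference, the three legal values behave roughly as in the Python
sketch below. The function name normalize_white_space is hypothetical and
not part of any schema processor. The rules are my reading of the
whiteSpace facet: preserve changes nothing; replace turns each tab, line
feed and carriage return into a space; collapse does the same and then
collapses runs of spaces and trims leading and trailing spaces.

```python
import re


def normalize_white_space(value: str, white_space: str) -> str:
    """Apply one of the three whiteSpace facet values to a lexical value.

    Hypothetical helper for illustration only.
    """
    if white_space == "preserve":
        # No normalization: the value is returned unchanged.
        return value

    # "replace": every tab (#x9), line feed (#xA) and carriage return (#xD)
    # becomes a single space (#x20).
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if white_space == "replace":
        return replaced

    if white_space == "collapse":
        # After "replace", runs of spaces shrink to one space and
        # leading/trailing spaces are removed.
        return re.sub(r" +", " ", replaced).strip(" ")

    raise ValueError(f"unknown whiteSpace value: {white_space!r}")


if __name__ == "__main__":
    sample = " a\t b\n"
    for mode in ("preserve", "replace", "collapse"):
        print(mode, repr(normalize_white_space(sample, mode)))
```

Under this reading, a CDATA value keeps its length but loses its tabs and
line breaks, while a TOKEN value can also become shorter.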

mick.

Received on Tuesday, 6 February 2001 20:43:33 UTC