- From: Kay, Michael <Michael.Kay@softwareag.com>
- Date: Tue, 16 Jul 2002 12:23:43 +0200
- To: www-xml-schema-comments@w3.org
I am confused by the definitions of the built-in types normalizedString and its subtypes, in Schema Part 2.

(1) The value space of normalizedString allows all characters except xD, xA, and x9. The lexical space allows all characters except xD and x9. What is the mapping from the lexical space to the value space: what happens to an xA character in the lexical space (is it removed? replaced by an x20?). The canonical lexical representation, presumably, is the same as the string in the value space: I think we should be told.

Presumably the lexical space represents the value after the XML parser has done its normalization. So in practice, a tab character is allowed in an attribute of type normalizedString (because the XML parser will turn it to a space), but a tab character is not allowed in an element of type normalizedString (because the XML parser will leave it unchanged). Is this interpretation correct?

I find it hard to understand why the lexical space doesn't allow any string, with a mapping to the value space achieved by normalizing whitespace characters. Alternatively, the lexical space should be identical to the value space. The current definition seems nonsensical.

(2) The type "token" ("tokens" would have been a better name) says that the value space allows all characters except xA or x9. But since it is a restriction of normalizedString, it actually appears to allow all characters except xA, xD, or x9. If the restriction is going to be restated here, it should be restated in full.

(3) The three subtypes of "token" do not allow any whitespace characters in the value. Why is there no supertype for these ("token" would have been a good name) that allows any string containing no whitespace characters? I would have thought this type would be vastly more useful than most of the other built-in subtypes of string.

Michael Kay
Software AG
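To illustrate the alternative suggested in point (1), where the lexical space admits any string and the value is obtained by normalizing whitespace, here is a minimal Python sketch. It assumes the mapping is the one the whiteSpace facet describes for "replace" (normalizedString) and "collapse" (token); the function names are hypothetical and are not taken from the specification.

```python
import re

# Hypothetical helpers; the names are illustrative, not from Schema Part 2.

def ws_replace(s: str) -> str:
    """whiteSpace=replace: each x9, xA and xD becomes a single x20."""
    return s.translate(str.maketrans({"\t": " ", "\n": " ", "\r": " "}))

def ws_collapse(s: str) -> str:
    """whiteSpace=collapse: replace, then squeeze runs of x20 and trim both ends."""
    return re.sub(" +", " ", ws_replace(s)).strip(" ")

def in_normalized_string_value_space(s: str) -> bool:
    """True if s contains no x9, xA or xD."""
    return not any(c in "\t\n\r" for c in s)

def in_token_value_space(s: str) -> bool:
    """True if s is a normalizedString value with no leading, trailing or doubled x20."""
    return (
        in_normalized_string_value_space(s)
        and not s.startswith(" ")
        and not s.endswith(" ")
        and "  " not in s
    )
```

Under this reading, an xA in the lexical form is replaced by an x20 rather than removed, and the result of either mapping always falls in the corresponding value space.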
Received on Tuesday, 16 July 2002 06:23:47 UTC