- From: Steven Taschuk <staschuk@telusplanet.net>
- Date: Tue, 24 Dec 2002 11:28:55 -0700
- To: W3C XML Schema Comments list <www-xml-schema-comments@w3.org>
_Part 2: Datatypes_ defines canonical lexical representations for most of the built-in simple types, but their use is unclear. I'd like to see some amplification on this point in 1.1.

Trolling through the archives, I find a suggestion that canonicalization is useful in the context of signed XML, when intermediate parties in a transaction might replace one lexical representation with a different but equivalent one, and it is desired that this not invalidate the signature.

This is a worthwhile goal, but it seems impossible to canonicalize a document without special knowledge of every type in the document. For a silly example, consider the type

    <simpleType name='onTheHour'>
      <restriction base='dateTime'>
        <pattern value='.*T..:00.*'/>
      </restriction>
    </simpleType>

which requires the minute field of its values to be zero. Canonicalizing values of this type in general is impossible without special knowledge of the type: an algorithm for canonicalizing dateTimes in general cannot be used since conversion of an onTheHour value to UTC might change the minutes field and make the result invalid for onTheHour.

So, if canonical lexical representations cannot be used by a generic processor to canonicalize a document, then what are they for? Only the processors with special knowledge?

While I'm at it, why isn't canonical form a facet of the type?

Incidentally, the above example, silly as it is, illustrates an important respect in which values of a type derived by restriction cannot be treated by a generic processor as values of the base type. It is a bit surprising that there are any such respects at all (if, like me, you are coming from an object-oriented view of "type"); I think this point deserves some commentary in 1.1.

-- 
Steven Taschuk            |  o- @
staschuk@telusplanet.net  | 7O   )
                          |  "  (   Hummingbird
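
To make the onTheHour problem concrete, here is a minimal Python sketch. It is an illustration only, not a schema processor: `canonicalize_datetime` is a hypothetical stand-in for a generic dateTime canonicalizer (convert to UTC, write the zone as `Z`), `is_on_the_hour` approximates the pattern facet with a Python regular expression, and the sample value with its -03:30 offset is made up for the example.

```python
import re
from datetime import datetime, timezone

# XML Schema patterns must match the whole lexical value, so
# re.fullmatch is the closest Python analogue of the pattern facet.
ON_THE_HOUR = re.compile(r".*T..:00.*")


def is_on_the_hour(lexical: str) -> bool:
    """Check a lexical value against the onTheHour pattern facet."""
    return ON_THE_HOUR.fullmatch(lexical) is not None


def canonicalize_datetime(lexical: str) -> str:
    """Hypothetical generic canonicalizer: normalize a zoned value to UTC."""
    value = datetime.fromisoformat(lexical)
    if value.tzinfo is None:
        # No time zone given: this sketch leaves the value alone.
        return lexical
    utc = value.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ")


# A value on the hour in a zone with a half-hour offset (assumed sample data).
original = "2002-12-24T11:00:00-03:30"
canonical = canonicalize_datetime(original)

print(original, is_on_the_hour(original))    # 2002-12-24T11:00:00-03:30 True
print(canonical, is_on_the_hour(canonical))  # 2002-12-24T14:30:00Z False
```

Both strings denote the same instant, yet only the first satisfies the pattern facet, so a processor that knows nothing beyond "this is derived from dateTime" would turn a valid onTheHour value into an invalid one.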
Received on Tuesday, 24 December 2002 14:15:05 UTC