- From: John Boyer <jboyer@PureEdge.com>
- Date: Thu, 14 Sep 2000 15:03:09 -0700
- To: <xml-editor@w3.org>
- Message-ID: <BFEDKCINEPLBDLODCODKEECDCFAA.jboyer@PureEdge.com>
The rules for attribute value normalization need to be looked at yet again. Yes, I have seen Erratum E70, and it is fine as far as it goes. The issue I am raising here is that the method of attribute value normalization does not make sense for every attribute type and should be improved for the sake of design consistency.

Consider a value with whitespace sprinkled throughout it, in an attribute whose type is other than CDATA. According to both the XML spec and Erratum E70, the whitespace is further normalized by stripping all leading and trailing whitespace and by reducing every run of consecutive whitespace characters to a single space. I think this whitespace normalization scheme only works for NMTOKENS. The remaining non-CDATA attribute types, like ID, would benefit from having all of the whitespace taken out, period, along with any other characters not permitted by the Name production (a short sketch appears at the end of this message). Otherwise, why bother normalizing these attributes at all, since the result after normalization still violates the validity constraints?

This issue comes up when pinning down the difference between validating and non-validating processors while generating a canonical form of XML for use in an XML signature. If the signer uses a non-validating processor, he may be able to create a signature over data that a verifier using a validating processor cannot validate, because the two processors normalize the attribute values differently. Sure, it's a weird case where the signer gets what he pays for, but it's also a consistency thing.

John Boyer
PureEdge Solutions Inc.
jboyer@PureEdge.com
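P.S. Here is a minimal sketch of the two behaviors, in Python. The names are illustrative only, and `NAME_CHAR` is an ASCII approximation of the Name production, which actually admits a much larger character repertoire; the sketch also ignores the Name production's restriction on the first character.

```python
import re

# ASCII approximation of XML NameChar; the real production is broader.
NAME_CHAR = re.compile(r'[A-Za-z0-9._:-]')

def normalize_non_cdata(value: str) -> str:
    """Current rule (XML 1.0 sec. 3.3.3 as amended by E70): discard
    leading and trailing whitespace, collapse internal runs of
    whitespace to a single space."""
    return ' '.join(value.split())

def normalize_id_proposed(value: str) -> str:
    """Rule proposed above for Name-valued types such as ID: strip
    every character the Name production does not permit, whitespace
    included."""
    return ''.join(ch for ch in value if NAME_CHAR.match(ch))

raw = ' sec  tion-3.1 '
print(normalize_non_cdata(raw))    # 'sec tion-3.1' -- still not a valid Name
print(normalize_id_proposed(raw))  # 'section-3.1'  -- satisfies the ID constraint
```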