- From: Richard Tobin <richard@cogsci.ed.ac.uk>
- Date: Thu, 3 Feb 2000 14:21:43 GMT
- To: xml-editor@w3.org
- Cc: richard@cogsci.ed.ac.uk, ht@cogsci.ed.ac.uk
There is some uncertainty (discussed in xml-dev) about normalisation of attributes containing character references.

Is the algorithm described in section 3.3.3 (updated in E24) applied after entity expansion has already been done? Presumably not, since it includes processing entity references. (It could mean that there is an extra pass of entity expansion for attributes, but that would be odd.)

Are the five bullet points in the algorithm intended to be mutually exclusive alternatives for each character and reference, or are they applied in sequence? They appear to be alternatives, but some parsers have interpreted them as meaning that the conversion of whitespace characters to #x20 is done even to characters resulting from character entity references.

To ensure that this is clear, how does the algorithm apply to this example:

    <!DOCTYPE el [
    <!ELEMENT el ANY>
    <!ATTLIST el at NMTOKENS #IMPLIED>
    ]>
    <el at="a &#9; b"/>

Is the character reference converted to a space? If it is not, is the document valid, and what value is returned to the application?

If it is valid and the value returned is the sequence "a", space, tab, space, "b", then the normalisation has not had the (presumably desired) effect of converting a tokenised attribute to a sequence of NMTOKENs separated by single space characters.

-- Richard
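The two readings in question can be made concrete with a small sketch. This is only an illustration, not the text of section 3.3.3 or of E24: the function name `normalize` and the flags `tokenized` and `refs_become_space` are hypothetical, the tokenising regular expression is a simplification, and the final step assumes the usual rule for non-CDATA attributes (strip leading and trailing #x20, collapse runs of #x20 to one).

```python
import re

# One token is a character reference, a general entity reference,
# or a single literal character (simplified; not a full XML tokeniser).
TOKEN = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&(\w[\w.\-]*);|(.)", re.DOTALL)

WHITESPACE = "\x20\x0d\x0a\x09"


def normalize(raw, entities=None, tokenized=False, refs_become_space=False):
    """Sketch of attribute-value normalisation under two readings.

    refs_become_space=False: the cases are alternatives, so a character
    produced by a character reference is appended unchanged.
    refs_become_space=True: whitespace conversion is also applied to
    characters produced by character references.
    """
    entities = entities or {}  # hypothetical map: entity name -> replacement text
    out = []
    for m in TOKEN.finditer(raw):
        hex_ref, dec_ref, name, ch = m.groups()
        if hex_ref is not None or dec_ref is not None:
            c = chr(int(hex_ref, 16)) if hex_ref is not None else chr(int(dec_ref))
            if refs_become_space and c in WHITESPACE:
                c = " "
            out.append(c)
        elif name is not None:
            # Entity reference: process its replacement text recursively.
            out.append(normalize(entities[name], entities, False, refs_become_space))
        elif ch in WHITESPACE:
            out.append(" ")  # literal whitespace always becomes #x20
        else:
            out.append(ch)
    value = "".join(out)
    if tokenized:
        # Assumed non-CDATA step: trim #x20 and collapse runs of #x20.
        value = re.sub(" +", " ", value.strip(" "))
    return value


print(repr(normalize("a &#9; b", tokenized=True)))                          # 'a \t b'
print(repr(normalize("a &#9; b", tokenized=True, refs_become_space=True)))  # 'a b'
```

Under the "alternatives" reading the tab survives between two single spaces, which is exactly the value asked about above; under the "in sequence" reading the value collapses to two tokens separated by one space.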
Received on Thursday, 3 February 2000 09:21:47 UTC