Attribute normalisation

From: Richard Tobin <richard@cogsci.ed.ac.uk>
Date: Thu, 3 Feb 2000 14:21:43 GMT
Message-Id: <5544.200002031421@doyle.cogsci.ed.ac.uk>
To: xml-editor@w3.org
Cc: richard@cogsci.ed.ac.uk, ht@cogsci.ed.ac.uk

There is some uncertainty (discussed in xml-dev) about normalisation
of attributes containing character references.

Is the algorithm described in section 3.3.3 (updated in E24) applied
after entity expansion has already been done?  Presumably not, since
it includes processing entity references.  (It could mean that there
is an extra pass of entity expansion for attributes, but that would be
odd.)

Are the five bullet points in the algorithm intended to be mutually
exclusive alternatives for each character and reference, or are they
applied in sequence?  They appear to be alternatives, but some parsers
have interpreted them as meaning that the conversion of whitespace
characters to #x20 is applied even to characters resulting from
character references.

To ensure that this is clear, how does the algorithm apply to this
example:

<!DOCTYPE el [
<!ELEMENT el ANY>
<!ATTLIST el at NMTOKENS #IMPLIED>
]>
<el at="a &#9; b"/>

Is the character reference converted to a space?

If it is not, is the document valid, and what value is returned to the
application?

If it is valid and the value returned is the sequence

  a space tab space b

then the normalisation has not had the (presumably desired) effect of
converting a tokenised attribute to a sequence of NMTOKENs separated
by single space characters.
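
The two readings can be compared with a rough sketch.  This is only an
illustration under stated assumptions, not the algorithm of any
particular parser: the function names are hypothetical, only numeric
character references are handled, and entity references and the
special treatment of #xD#xA are left out.  The final step for
non-CDATA attributes (trimming and collapsing #x20) follows the text
of section 3.3.3.

```python
import re

# Assumption: only numeric character references (&#9; or &#x9;) are recognised.
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")
# The whitespace characters named in section 3.3.3: #x20, #xD, #xA, #x9.
WHITESPACE = "\x20\x0d\x0a\x09"


def _referenced_char(match):
    """Return the character named by a matched character reference."""
    if match.group(1) is not None:
        return chr(int(match.group(1), 16))
    return chr(int(match.group(2)))


def normalise_alternatives(raw):
    """Reading 1: the rules are alternatives, one rule per item.

    A character reference is appended as-is and is never revisited by
    the whitespace rule.
    """
    out = []
    pos = 0
    while pos < len(raw):
        match = CHAR_REF.match(raw, pos)
        if match:
            out.append(_referenced_char(match))
            pos = match.end()
        elif raw[pos] in WHITESPACE:
            out.append(" ")
            pos += 1
        else:
            out.append(raw[pos])
            pos += 1
    return "".join(out)


def normalise_in_sequence(raw):
    """Reading 2: references are expanded first, then the whitespace
    rule is applied to every character, including expanded ones."""
    expanded = CHAR_REF.sub(_referenced_char, raw)
    return "".join(" " if ch in WHITESPACE else ch for ch in expanded)


def collapse_spaces(value):
    """Non-CDATA step: collapse runs of #x20 and drop leading and
    trailing #x20.  Tabs are deliberately left untouched."""
    return re.sub(" +", " ", value).strip(" ")


if __name__ == "__main__":
    raw = "a &#9; b"
    print(repr(collapse_spaces(normalise_alternatives(raw))))
    print(repr(collapse_spaces(normalise_in_sequence(raw))))
```

Under the first reading the tab survives between the two spaces, so
the value is not a list of tokens separated by single spaces; under
the second reading the value collapses to "a b".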

-- Richard
Received on Thursday, 3 February 2000 09:21:47 GMT