Attribute normalization

From: Kevin Regan <kevinr@valicert.com>
Date: Fri, 23 Jun 2000 15:26:23 -0700
To: John Boyer <jboyer@PureEdge.com>
Cc: w3c-ietf-xmldsig@w3.org
Message-id: <27FF4FAEA8CDD211B97E00902745CBE2015B7ADA@seine.valicert.com>

I have a question about section 4 of the XML C14N spec, which describes
the normalization of attributes:


Namespace and Attribute Nodes- a space, the node's QName, an equals
sign, an open double quote, the modified string value, and a close
double quote. The string value of the node is modified by replacing all
ampersands (&) with &amp;, all double quote characters with &quot;, and
the whitespace characters #x9, #xA, and #xD, with character references.
The character references are written in uppercase hexadecimal with no
leading zeroes (for example, #xD is represented by the character
reference &#xD;).
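
As a minimal sketch of the serialization rule quoted above, the following Python function applies only the replacements named in that passage. The function name c14n_attribute is hypothetical and is not taken from the spec or from any library.

```python
def c14n_attribute(qname: str, value: str) -> str:
    """Serialize one attribute node per the rule quoted above (sketch only)."""
    # Replace ampersands first so the references added below are not re-escaped.
    escaped = value.replace("&", "&amp;").replace('"', "&quot;")
    for ch in ("\t", "\n", "\r"):
        # Uppercase hexadecimal, no leading zeroes: &#x9; &#xA; &#xD;
        escaped = escaped.replace(ch, "&#x%X;" % ord(ch))
    # A space, the QName, an equals sign, and the quoted, modified value.
    return ' %s="%s"' % (qname, escaped)


print(c14n_attribute("title", 'line1\r\nsay "hi" & bye'))
```

The order of the replacements matters: escaping the ampersand last would corrupt the character references that were just written.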


However, when an XML processor reads in and parses an XML document, it
does the following (from the XML 1.0 spec, section 3.3.3):


3.3.3 Attribute-Value Normalization
Before the value of an attribute is passed to the application or checked
for validity, the XML processor must normalize it as follows: 

-- a character reference is processed by appending the referenced
character to the attribute value 
-- an entity reference is processed by recursively processing the
replacement text of the entity 
-- a whitespace character (#x20, #xD, #xA, #x9) is processed by
appending #x20 to the normalized value, except that only a single #x20
is appended for a "#xD#xA" sequence that is part of an external parsed
entity or the literal entity value of an internal parsed entity 
-- other characters are processed by appending them to the normalized
value

If the declared value is not CDATA, then the XML processor must further
process the normalized attribute value by discarding any leading and
trailing space (#x20) characters, and by replacing sequences of space
(#x20) characters by a single space (#x20) character.

All attributes for which no declaration has been read should be treated
by a non-validating parser as if declared CDATA.
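
The rules quoted above can be tried out with a small experiment. This is only a sketch, and it assumes that Python's xml.etree.ElementTree (which uses the expat parser) behaves as a conforming non-validating processor. The element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# "literal" contains a raw tab and a raw newline typed directly in the value;
# "referenced" contains the same kind of characters written as character references.
doc = '<e literal="a\tb\nc" referenced="a&#x9;b&#xA;c&#xD;d"/>'
elem = ET.fromstring(doc)

literal = elem.get("literal")
referenced = elem.get("referenced")

print(repr(literal))     # 'a b c'        (whitespace rule: each becomes #x20)
print(repr(referenced))  # 'a\tb\nc\rd'   (character-reference rule: appended as-is)
```

The two attributes go through different bullets of section 3.3.3: literal whitespace is replaced by #x20, while a character reference is processed by appending the referenced character itself.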


So it seems that #x20 is the only whitespace character that will ever be
seen in attribute values. Why does the spec mention the other values
(#xD, #xA, #x9)?

Kevin Regan


Received on Friday, 23 June 2000 18:33:05 UTC
