RE: Attribute normalization

From: John Boyer <jboyer@PureEdge.com>
Date: Fri, 23 Jun 2000 15:48:11 -0700
To: "Kevin Regan" <kevinr@valicert.com>
Cc: <w3c-ietf-xmldsig@w3.org>

Hi Kevin,

Good question.  #xD, #xA and #x9 can appear in a normalized attribute value
if they were written as character references (for example, &#xD;).
This is not apparent from the XML 1.0 spec (specifically the passage you
cited), but please see the XML errata [1,2].

[1] http://www.w3.org/XML/xml-19980210-errata#E24
[2] http://www.w3.org/XML/xml-19980210-errata#E61
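
The difference can be observed with a real parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (which is backed by expat); the element and attribute names are made up for illustration. Literal whitespace in an attribute value should come back as #x20, while the same characters written as character references should survive normalization.

```python
import xml.etree.ElementTree as ET

# "literal" holds a real tab and a real newline inside the attribute value.
# "refs" holds tab, newline and carriage return written as character references.
doc = '<e literal="a\tb\nc" refs="a&#x9;b&#xA;c&#xD;d"/>'
elem = ET.fromstring(doc)

literal_value = elem.get("literal")
refs_value = elem.get("refs")

print(repr(literal_value))  # 'a b c'        (literal whitespace became #x20)
print(repr(refs_value))     # 'a\tb\nc\rd'   (referenced characters were kept)
```

This is why a canonical form has to say something about #x9, #xA and #xD in attribute values: an application can legitimately receive them.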

John Boyer,
Software Development Manager

PureEdge Solutions (formerly UWI.Com)
Creating Binding E-Commerce

v:250-479-8334, ext. 143 f:250-479-3772
1-888-517-2675  http://www.PureEdge.com

-----Original Message-----
From: w3c-ietf-xmldsig-request@w3.org
[mailto:w3c-ietf-xmldsig-request@w3.org]On Behalf Of Kevin Regan
Sent: Friday, June 23, 2000 3:26 PM
To: John Boyer
Cc: w3c-ietf-xmldsig@w3.org
Subject: Attribute normalization

I have a question on section 4 of the XML C14N spec.  This section
describes how attribute values are written out:


Namespace and Attribute Nodes- a space, the node's QName, an equals
sign, an open double quote, the modified string value, and a close
double quote. The string value of the node is modified by replacing all
ampersands (&) with &amp;, all double quote characters with &quot;, and
the whitespace characters #x9, #xA, and #xD, with character references.
The character references are written in uppercase hexadecimal with no
leading zeroes (for example, #xD is represented by the character
reference &#xD;).
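
The serialization rule quoted above can be sketched as follows. This is only an illustration in Python, not code from the C14N spec: the function names c14n_attr_value and c14n_attr are hypothetical, and only the replacements named in the quoted passage are applied (other revisions of the spec may list additional characters).

```python
def c14n_attr_value(value: str) -> str:
    """Escape an attribute string value as described in the quoted passage."""
    # The ampersand is handled first so that the "&" introduced by the
    # later replacements is not escaped a second time.
    replacements = [
        ("&", "&amp;"),
        ('"', "&quot;"),
        ("\t", "&#x9;"),  # uppercase hex, no leading zeroes
        ("\n", "&#xA;"),
        ("\r", "&#xD;"),
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


def c14n_attr(qname: str, value: str) -> str:
    """A space, the QName, an equals sign, and the quoted modified value."""
    return f' {qname}="{c14n_attr_value(value)}"'


print(c14n_attr("title", 'Tom & "Jerry"\r'))
```

With the input above, the carriage return in the string value is written back out as a character reference, so it is preserved when the canonical form is parsed again.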


However, when an XML processor reads in and parses an XML document, it
does the following (from XML 1.0 spec, section 3.3.3):


3.3.3 Attribute-Value Normalization
Before the value of an attribute is passed to the application or checked
for validity, the XML processor must normalize it as follows:

-- a character reference is processed by appending the referenced
character to the attribute value
-- an entity reference is processed by recursively processing the
replacement text of the entity
-- a whitespace character (#x20, #xD, #xA, #x9) is processed by
appending #x20 to the normalized value, except that only a single #x20
is appended for a "#xD#xA" sequence that is part of an external parsed
entity or the literal entity value of an internal parsed entity
-- other characters are processed by appending them to the normalized
value

If the declared value is not CDATA, then the XML processor must further
process the normalized attribute value by discarding any leading and
trailing space (#x20) characters, and by replacing sequences of space
(#x20) characters by a single space (#x20) character.

All attributes for which no declaration has been read should be treated
by a non-validating parser as if declared CDATA.
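
The rules quoted above can be sketched as a small function. This is a simplified illustration in Python, not a conforming implementation: normalize_attr and its parameters are hypothetical names, entity replacement text is supplied through a plain dictionary, an undefined entity simply raises KeyError, and the special case for "#xD#xA" sequences inside entities is not handled.

```python
import re
from typing import Dict, Optional

# Matches a hex character reference, a decimal character reference,
# or a named entity reference, in that order.
_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([^;&\s]+);")


def normalize_attr(raw: str,
                   entities: Optional[Dict[str, str]] = None,
                   cdata: bool = True) -> str:
    entities = entities or {}
    out = []
    pos = 0
    while pos < len(raw):
        m = _REF.match(raw, pos)
        if m:
            hex_ref, dec_ref, name = m.groups()
            if hex_ref is not None:
                # Character reference: append the referenced character as-is.
                out.append(chr(int(hex_ref, 16)))
            elif dec_ref is not None:
                out.append(chr(int(dec_ref)))
            else:
                # Entity reference: recursively process the replacement text.
                out.append(normalize_attr(entities[name], entities))
            pos = m.end()
        elif raw[pos] in " \t\n\r":
            # Literal whitespace: append #x20.
            out.append(" ")
            pos += 1
        else:
            out.append(raw[pos])
            pos += 1
    value = "".join(out)
    if not cdata:
        # Non-CDATA: trim leading/trailing spaces and collapse runs of spaces.
        value = re.sub(" +", " ", value.strip(" "))
    return value


print(repr(normalize_attr("a\tb&#xA;c")))  # 'a b\nc'
```

Note that the character-reference branch appends the referenced character directly and never passes through the whitespace branch, so a referenced #xA, #xD or #x9 is not turned into #x20.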


So, it seems that #x20 is the only whitespace character that will be seen
in attribute values.  Why does the spec mention the other values (#xD, #xA,
#x9)?

Kevin Regan

Received on Friday, 23 June 2000 18:48:16 UTC