RE: Attribute normalization from John Boyer on 2000-06-23 (w3c-ietf-xmldsig@w3.org from April to June 2000)

From: John Boyer <jboyer@PureEdge.com>
Date: Fri, 23 Jun 2000 15:48:11 -0700
To: "Kevin Regan" <kevinr@valicert.com>
Cc: <w3c-ietf-xmldsig@w3.org>
Message-ID: <BFEDKCINEPLBDLODCODKCEHHCDAA.jboyer@PureEdge.com>

Hi Kevin,

Good question.  #xD, #xA and #x9 can appear in a normalized attribute value
if they were written as character references.
That is not apparent from the XML 1.0 spec (specifically the passage you
cited), but please see the XML errata [1,2].

[1] http://www.w3.org/XML/xml-19980210-errata#E24
[2] http://www.w3.org/XML/xml-19980210-errata#E61
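
A quick way to see the difference is to parse the same attribute twice, once
with a literal tab and once with the tab written as a character reference.
This is only a sketch: it assumes Python's standard xml.etree.ElementTree
parser, and the element name "e", attribute name "a" and helper parsed_attr
are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def parsed_attr(doc, name="a"):
    """Return the attribute value as the parser hands it to the application."""
    return ET.fromstring(doc).get(name)


# A literal tab between the quotes is normalized to a space (#x20).
literal = parsed_attr('<e a="x\ty"/>')

# A tab written as a character reference survives normalization.
by_ref = parsed_attr('<e a="x&#x9;y"/>')

print(repr(literal))  # 'x y'
print(repr(by_ref))   # 'x\ty'
```

So a canonicalizer can still meet #x9, #xA or #xD in an attribute value it
receives from the parser.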

Thanks,
***************************************
John Boyer,
Software Development Manager

PureEdge Solutions (formerly UWI.Com)
Creating Binding E-Commerce

v:250-479-8334, ext. 143 f:250-479-3772
1-888-517-2675  http://www.PureEdge.com
***************************************



-----Original Message-----
From: w3c-ietf-xmldsig-request@w3.org
[mailto:w3c-ietf-xmldsig-request@w3.org] On Behalf Of Kevin Regan
Sent: Friday, June 23, 2000 3:26 PM
To: John Boyer
Cc: w3c-ietf-xmldsig@w3.org
Subject: Attribute normalization



I have a question on section 4 of the XML C14N spec.  In this section,
it mentions the normalization of attributes:

-------------------------------------------------------

Namespace and Attribute Nodes- a space, the node's QName, an equals
sign, an open double quote, the modified string value, and a close
double quote. The string value of the node is modified by replacing all
ampersands (&) with &amp;, all double quote characters with &quot;, and
the whitespace characters #x9, #xA, and #xD, with character references.
The character references are written in uppercase hexadecimal with no
leading zeroes (for example, #xD is represented by the character
reference &#xD;).

--------------------------------------------------------
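
The rule quoted above can be sketched in a few lines of Python. This is an
illustration only, not code from the C14N spec: the function name c14n_attr
is hypothetical, and it performs only the replacements listed in the quoted
passage.

```python
def c14n_attr(qname, value):
    """Serialize one attribute the way the quoted passage describes."""
    escaped = (
        value.replace("&", "&amp;")    # ampersands first, so the entities
                                       # added below are not escaped again
             .replace('"', "&quot;")
             .replace("\t", "&#x9;")   # uppercase hex, no leading zeroes
             .replace("\n", "&#xA;")
             .replace("\r", "&#xD;")
    )
    # a space, the QName, an equals sign, and the quoted modified value
    return f' {qname}="{escaped}"'


print(c14n_attr("a", "x\ty"))  # prints:  a="x&#x9;y"
```

The order of the replacements matters: escaping the ampersand last would
turn &quot; into &amp;quot;.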

However, when an XML processor reads in and parses an XML document, it
should do the following (from the XML 1.0 spec, section 3.3.3):

----------------------------------------------------

3.3.3 Attribute-Value Normalization
Before the value of an attribute is passed to the application or checked
for validity, the XML processor must normalize it as follows:

-- a character reference is processed by appending the referenced
character to the attribute value
-- an entity reference is processed by recursively processing the
replacement text of the entity
-- a whitespace character (#x20, #xD, #xA, #x9) is processed by
appending #x20 to the normalized value, except that only a single #x20
is appended for a "#xD#xA" sequence that is part of an external parsed
entity or the literal entity value of an internal parsed entity
-- other characters are processed by appending them to the normalized
value

If the declared value is not CDATA, then the XML processor must further
process the normalized attribute value by discarding any leading and
trailing space (#x20) characters, and by replacing sequences of space
(#x20) characters by a single space (#x20) character.

All attributes for which no declaration has been read should be treated
by a non-validating parser as if declared CDATA.

--------------------------------------------------------
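
As a rough sketch, the steps quoted above could look like the following in
Python. The name normalize_attribute_value and its cdata flag are
hypothetical, and the sketch simplifies in two ways: entity references are
not expanded, and a literal "#xD#xA" pair is turned into a single #x20
wherever it occurs rather than only inside entities.

```python
import re

# Character references (hex or decimal), a literal CR LF pair, or one
# literal whitespace character.
_PIECE = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|\r\n|[ \t\r\n]")


def normalize_attribute_value(raw, cdata=True):
    out = []
    pos = 0
    for m in _PIECE.finditer(raw):
        out.append(raw[pos:m.start()])      # other characters: copied as-is
        if m.group(1) is not None:          # &#x...; appends that character
            out.append(chr(int(m.group(1), 16)))
        elif m.group(2) is not None:        # &#...; appends that character
            out.append(chr(int(m.group(2))))
        else:                               # literal whitespace appends #x20
            out.append(" ")
        pos = m.end()
    out.append(raw[pos:])
    value = "".join(out)
    if not cdata:
        # Non-CDATA: trim leading/trailing #x20 and collapse runs of #x20.
        value = re.sub(" +", " ", value.strip(" "))
    return value


print(repr(normalize_attribute_value("a\tb")))      # 'a b'
print(repr(normalize_attribute_value("a&#x9;b")))   # 'a\tb'
```

Characters that come from character references are appended as they are and
are not run through the whitespace rule a second time.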

So, it seems that only #x20 characters will be seen in attribute values.
Why does the spec mention the other values (#xD, #xA, #x9)?

Thanks,
Kevin Regan

kevinr@valicert.com

Received on Friday, 23 June 2000 18:48:16 UTC