- From: Kevin Regan <kevinr@valicert.com>
- Date: Fri, 23 Jun 2000 15:26:23 -0700
- To: John Boyer <jboyer@PureEdge.com>
- Cc: w3c-ietf-xmldsig@w3.org
- Message-id: <27FF4FAEA8CDD211B97E00902745CBE2015B7ADA@seine.valicert.com>
I have a question on section 4 of the XML C14N spec. In this section, it mentions the normalization of attributes:

-------------------------------------------------------
Namespace and Attribute Nodes- a space, the node's QName, an equals sign, an open double quote, the modified string value, and a close double quote. The string value of the node is modified by replacing all ampersands (&) with &amp;, all double quote characters with &quot;, and the whitespace characters #x9, #xA, and #xD, with character references. The character references are written in uppercase hexadecimal with no leading zeroes (for example, #xD is represented by the character reference &#xD;).
--------------------------------------------------------

However, when an XML processor reads in and parses an XML document, it should do the following (from XML 1.0 spec, section 3.3.3):

----------------------------------------------------
3.3.3 Attribute-Value Normalization

Before the value of an attribute is passed to the application or checked for validity, the XML processor must normalize it as follows:

-- a character reference is processed by appending the referenced character to the attribute value
-- an entity reference is processed by recursively processing the replacement text of the entity
-- a whitespace character (#x20, #xD, #xA, #x9) is processed by appending #x20 to the normalized value, except that only a single #x20 is appended for a "#xD#xA" sequence that is part of an external parsed entity or the literal entity value of an internal parsed entity
-- other characters are processed by appending them to the normalized value

If the declared value is not CDATA, then the XML processor must further process the normalized attribute value by discarding any leading and trailing space (#x20) characters, and by replacing sequences of space (#x20) characters by a single space (#x20) character.

All attributes for which no declaration has been read should be treated by a non-validating parser as if declared CDATA.
--------------------------------------------------------

So, it seems that only #x20 characters will be seen in attribute values. Why does the spec mention the other values (#xD, #xA, #x9)?

Thanks,

Kevin Regan
kevinr@valicert.com
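
For concreteness, the C14N attribute escaping rule quoted above could be sketched in Python roughly as follows. This is only an illustration of the quoted text, not code from either spec: the names escape_attr_value and format_attr are hypothetical, and only the replacements listed in the quoted passage are handled.

```python
def escape_attr_value(value: str) -> str:
    """Apply the replacements listed in the quoted C14N passage (a sketch)."""
    replacements = [
        ("&", "&amp;"),    # ampersand first, so later entities are not re-escaped
        ('"', "&quot;"),   # double quote
        ("\t", "&#x9;"),   # #x9, uppercase hex, no leading zeroes
        ("\n", "&#xA;"),   # #xA
        ("\r", "&#xD;"),   # #xD
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


def format_attr(qname: str, value: str) -> str:
    """A space, the QName, an equals sign, and the escaped value in double quotes."""
    return f' {qname}="{escape_attr_value(value)}"'


if __name__ == "__main__":
    # Hypothetical attribute name and value, chosen only to exercise each rule.
    print(format_attr("title", 'a & "b"\r'))
```

The ampersand is replaced before anything else so that the ampersands introduced by the other replacements are left alone.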
Received on Friday, 23 June 2000 18:33:05 UTC