- From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- Date: Wed, 09 May 2001 17:50:12 -0700
- To: w3c-ietf-xmldsig@w3.org
In the summary in section 1.1, the eleventh point states:

    Special characters in attribute values and character content are
    replaced by character references

This should probably state something like:

    Special characters in attribute values and character content are
    replaced by predefined entity references, or if no such reference
    is defined, by hexadecimal character references.

The exact wording can be worked on. However, most of the time canonicalization replaces special characters in attribute values and character content with entity references, not with character references. As currently written, the statement implies that a character such as < would be replaced by a character reference such as &#x3C;. However, section 2.3 gives the more complete description:

    Attribute Nodes- a space, the node's QName, an equals sign, an open
    quotation mark (double quote), the modified string value, and a
    close quotation mark (double quote). The string value of the node
    is modified by replacing all ampersands (&) with &amp;, all open
    angle brackets (<) with &lt;, all quotation mark characters with
    &quot;, and the whitespace characters #x9, #xA, and #xD, with
    character references. The character references are written in
    uppercase hexadecimal with no leading zeroes (for example, #xD is
    represented by the character reference &#xD;).

    Text Nodes- the string value, except all ampersands are replaced by
    &amp;, all open angle brackets (<) are replaced by &lt;, all closing
    angle brackets (>) are replaced by &gt;, and all #xD characters are
    replaced by &#xD;.

Note that section 2.4 of the XML 1.0 spec, 2nd edition, clearly indicates that entity references are not a special kind of character reference:

    Text consists of intermingled character data and markup.
    [Definition: Markup takes the form of start-tags, end-tags,
    empty-element tags, entity references, character references,
    comments, CDATA section delimiters, document type declarations,
    processing instructions, XML declarations, text declarations, and
    any white space that is at the top level of the document entity
    (that is, outside the document element and not inside any other
    markup).]

Or, from the BNF grammar:

    Reference ::= EntityRef | CharRef

--
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|               Java I/O (O'Reilly & Associates, 1999)               |
|            http://metalab.unc.edu/javafaq/books/javaio/            |
|   http://www.amazon.com/exec/obidos/ISBN=1565924851/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+
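For illustration only, the section 2.3 replacement rules quoted in this message could be sketched roughly as follows in Python. The function and table names are hypothetical and not taken from any specification or library. The sketch covers only the character replacements described above, not the rest of canonicalization. It shows the mix being pointed out: predefined entity references for &, <, > and the double quote, and hexadecimal character references only for the whitespace characters.

```python
# Hypothetical helper names; a sketch of the section 2.3 replacement rules only.

# Attribute values: &, <, and " become predefined entity references;
# #x9, #xA, and #xD become uppercase hexadecimal character references.
_ATTRIBUTE_TABLE = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    '"': "&quot;",
    "\t": "&#x9;",
    "\n": "&#xA;",
    "\r": "&#xD;",
})

# Text nodes: &, <, and > become predefined entity references;
# only #xD becomes a character reference.
_TEXT_TABLE = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "\r": "&#xD;",
})


def escape_attribute_value(value: str) -> str:
    """Return the modified string value of an attribute node."""
    # translate() maps each character once, so the ampersands introduced
    # by the replacements are never escaped a second time.
    return value.translate(_ATTRIBUTE_TABLE)


def escape_text(value: str) -> str:
    """Return the modified string value of a text node."""
    return value.translate(_TEXT_TABLE)


if __name__ == "__main__":
    print(escape_attribute_value('a < b & "c"\r'))
    print(escape_text("1 < 2 > 0 & x\r\n"))
```

Note that under these rules a closing angle bracket is left untouched in attribute values, while tab and line feed are left untouched in text nodes.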
Received on Wednesday, 9 May 2001 17:57:16 UTC