- From: Francois Yergeau <FYergeau@alis.com>
- Date: Thu, 20 Feb 2003 11:02:30 -0500
- To: John Boyer <JBoyer@PureEdge.com>, xml-editor@w3.org
- Cc: w3c-ietf-xmldsig@w3.org
John Boyer wrote: > On the other hand, the XML rule for element 'content' refers > to 'CharData', which only forbids the use of less-than (<) > and ampersand (&) in character content. Not quite. Production [14] reads: [14] CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*) The notation for productions is defined in section 6 (http://www.w3.org/TR/REC-xml#sec-notation), where we have in particular: [^abc], [^#xN#xN#xN] matches any Char with a value not among the characters given. where "Char" is a link to production [2] Char. This is from the second edition. The first edition said "matches any character with a value not among the characters given.", where "character" was a link to the definition of character just above production [2]: "A character is an atomic unit of text as specified by ISO/IEC 10646 [ISO/IEC 10646]. Legal characters are tab, carriage return, line feed, and the legal graphic characters of Unicode and ISO/IEC 10646." So the controls (except tab, carriage return, line feed) were already excluded. The change was first effected by E93 to the first edition (http://www.w3.org/XML/xml-19980210-errata#E93) published in July 2000. > The canonicalization > rule for text node processing was based on the CharData rule, > so it is possible to get a correct c14n program to write data > that Xerces cannot read and that is possibly not well-formed XML. Not by the above. > If there is an erratum that rewrites rule 14 in a manner > similar to the rule above (such that CharData reflects the > restrictions of Char), then the C14N Recommendation will need > an erratum to the processing model for text nodes. I don't think [14] needs an erratum, but C14N probably so. Regards, -- François Yergeau
Received on Thursday, 20 February 2003 11:03:18 UTC