RE: Possible XML and C14N errata from John Boyer on 2003-02-21 (w3c-ietf-xmldsig@w3.org from January to March 2003)

From: John Boyer <JBoyer@PureEdge.com>
Date: Fri, 21 Feb 2003 10:38:23 -0800
To: "Joseph Reagle" <reagle@w3.org>, <xml-editor@w3.org>
Cc: <w3c-ietf-xmldsig@w3.org>
Message-ID: <7874BFCCD289A645B5CE3935769F0B52452A7A@tigger.pureedge.com>

Hi Joseph,

I didn't get an email with Francois's interpretation; could you please forward it to DSig?

In answer to your question, yes, I believe you can get the forbidden characters into a string if they are encoded as character references in the input file.

Either way, the rule for CharData in XML should certainly be changed.  The rule for Comment syntax that immediately follows it references Char, so one is quite easily led to believe that CharData can accept the characters forbidden by Char: if CharData were supposed to forbid them, it would also reference the Char production.  The text around CharData leads me to believe that forbidding them was the intent, but the BNF rule is just as normative as the surrounding text, and the two don't say the same thing.  Moreover, an experienced developer using the XML spec as a reference is more likely to read the BNF rule, so it is very important that it be correct.
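
To make the discrepancy concrete, here is a minimal Python sketch of the two readings. It assumes the XML 1.0 productions [2] Char and [14] CharData as published; the function names are hypothetical and chosen only for illustration, and this is not taken from any parser or c14n implementation.

```python
import re

def is_xml_char(cp: int) -> bool:
    """XML 1.0 production [2] Char, applied to a single code point."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

def chardata_bnf_only(text: str) -> bool:
    """Production [14] CharData read literally:
    [^<&]* - ([^<&]* ']]>' [^<&]*)
    Only '<', '&' and the sequence ']]>' are excluded."""
    return re.fullmatch(r"[^<&]*", text) is not None and "]]>" not in text

def chardata_with_char(text: str) -> bool:
    """The reading the surrounding prose suggests (an assumption here):
    CharData additionally restricted to characters matching Char."""
    return chardata_bnf_only(text) and all(is_xml_char(ord(c)) for c in text)

# U+0001 is not a Char, yet the literal BNF reading of CharData accepts it.
sample = "a\x01b"
print(chardata_bnf_only(sample), chardata_with_char(sample))
```

Under the literal reading the sample string is accepted as CharData, while under the Char-restricted reading it is rejected, which is exactly the gap a developer working from the BNF alone would fall into.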

John Boyer, Ph.D.
Senior Product Architect
PureEdge Solutions Inc.

-----Original Message-----
From: Joseph Reagle [mailto:reagle@w3.org]
Sent: Friday, February 21, 2003 10:30 AM
To: John Boyer; xml-editor@w3.org
Cc: w3c-ietf-xmldsig@w3.org
Subject: Re: Possible XML and C14N errata

On Friday 14 February 2003 16:15, John Boyer wrote:
> On the other hand, the XML rule for element 'content' refers to
> 'CharData', which only forbids the use of less-than (<) and ampersand (&)
> in character content.  The canonicalization rule for text node processing
> was based on the CharData rule, so it is possible to get a correct c14n
> program to write data that Xerces cannot read and that is possibly not
> well-formed XML.

I'm not sure what you mean by "based on", but if one accepts Francois's 
interpretation -- these chars are already excluded -- does anything have to 
change in c14n? The c14n for a text node takes the string value of the 
XPath text node and escapes '#xD', '<', and '>', but if these are precluded 
characters from the start, they wouldn't have appeared in an XPath text 
node, right?

Received on Friday, 21 February 2003 13:38:59 UTC