RE: Possible XML and C14N errata from Martin Duerst on 2003-02-26 (w3c-ietf-xmldsig@w3.org from January to March 2003)

From: Martin Duerst <duerst@w3.org>
Date: Wed, 26 Feb 2003 13:03:00 -0500
To: "John Boyer" <JBoyer@PureEdge.com>, "Joseph Reagle" <reagle@w3.org>, <w3c-ietf-xmldsig@w3.org>
Cc: <FYergeau@alis.com>, <xml-editor@w3.org>
Message-Id: <4.2.0.58.J.20030226130021.03e3e6c8@localhost>

At 15:57 03/02/25 -0800, John Boyer wrote:
>Hi Joseph,
>
>Our implementation of C14N still runs into a bit of trouble because, 
>although you can't get the Xerces parser to read the bytes, there is 
>apparently nothing wrong with making a DOM call that sets the illegal 
>bytes into the content.  I already have to do other things to enforce the 
>Xpath data model on the DOM tree, so this is something I can take up in 
>implementation.

There are lots of other ways to produce illegal XML content using
the DOM. This was done for performance reasons.

>From: Joseph Reagle [mailto:reagle@w3.org]

>The expression "[^<&]* - ([^<&]* ']]>' [^<&]*)" is rather baroque on its
>face, and as John notes it's very weird that we have to read an augmented
>BNF grammar which itself references back to a production to understand the
>CharData production.

Yes, it's not easy to understand. But there are a lot of things
in the BNF used in the XML REC that don't turn up in your everyday
BNF, so I'd guess that every implementer at one point or another
will have to check it anyway.

Regards,   Martin.

Received on Wednesday, 26 February 2003 14:46:06 UTC