C14N2.0: ugly little details from Meiko Jensen on 2010-06-29 (public-xmlsec@w3.org from June 2010)

From: Meiko Jensen <Meiko.Jensen@ruhr-uni-bochum.de>
Date: 29 Jun 2010 13:55:08 +0200
To: "XMLSec WG Public List" <public-xmlsec@w3.org>
Message-ID: <4C29DF1C.2090707@ruhr-uni-bochum.de>

Hi all,

While discussing C14N2.0 with a student, we found a very interesting
little detail of the current C14N2.0 approach, which I'd like to draw
your attention to.

Consider the document

<X>
  <A>1 </A>
  <B> 2</B>
</X>

Note the whitespace after the 1 and before the 2. DSIG2-Selection
includes the XPaths /X/A/text() and /X/B/text(), hence C14N2.0 gets an
input made up of two text nodes: "1 " and " 2".
First, this is no longer XML. No problem so far. However, the input
consists of two adjacent text nodes, and in the trimTextNodes=true case
leading and trailing whitespace should be removed. If C14N is performed
on each text node separately, the output is "12". But since a parser
would consider the two text nodes to be adjacent, hence
"concatenatable", a more realistic output would be "1  2", keeping the
two spaces. So the problem is to determine whether two adjacent text
nodes result from a parser splitting a single text node or from two
separate XPath selections.

In the end, this means that concatenating two canonicalized XML
fragments is no longer equivalent to canonicalizing their
concatenation. I'm not sure whether this may turn out to be a problem
one day.
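
To make the two readings concrete, here is a minimal Python sketch. It is
not the C14N2.0 algorithm: it only models the whitespace trimming step,
and the function names (select_text_nodes, trim_each, trim_merged) are
hypothetical, invented for this illustration. The selection is
approximated with ElementTree lookups instead of a real DSIG2-Selection.

```python
import xml.etree.ElementTree as ET

# The example document from above.
DOC = "<X>\n  <A>1 </A>\n  <B> 2</B>\n</X>"

# XML whitespace characters: space, tab, carriage return, line feed.
XML_WS = " \t\r\n"


def select_text_nodes(xml_source):
    """Stand-in for selecting /X/A/text() and /X/B/text()."""
    root = ET.fromstring(xml_source)
    return [root.find("A").text, root.find("B").text]


def trim_each(nodes):
    """Reading 1: trim every selected text node on its own, then join."""
    return "".join(node.strip(XML_WS) for node in nodes)


def trim_merged(nodes):
    """Reading 2: merge adjacent text nodes first, then trim once."""
    return "".join(nodes).strip(XML_WS)


nodes = select_text_nodes(DOC)
per_node = trim_each(nodes)
merged = trim_merged(nodes)

print(repr(nodes))     # ['1 ', ' 2']
print(repr(per_node))  # '12'
print(repr(merged))    # '1  2'
```

The two results differ, so in this simplified model trimming does not
commute with concatenation, which is the effect described above.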

We could leave this open and go along with the "1  2" result as being
correct (which is the easiest option and in line with what we specified
in "first Selection, then C14N"), or put a word or two on this in the
spec or in the best practices.

What do you think?

Meiko

-- 
Dipl.-Inf. Meiko Jensen
Chair for Network and Data Security 
Horst Görtz Institute for IT-Security 
Ruhr University Bochum, Germany
_____________________________
Universitätsstr. 150, Geb. IC 4/150
D-44780 Bochum, Germany
Phone: +49 (0) 234 / 32-26796
Telefax: +49 (0) 234 / 32-14347
http://www.nds.rub.de

Received on Tuesday, 29 June 2010 11:55:33 UTC