Re: TrimTextNodes in Streaming parameter set

From: Meiko Jensen <Meiko.Jensen@ruhr-uni-bochum.de>
Date: 10 May 2010 14:24:28 +0200
Message-ID: <4BE7FAFC.3020706@ruhr-uni-bochum.de>
To: "Pratik Datta" <pratik.datta@oracle.com>
Cc: "XMLSec WG Public List" <public-xmlsec@w3.org>
Pratik,

my intention behind the streaming profile was to set defaults that are
optimized for scenarios where performance really matters (such
as a Web Services Security Gateway). Hence, I tried to exclude all
options that require additional caching mechanisms or actions to be
performed on all SAX events. In my view, trimTextNodes raises
complexity considerably, since every event handler must check
the whitespace cache, which will be empty most of the time (at least
in the regular use cases I would expect).
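
To make the per-event cache check concrete, here is a minimal sketch using
Python's xml.sax ContentHandler. It assumes that trimming means dropping the
leading and trailing whitespace of each text node, and that a text node ends
at any startElement() or endElement() event. The class name TrimmingHandler
and the list used as a sink are hypothetical stand-ins for the c14n stage,
attributes are ignored, and the events are fed by hand to mimic the split
characters() calls from the example quoted below.

```python
from xml.sax.handler import ContentHandler

XML_WS = " \t\r\n"  # the four XML whitespace characters


class TrimmingHandler(ContentHandler):
    """Hypothetical sketch: forwards trimmed text to `sink` (stand-in for c14n)."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink        # list collecting the output chunks
        self.seen_text = False  # non-whitespace already emitted for this text node?
        self.ws_cache = ""      # whitespace that may be trailing or embedded

    def _end_text_node(self):
        # Anything still cached turned out to be trailing whitespace: drop it.
        self.ws_cache = ""
        self.seen_text = False

    def characters(self, content):
        stripped = content.lstrip(XML_WS)
        leading = content[:len(content) - len(stripped)]
        if not stripped:
            # Whitespace-only chunk: cache it only if it might be embedded.
            if self.seen_text:
                self.ws_cache += content
            return
        body = stripped.rstrip(XML_WS)
        trailing = stripped[len(body):]
        if self.seen_text:
            # Cached and leading whitespace was embedded after all: flush it.
            self.sink.append(self.ws_cache + leading)
        self.sink.append(body)
        self.ws_cache = trailing
        self.seen_text = True

    def startElement(self, name, attrs):
        self._end_text_node()  # every event has to touch the cache
        self.sink.append("<%s>" % name)  # attributes ignored in this sketch

    def endElement(self, name):
        self._end_text_node()
        self.sink.append("</%s>" % name)


# Feed the events by hand, split the way a SAX parser might split them.
out = []
handler = TrimmingHandler(out)
handler.startElement("A", {})
handler.characters("\n   ")
handler.startElement("B", {})
handler.characters("  stupid   \n")
handler.characters("\n\n\n")
handler.characters("example...\n\n  ")
handler.endElement("B")
handler.characters("\n")
handler.endElement("A")
result = "".join(out)
print(repr(result))
```

Even in this small sketch, startElement() and endElement() both have to reset
the cache, and a complete handler would need the same in its comment and
processing-instruction callbacks.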

On the other hand, hashing large amounts of ignorable whitespace also
has a performance cost. We will have to decide whether we want a C14N
that is simpler to implement or one that is (potentially) better in
terms of performance.

I do not insist on setting the trimTextNodes default to false.
Personally, I really like that parameter because it makes XML Signatures
way more robust against accidental invalidation. I just see its
complexity as a potential performance and screw-it-up issue.

Meiko

Pratik Datta wrote:
> I have a high-level comment on the streaming profile.
>
> To implement XML Signature 2.0 in streaming mode, first you need a streaming XPath parser - this is pretty complex by itself and will need a dedicated team of developers. If that is the case, then this team can also implement trimming of text nodes, using the "whitespace caching" technique that you outlined.
>
>
> My opinion is that there is nothing in the XML Signature 2.0 spec that cannot be streamed, assuming you have a solid development team. So do we really need a profile for streaming?  If you want simplicity you should use a DOM. Or are you suggesting that we should have a "simple-streaming" profile?
>
>
> Pratik
>
> -----Original Message-----
> From: Meiko Jensen [mailto:Meiko.Jensen@ruhr-uni-bochum.de] 
> Sent: Wednesday, May 05, 2010 4:14 AM
> To: Pratik Datta; XMLSec WG Public List
> Subject: TrimTextNodes in Streaming parameter set
>
> Hi Pratik,
>
> regarding the trimTextNodes parameter in my streaming proposal, here is
> an example:
>
> <A>
>    <B>  stupid                                        
>
>
>
>
> example...
>
>
>
>
>   </B>
> </A>
>
> In SAX, the contents of B might be split into, say, 3 separate
> characters() events. The first contains "stupid", so removing the
> leading whitespace is no issue. The trailing whitespace already poses
> a problem: one cannot be sure that no non-whitespace text follows.
> Hence, the trailing whitespace must be cached until one can decide
> whether it is trailing or embedded. The second characters() event
> contains only whitespace. Still, we do not know whether we may safely
> discard it. However, I know at least one programmer who will implement
> trimTextNodes so that characters() events consisting of whitespace only
> are discarded. We may add a hint on this issue to the spec, but it
> still remains somewhat tricky. Third characters() event: "example...".
> Now it turns out that the cached whitespace was in fact embedded, not
> trailing, so we have to flush the cache to the c14n. Again, the trailing
> whitespace triggers caching. Then comes the endElement() event of the
> B element. Here it turns out that the cache can be discarded, as the
> cached whitespace was indeed trailing. As a result, every event method
> must be implemented to take care not only of the event itself, but
> also of the whitespace cache.
>
> I know this is a rather constructed example, but the issue exists and
> may cause the "WTF happened here?" kind of bugs in real-world scenarios.
>
> Additionally, the issue is complicated a little by the
> ignorableWhitespace() event, which only validating parsers are required
> to report and which would get called e.g. for the whitespace between
> <A> and <B> in the example above. In fact, that's why I suggested
> considering a third option (besides trim and noTrim) that would only
> erase whitespace reported via ignorableWhitespace(). However, that one
> does not work in non-validating parser environments (but could be
> emulated).
>
> That's why I proposed to set trimTextNodes=false.
>
> What do you think?
>
> best regards
>
> Meiko
>
>   

-- 
Dipl.-Inf. Meiko Jensen
Chair for Network and Data Security 
Horst Görtz Institute for IT-Security 
Ruhr University Bochum, Germany
_____________________________
Universitätsstr. 150, Geb. IC 4/150
D-44780 Bochum, Germany
Phone: +49 (0) 234 / 32-26796
Telefax: +49 (0) 234 / 32-14347
http://www.nds.rub.de
Received on Monday, 10 May 2010 12:25:00 GMT
