- From: Chris Lilley <chris@w3.org>
- Date: Fri, 29 Nov 2002 12:57:53 +0100
- To: www-tag@w3.org, "Mark Nottingham" <mnot@mnot.net>
- CC: "Tim Bray" <tbray@textuality.com>, "Paul Grosso" <pgrosso@arbortext.com>
On Monday, November 25, 2002, 11:15:06 PM, Mark wrote:

Tim Bray (I believe) wrote:

>> - That granted, forbidding an internal subset seems kind of dumb.
>> Speaking as an XML processor implementor, the extra code required is
>> hardly detectable and the performance gain not significant.
>> Furthermore, every XML processor in the world just silently does the
>> internal subset and it's going to cost *extra work* for SOAP
>> implementations to check that they haven't. I.e. you can't use an
>> ordinary off-the-shelf non-validating XML processor.

MN> Perhaps the WG has a good reason for this prohibition; have they been
MN> asked?

I discussed this with Yves Lafon the other day. The argument that a
message in a protocol has to be self-standing is fairly compelling: it
can't just buffer up while some other resource is fetched (perhaps
using a different protocol).

I have seen similar arguments for SVG Tiny. Although it is deployed in
a bandwidth-challenged environment, bandwidth is not the major issue.
An SVG Tiny file may well be larger than an equivalent SVG Full file if
it includes raster images (using the data: URI scheme and base64
encoding), because an MMS message that uses SVG needs to be stand-alone
and convey all the resources it will need for display. The latency of
the protocol means that fetching a secondary resource would give
noticeable lag times (several seconds, rather than several tens of
milliseconds). And in the case of phone-to-phone messaging (in other
words, P2P) there *is* no server that can be asked for any secondary
resources: it's a one-shot push, not a pull.
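To illustrate the stand-alone requirement, here is a minimal Python sketch of inlining a raster image into an SVG document as a base64 data: URI, so that no secondary fetch is needed. The function names and the placeholder bytes are assumptions made for illustration only; this is not taken from any SVG Tiny or MMS implementation.

```python
import base64


def inline_image_href(raster: bytes, mime: str = "image/png") -> str:
    """Return a data: URI that carries the raster bytes as base64."""
    payload = base64.b64encode(raster).decode("ascii")
    return f"data:{mime};base64,{payload}"


def standalone_svg(raster: bytes, width: int, height: int) -> str:
    """Build an SVG document whose only image is embedded inline."""
    href = inline_image_href(raster)
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        'xmlns:xlink="http://www.w3.org/1999/xlink" '
        f'width="{width}" height="{height}">'
        f'<image width="{width}" height="{height}" xlink:href="{href}"/>'
        "</svg>"
    )


# Placeholder bytes standing in for image data (hypothetical, not a real PNG).
raster = bytes(range(256)) * 12
svg = standalone_svg(raster, 64, 64)

# Base64 turns every 3 input bytes into 4 output characters, which is
# why the self-contained file can end up larger than one that links out.
print(len(raster), len(inline_image_href(raster)))
```

The roughly one-third size overhead is the price paid for a message that can be displayed without contacting any server.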
The other argument that I have heard is the twin security holes of

a) an external parsed entity that is deliberately large, or that sits
   on a server that deliberately operates at a few bytes a minute

b) a power-series entity expansion DoS attack:

   <!ENTITY x "abcdefg">
   <!ENTITY x2 "&x;&x;">
   <!ENTITY x3 "&x2;&x2;">
   <!ENTITY x4 "&x3;&x3;">
   <!ENTITY x5 "&x4;&x4;">
   <!ENTITY x6 "&x5;&x5;">
   <!ENTITY x7 "&x6;&x6;">
   <!ENTITY x8 "&x7;&x7;">
   <!ENTITY x9 "&x8;&x8;">
   <!ENTITY xa "&x9;&x9;">
   <!ENTITY xb "&xa;&xa;">
   <!ENTITY xc "&xb;&xb;">
   <!ENTITY xd "&xc;&xc;">
   <!ENTITY xe "&xd;&xd;">
   <!ENTITY xf "&xe;&xe;">

x is 7 bytes long (in UTF-8; 14 bytes as a DOMString); xf is 112 KB
long (224 KB as a DOMString). It would be fairly trivial to have an
entity that was some terabytes in size using this method.

But then I ask myself - does SOAP prohibit messages that are overly
large due to the simpler and less devious method of just having a very
large message?

-- 
 Chris                            mailto:chris@w3.org
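The growth of the entity chain above can be checked with a short Python sketch. It does not parse any XML; it only models the doubling, counting the seed string as written ("abcdefg", seven ASCII characters). The helper names are hypothetical and chosen for this example.

```python
SEED = "abcdefg"
# Entity names in declaration order: x, x2 .. x9, xa .. xf.
NAMES = ["x"] + [f"x{d}" for d in "23456789abcdef"]


def expanded_lengths(seed: str, names: list[str]) -> dict[str, int]:
    """Map each entity name to its fully expanded size in UTF-8 bytes."""
    sizes = {}
    length = len(seed.encode("utf-8"))
    for name in names:
        sizes[name] = length
        length *= 2  # each entity references the previous one twice
    return sizes


def doublings_to_reach(seed_len: int, target: int) -> int:
    """Count how many doubling entities are needed to reach target bytes."""
    count = 0
    while seed_len < target:
        seed_len *= 2
        count += 1
    return count


sizes = expanded_lengths(SEED, NAMES)
print(sizes["x"])                       # 7
print(sizes["xf"])                      # 114688, i.e. 112 * 1024
print(doublings_to_reach(7, 10**12))    # 38
```

Under these assumptions, about 38 doubling declarations after the seed are enough to pass one terabyte, so the whole attack fits in a few dozen short lines of internal subset.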
Received on Friday, 29 November 2002 06:57:58 UTC