Re: error recovery

From: Noah Mendelsohn <nrm@arcanedomain.com>
Date: Sun, 19 Feb 2012 14:00:05 -0500
Message-ID: <4F4146B5.7000509@arcanedomain.com>
To: liam@w3.org
CC: David Lee <David.Lee@marklogic.com>, Norman Walsh <ndw@nwalsh.com>, W3C XML-ER Community Group <public-xml-er@w3.org>

On 2/19/2012 1:13 PM, Liam R E Quin wrote:
> Actually the worst case I've encountered in XML is
> <a b:att1="v1" b:att2="v2" ... [a gigabyte of attributes followed by]
>       b:attFFFF="vFFFF" xmlns:b="http://example.org/"  />
>
> You may have to buffer all the attributes until you get to the namespace
> declaration. In practice this isn't really an issue for a Web browser,
> or for anything else constructing a tree, because you have to keep them
> anyway.

Yeah, when we built our high-performance XML parser a few years ago, this 
was something we spent a lot of time designing around, albeit as an edge 
case. It's also one of the reasons that things like LALR parsers tend to 
have trouble dealing with XML, as I recall.  A related example is:

<b:a att1="v1" att2="v2" ... [a gigabyte of attributes followed by] 
attFFFF="vFFFF" xmlns:b="http://example.org/"  />


This really screws up naive approaches to streaming the matching of content
models, since the element's own namespace isn't known until the last
attribute has been read.
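
To make the buffering concrete, here is a minimal sketch in Python of resolving one start tag. It is an illustration only, not the design of the parser mentioned above: the function names (split_qname, resolve_start_tag) and the tuple representation of names are assumptions, and the input is assumed to be already tokenized into (qname, value) pairs. The point is that no name, neither the element's nor any attribute's, can be expanded until every attribute has been scanned for namespace declarations.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: an expanded name is (namespace URI or None, local name).
ExpandedName = Tuple[Optional[str], str]


def split_qname(qname: str) -> Tuple[Optional[str], str]:
    """Split 'b:att1' into ('b', 'att1'); an unprefixed name gets prefix None."""
    prefix, sep, local = qname.partition(":")
    return (prefix, local) if sep else (None, qname)


def resolve_start_tag(
    name: str,
    attrs: List[Tuple[str, str]],
    in_scope: Optional[Dict[Optional[str], str]] = None,
):
    """Expand the element and attribute names of a single start tag.

    Pass 1 buffers every ordinary attribute while collecting xmlns
    declarations; pass 2 expands names once the scope is complete.
    """
    scope: Dict[Optional[str], str] = dict(in_scope or {})
    buffered: List[Tuple[Optional[str], str, str]] = []

    # Pass 1: a declaration may appear after the attributes that depend on it,
    # so everything else has to be held until the end of the tag.
    for qname, value in attrs:
        prefix, local = split_qname(qname)
        if prefix is None and local == "xmlns":
            scope[None] = value          # default namespace declaration
        elif prefix == "xmlns":
            scope[local] = value         # prefixed namespace declaration
        else:
            buffered.append((prefix, local, value))

    def expand(prefix: Optional[str], local: str, is_attr: bool) -> ExpandedName:
        if prefix is None:
            # Unprefixed attributes are in no namespace; unprefixed elements
            # take the default namespace, if one is declared and non-empty.
            return (None if is_attr else (scope.get(None) or None), local)
        if prefix not in scope:
            raise ValueError(f"unbound namespace prefix: {prefix!r}")
        return (scope[prefix], local)

    # Pass 2: only now can the element's own name be resolved.
    element = expand(*split_qname(name), is_attr=False)
    resolved = [(expand(p, l, is_attr=True), v) for p, l, v in buffered]
    return element, resolved, scope
```

With attrs = [("b:att1", "v1"), ("xmlns:b", "http://example.org/")], the first attribute sits in the buffer until the declaration turns up. For the second example, resolve_start_tag("b:a", ...) cannot even report the element's expanded name before the end of the tag, which is what makes it awkward for a streaming content-model matcher.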

Noah
Received on Sunday, 19 February 2012 19:00:30 GMT
