Safe ways of implementing limits on buffer sizes in the parser

The spec allows implementations to place limits on the sizes of  
various things in HTML in order to avoid exhausting resources.

The HTML5 parser has various buffers, all of which a remote site can
grow to an arbitrary size by choosing suitable input. Has someone
already pondered the security implications of the following
strategies? That is, is either of them safe?

  1) Truncating a buffer from the end and leaving U+FFFD as the last  
character in the buffer.

  2) Truncating a buffer from the beginning and leaving U+FFFD as the
first character in the buffer.

(Dropping the buffer entirely seems inconvenient, e.g. when the
buffer is an element name, although I guess it's an option for
attribute values and element content.)
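
For concreteness, the two truncation strategies could be sketched
roughly as follows. This is only an illustration, not code from any
actual parser: the function names and the `limit` parameter are made
up, the result is assumed to count U+FFFD against the limit, and
Python counts code points here where an implementation working on
UTF-16 would count code units.

```python
REPLACEMENT = "\uFFFD"  # U+FFFD REPLACEMENT CHARACTER marks the cut


def truncate_end(buf: str, limit: int) -> str:
    """Strategy 1 (hypothetical helper): keep the start of the buffer
    and leave U+FFFD as the last character."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if len(buf) <= limit:
        return buf  # within the limit, nothing to do
    # Reserve one position for the marker so the result is exactly `limit` long.
    return buf[:limit - 1] + REPLACEMENT


def truncate_start(buf: str, limit: int) -> str:
    """Strategy 2 (hypothetical helper): keep the end of the buffer
    and leave U+FFFD as the first character."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if len(buf) <= limit:
        return buf
    # Slice from an explicit index; buf[-(limit - 1):] would return the
    # whole string when limit == 1.
    return REPLACEMENT + buf[len(buf) - (limit - 1):]


if __name__ == "__main__":
    print(repr(truncate_end("javascript", 5)))    # 'java\ufffd'
    print(repr(truncate_start("javascript", 7)))  # '\ufffdscript'
```

In both cases the truncated value contains U+FFFD, so it cannot
compare equal to a shorter name that lacks that character (the second
call yields U+FFFD followed by "script", not "script"). Whether that
is enough to make either strategy safe is the open question.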

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Monday, 8 June 2009 15:11:40 UTC