Re: Comments on HTML WG face to face meetings in France Oct 08

On Nov 18, 2008, at 2:09 PM, Geoffrey Sneddon wrote:

>
>
> On 18 Nov 2008, at 21:33, Henri Sivonen wrote:
>
>> The Validator.nu HTML Parser supports SAX in two different modes:  
>> streaming and tree-buffered.
>>
>> In the streaming mode, the parser emits SAX events as it proceeds  
>> in the input stream. However, there are some types of authoring  
>> errors for which the error recovery is not streamable. These errors  
>> are treated like XML well-formedness errors. I'd like to emphasize  
>> that this behavior is conforming per spec:
>> http://www.whatwg.org/specs/web-apps/current-work/#parse-error
>>
>> In the tree-buffered mode, the parser builds a tree using a  
>> purpose-optimized tree model (which is neither DOM nor XOM and  
>> outperforms Xerces2 DOM and XOM for this use case) and after the  
>> input stream has been exhausted, fires SAX events corresponding t
>
> Would it not be possible to buffer only when you reach something  
> that cannot be said with certainty to be the next event, or do you end  
> up buffering the entire document in such cases?

There would always be some buffering, since you can't know that an  
element and its children will have correctly nested close tags until  
you reach its close tag. Until then, you have to buffer in case tree  
fixup is needed.
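
To illustrate, here is a minimal Python sketch of that kind of buffering. The token format, the function name stream_with_buffering, and the fixup rule (implicitly closing any inner elements still open) are assumptions made up for this example; this is not how the Validator.nu parser or the spec's tree construction algorithm actually works.

```python
def stream_with_buffering(tokens):
    """Yield (kind, value) events, holding back every event that belongs
    to an open element until the outermost open element has closed.

    `tokens` is a hypothetical stream of ("start", name), ("end", name)
    and ("text", data) tuples.
    """
    stack = []   # names of the currently open elements
    buffer = []  # events held back while any element is still open
    for kind, value in tokens:
        if kind == "start":
            stack.append(value)
            buffer.append((kind, value))
        elif kind == "end":
            if value not in stack:
                continue  # stray close tag: dropped (simplistic recovery)
            # Simplistic fixup: implicitly close misnested inner elements.
            while stack[-1] != value:
                buffer.append(("end", stack.pop()))
            stack.pop()
            buffer.append((kind, value))
        else:
            buffer.append((kind, value))
        if not stack:
            # Nothing is open any more, so these events can no longer
            # be changed by a later fixup and are safe to emit.
            yield from buffer
            buffer = []
    # End of input: close whatever is still open, then flush.
    while stack:
        buffer.append(("end", stack.pop()))
    yield from buffer
```

With input like <p><b>x</p>y, nothing is emitted for the p element until its close tag arrives, at which point the missing close tag for b is supplied and the whole subtree is flushed. A document whose root element stays open until the end is therefore buffered almost entirely under this scheme.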

Regards,
Maciej

Received on Wednesday, 19 November 2008 00:08:07 UTC