Re: HTML 4 Profile for RDFa

On 23 May 2009, at 13:34, Julian Reschke wrote:

>>>> For this to make sense in real HTML implementations, the  
>>>> definition should be in terms of the document layer rather than  
>>>> the byte layer.
>>>
>>> Disagreed. Many implementations never build a DOM. We're not only  
>>> talking about browsers here.
>> By "DOM" I generally mean any kind of tree structure of elements  
>> and attributes, either as an explicit data structure (DOM, XOM,  
>> ElementTree) or implicit (SAX). Would any RDFa implementation *not*  
>> parse the input HTML into that kind of structure and operate over  
>> the elements and attributes as distinct objects? (e.g. would they  
>> just use regular expressions over the input byte stream? That seems  
>> quite infeasible to me...)
>
> Depends on the definition of "tree structure". I've been involved in  
> code that just uses a tokenizer and specialized stack, and  
> implementations like these will not do the re-arranging of elements  
> the HTML5 spec specifies for some kinds of broken input.

Specifying it relative to a DOM is still not a problem, as you can
infer the elements and text nodes from the token stream, until you
reach the point where you are required by HTML 5 to throw a fatal
error (i.e., when you can no longer parse per spec with the stream, as
you can't reorder the elements).
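
To illustrate, here is a rough sketch of what I mean, assuming a tokenizer that emits simple tuples. The token shapes and the names Element, FatalParseError and infer_nodes are made up for this example and are not from any spec or library. It also treats every mis-nested end tag as fatal, which is a deliberate simplification standing in for the cases where the stream would need reordering.

```python
from dataclasses import dataclass, field


class FatalParseError(Exception):
    """Raised when the stream cannot be handled without reordering."""


@dataclass
class Element:
    # Hypothetical element object inferred from a start-tag token.
    name: str
    attrs: dict
    parent: "Element | None" = None
    text: list = field(default_factory=list)


def infer_nodes(tokens):
    """Yield element and text nodes inferred from a flat token stream.

    Assumed token shapes: ("start", name, attrs), ("end", name),
    ("text", data). Only a stack of open elements is kept; no tree
    is ever built.
    """
    stack = []
    for token in tokens:
        kind = token[0]
        if kind == "start":
            _, name, attrs = token
            parent = stack[-1] if stack else None
            element = Element(name, dict(attrs), parent)
            stack.append(element)
            yield ("element", element)
        elif kind == "text":
            parent = stack[-1] if stack else None
            if parent is not None:
                parent.text.append(token[1])
            yield ("text", token[1], parent)
        elif kind == "end":
            # Simplification: any end tag that does not close the
            # current element is treated as the point of no return.
            if not stack or stack[-1].name != token[1]:
                raise FatalParseError("unexpected end tag: " + token[1])
            stack.pop()
        else:
            raise ValueError("unknown token kind: " + repr(kind))


tokens = [
    ("start", "div", {"about": "#me"}),
    ("start", "span", {"property": "foaf:name"}),
    ("text", "Geoffrey"),
    ("end", "span"),
    ("end", "div"),
]
events = list(infer_nodes(tokens))
```

A consumer working over these events still sees elements and attributes as distinct objects, each with a parent, so rules written in terms of a tree remain applicable up to the point where the error is raised.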


--
Geoffrey Sneddon
<http://gsnedders.com/>

Received on Saturday, 23 May 2009 12:58:21 UTC