On 23 May 2009, at 13:34, Julian Reschke wrote:

>>>> For this to make sense in real HTML implementations, the
>>>> definition should be in terms of the document layer rather than
>>>> the byte layer.
>>>
>>> Disagreed. Many implementations never build a DOM. We're not only
>>> talking about browsers here.
>> By "DOM" I generally mean any kind of tree structure of elements
>> and attributes, either as an explicit data structure (DOM, XOM,
>> ElementTree) or implicit (SAX). Would any RDFa implementation *not*
>> parse the input HTML into that kind of structure and operate over
>> the elements and attributes as distinct objects? (e.g. would they
>> just use regular expressions over the input byte stream? That seems
>> quite infeasible to me...)
>
> Depends on the definition of "tree structure". I've been involved in
> code that just uses a tokenizer and specialized stack, and
> implementations like these will not do the re-arranging of elements
> the HTML5 spec specifies for some kinds of broken input.

Specifying it relative to a DOM is still not a problem, as you can
infer the elements and text nodes from the token stream, until you
reach the point where you are required by HTML 5 to throw a fatal
error (i.e., when you can no longer parse per spec with the stream,
as you can't reorder the elements).

--
Geoffrey Sneddon
<http://gsnedders.com/>

Received on Saturday, 23 May 2009 12:58:20 GMT
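
For illustration, the tokenizer-plus-stack approach discussed in the message above might look roughly like the Python sketch below. Everything in it is an assumption made for the example: the names Node, StackBuilder, FatalParseError and parse are hypothetical, and Python's html.parser stands in for a tokenizer even though it does not implement the HTML 5 tokenization rules. The sketch is also stricter than HTML 5 requires, since it gives up on any mismatched end tag rather than only where reordering would be needed, and it does not handle implied end tags.

```python
from html.parser import HTMLParser

# Elements that never take an end tag, so they are never pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class FatalParseError(Exception):
    """Raised when the token stream cannot be handled without reordering."""


class Node:
    """Minimal element node: a name, attributes, and child nodes or strings."""

    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.children = []


class StackBuilder(HTMLParser):
    """Infers elements and text nodes from the token stream using a stack."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_ELEMENTS:
            self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        # Treat "<x/>" like "<x>"; the trailing slash is ignored here.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if len(self.stack) == 1 or self.stack[-1].name != tag:
            # A streaming consumer cannot go back and rearrange elements
            # it has already emitted, so this sketch simply stops.
            raise FatalParseError("unexpected end tag: </%s>" % tag)
        self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(markup):
    """Feed a complete string through the builder and return the root node."""
    builder = StackBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root
```

With this sketch, well-nested input such as `<p about="#me">Hi <b>there</b></p>` yields distinct element and attribute objects that an RDFa consumer could walk, while misnested input such as `<b><i>x</b></i>` raises FatalParseError at the first end tag that does not match the top of the stack.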