Re: An HTML language specification

On Thu, Nov 20, 2008 at 6:51 PM, Philip Taylor <pjt47@cam.ac.uk> wrote:
> So it seems to me that the spec is already using a quite abstract view of the
> DOM. It uses the DOM interface names to identify the different types of node
> that can be generated, and to refer to the fields of each node (like
> 'data'), but otherwise it uses generic tree terminology. In particular it
> doesn't say anything like "Execute node.appendChild(lastNode)", which would
> be much more DOM-implementation-specific.
>
> People who have implemented the parsing algorithm have used a variety of
> non-DOM output structures (ElementTree and BeautifulSoup in html5lib, XOM
> and SAX in validator.nu, some purely functional tree structure in my OCaml
> implementation, etc.) and have never (as far as I'm aware) expressed concerns
> that the spec makes it unnecessarily difficult to use a non-DOM output
> format. (There are some necessary difficulties when the output format can't
> represent all HTML documents, e.g. if it requires XML-compatible element
> names or unbuffered streaming, but those issues will occur regardless of how
> the spec is written.)
>
> Does this increase or assuage your wariness at all?

Thanks for pointing that out, Philip.  Yes, that helps reduce my
concerns about the parser.  I'm still concerned about the language
definition though.

Mark.

Received on Friday, 21 November 2008 15:41:46 UTC