Re: [whatwg] main element parsing behaviour

From: Simon Pieters <simonp@opera.com>
Date: Wed, 07 Nov 2012 13:13:30 +0100
To: "Jirka Kosek" <jirka@kosek.cz>
Message-ID: <op.wnej8sdridj3kv@device-23f190>
Cc: whatwg <whatwg@whatwg.org>, Steve Faulkner <faulkner.steve@gmail.com>

On Wed, 07 Nov 2012 12:55:46 +0100, Jirka Kosek <jirka@kosek.cz> wrote:

> Changing the parser each time a new element is added is a really evil
> idea and a sign of bad design.
>
> Parsing algorithm should be either not touched at all, or it should be
> promptly changed to treat all unknown elements in other way if the
> current treatment of unknown elements is not suitable for some reason.

There are three parsing behaviors that we probably want to have available
for new elements:

"inline" - like <span>, current behavior for unknown elements.
"block" - like <address>, currently a finite list of elements.
"void" - like <img>, currently a finite list of elements.
(Possibly also "block void" - like <hr>, although no such elements have
been added since parsing was specified.)
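
As a rough sketch of how the categories above map onto finite lists, the
lookup could look something like the following. The function name and
the two sets are hypothetical, and the sets are small illustrative
subsets, not the parser's actual lists.

```python
# Illustrative subsets only (assumption): the real parser's lists are longer.
BLOCK_LIKE = frozenset({"address", "div", "p", "hr"})
VOID = frozenset({"img", "br", "hr"})


def parse_category(tag_name: str) -> str:
    """Return the parsing category for a tag name.

    Anything not on a list falls through to "inline", which mirrors the
    current treatment of unknown elements.
    """
    name = tag_name.lower()
    is_block = name in BLOCK_LIKE
    is_void = name in VOID
    if is_block and is_void:
        return "block void"
    if is_block:
        return "block"
    if is_void:
        return "void"
    return "inline"
```

With these lists, an element such as <main> is parsed like <span> until
someone adds it to the block list, which is the parser change under
discussion.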

If we were to design a system where we can make up new elements that go in  
one of those categories without changing the parser, I think we  
effectively have to put a magic string in the tag name, e.g. any element  
that starts with "block" is treated like <address>, but that has  
disadvantages:

* Looking at a substring of the tag name complicates the parser and  
probably ruins some optimizations.
* It means new non-inline elements will have long, ugly two-word names,
which is inconsistent with the rest of the language.
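
To make the magic-string idea concrete, here is a minimal sketch of what
such a rule might look like. The function name is hypothetical, and the
"void" prefix is my own assumption for illustration; only the "block"
prefix is the example given above.

```python
def category_by_prefix(tag_name: str) -> str:
    """Classify a tag purely from a prefix of its name.

    Hypothetical rule: no element lists are consulted, so the parser
    never has to change when a new element is introduced.
    """
    name = tag_name.lower()
    # Each check inspects a substring of the name instead of doing a
    # single whole-name lookup.
    if name.startswith("block"):
        return "block"
    if name.startswith("void"):  # assumed prefix, not from the proposal
        return "void"
    return "inline"
```

Under this rule a block-level <main> would have to be spelled something
like <blockmain>, while plain <main> would stay inline forever.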

I can imagine other designs as well but they don't seem any better.

In conclusion, I think changing the parser when we introduce a new "block"  
or "void" element is a better approach.

-- 
Simon Pieters
Opera Software
Received on Wednesday, 7 November 2012 12:14:16 GMT