[whatwg] Bug in "Before DOCTYPE name state"?

From: Thomas Broyer <t.broyer@gmail.com>
Date: Fri, 22 Dec 2006 08:38:48 +0100
Message-ID: <a9699fd20612212338r4c3dc4d7ud5af82e89a49e66@mail.gmail.com>
2006/12/22, Ian Hickson:
> On Thu, 21 Dec 2006, Thomas Broyer wrote:
> >
> > Why is the DOCTYPE marked "in error" in the former case?
> Because otherwise this document:
> ...would emit a DOCTYPE that is not in error (since the token would be
> emitted before the bit at the end of the DOCTYPE name state).

Doh! right.

> > In other words, why would <!DOCTYPE html> be "in error" while
> > <!DOCTYPE Html> wouldn't?
> Both would be not in error, because of the sentence at the end of the
> DOCTYPE name state.

OK, now understood (thank you, Simon, for enlightening me).

> On Thu, 21 Dec 2006, Thomas Broyer wrote:
> >
> > But it also has this note, which is quite confusing: "Because lowercase
> > letters in the name are uppercased by the algorithm above, the "HTML"
> > letters are actually case-insensitive relative to the markup."
> How is it confusing? I would clarify it, but I don't know what is
> confusing.

Maybe there's no need to clarify it; it might just have been me.
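
The DOCTYPE name behaviour discussed above can be sketched roughly as
follows. This is a toy model, not the spec's algorithm: `DoctypeToken` and
`tokenize_doctype_name` are hypothetical names, and the sketch assumes (as
inferred from this thread) that the token is created already marked "in
error" and that the flag is only cleared at the end of the name state, when
the uppercased name is exactly "HTML".

```python
from dataclasses import dataclass


@dataclass
class DoctypeToken:
    name: str = ""
    in_error: bool = True  # assumption: the token starts out marked "in error"


def tokenize_doctype_name(raw_name: str) -> DoctypeToken:
    """Toy model of the DOCTYPE name handling discussed in this thread."""
    token = DoctypeToken()
    for ch in raw_name:
        # Lowercase ASCII letters are uppercased as they are appended.
        token.name += ch.upper() if "a" <= ch <= "z" else ch
    # End of the name state: only now can the error flag be cleared.
    if token.name == "HTML":
        token.in_error = False
    return token
```

Under these assumptions, `html` and `Html` both end up not in error, while a
token emitted before the end of the name state would still carry the flag.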

> > It remains that the tokenization stage is a bit confusing?
> Yes. The tree construction stage is even worse. Just implement it exactly
> as written with no interpretation and you should be fine. ;-)

My "problem" is that I'm not implementing an "emitting" parser (à la
SAX) but a "pulling" parser, so I stop as soon as I've found a
token and return true to say "hey, I've changed the TokenType, Name,
Value, etc. properties to reflect a new token".
...so I'm interpreting ;-)
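
To illustrate the difference between the two styles, here is a minimal
sketch of a pull-style tokenizer with an emitting (SAX-like) driver layered
on top. It only recognises a toy grammar of `<name>`, `</name>` and text, and
it is not the HTML5 tokenization algorithm. `PullTokenizer`, `read` and
`emit_all` are hypothetical names; the `token_type`, `name` and `value`
properties loosely mirror the TokenType, Name and Value properties mentioned
above.

```python
class PullTokenizer:
    """Toy pull-style tokenizer: the caller asks for one token at a time."""

    def __init__(self, text: str) -> None:
        self._text = text
        self._pos = 0
        self.token_type = None  # "StartTag", "EndTag" or "Text"
        self.name = None
        self.value = None

    def read(self) -> bool:
        """Advance to the next token; return False at end of input."""
        if self._pos >= len(self._text):
            return False
        rest = self._text[self._pos:]
        end = rest.find(">")
        if rest.startswith("</") and end != -1:
            self.token_type, self.name, self.value = "EndTag", rest[2:end], None
            self._pos += end + 1
        elif rest.startswith("<") and end != -1:
            self.token_type, self.name, self.value = "StartTag", rest[1:end], None
            self._pos += end + 1
        else:
            # Text runs up to the next "<" (or to the end of the input).
            nxt = rest.find("<", 1)
            stop = len(rest) if nxt == -1 else nxt
            self.token_type, self.name, self.value = "Text", None, rest[:stop]
            self._pos += stop
        return True


def emit_all(text, on_token):
    """Push-style (SAX-like) driver layered on top of the pull tokenizer."""
    tokenizer = PullTokenizer(text)
    while tokenizer.read():
        on_token(tokenizer.token_type, tokenizer.name, tokenizer.value)
```

The pull version stops after each token and hands control back to the
caller, whereas `emit_all` keeps control and calls back for every token.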

Re tree construction, I'm about to implement it in two parts: in the
"pull parser" when possible (handling omitted tags and misnested
formatting elements) and in a "tree fixer" otherwise (move the <meta>
and <link> into <head>, etc.)
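
As a rough illustration of the "tree fixer" half, the sketch below walks an
already-built tree and moves stray `<meta>` and `<link>` elements into
`<head>`. The `Node` class and the `find` and `fix_tree` helpers are
hypothetical and stand in for whatever tree representation is actually used;
the sketch assumes a `<head>` element already exists and ignores the other
fix-ups a real implementation would need.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children or [])


def find(node, name):
    """Depth-first search for the first element called `name`."""
    if node.name == name:
        return node
    for child in node.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None


def fix_tree(root):
    """Move stray <meta> and <link> elements into <head> (if one exists)."""
    head = find(root, "head")
    if head is None:
        return root

    def visit(node):
        if node is head:
            return  # elements already inside <head> stay where they are
        kept = []
        for child in node.children:
            if child.name in ("meta", "link"):
                head.children.append(child)  # relocate, preserving order
            else:
                visit(child)
                kept.append(child)
        node.children = kept

    visit(root)
    return root
```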

Thomas Broyer
Received on Thursday, 21 December 2006 23:38:48 UTC