[whatwg] Parsing: Tokenisation - DOCTYPE State from Ian Hickson on 2006-01-31 (public-whatwg-archive@w3.org from January 2006)

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 31 Jan 2006 20:24:38 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0601312017300.2856@dhalsim.dreamhost.com>

On Sun, 29 Jan 2006, Lachlan Hunt wrote:
>
> I believe there are some mistakes in the DOCTYPE state section.
> 
> As far as I can tell both of these DOCTYPEs are considered conformant, but
> shouldn't the first be an easy parse error?
> 
>   <!DOCTYPEhtml>
>   <!DOCTYPE html>

Yeah. Fixed. They both still generate the same DOM, but the first now 
causes a parse error to be flagged.
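
To illustrate the behaviour described above, here is a minimal sketch, not 
the spec's algorithm. The function name, the error string, and the 
whitespace set are all assumptions made for this example. It only shows 
that a missing space after the keyword flags an error without changing the 
resulting name.

```python
# Assumed whitespace set for this sketch; the spec's exact list may differ.
WHITESPACE = " \t\n\f"


def tokenize_doctype(text):
    """Return (name, parse_errors) for input that starts with '<!DOCTYPE'.

    Hypothetical helper: it models only the step right after the keyword.
    """
    prefix = "<!DOCTYPE"
    if not text.upper().startswith(prefix):
        raise ValueError("input does not start with <!DOCTYPE")
    pos = len(prefix)
    errors = []

    # DOCTYPE state: whitespace is expected right after the keyword.
    if pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    else:
        # Flag the error, but carry on as if the space had been there.
        errors.append("missing whitespace after DOCTYPE")

    # Skip any further whitespace before the name.
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1

    # Collect the name up to '>' or whitespace.
    start = pos
    while pos < len(text) and text[pos] != ">" and text[pos] not in WHITESPACE:
        pos += 1
    return text[start:pos], errors
```

Both inputs from the quoted message yield the same name; only the error 
list differs.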
 

> * That should read "[subtract] 0x0020 [from] the character's codepoint"
>   (This error is repeated in the DOCTYPE name state too.)

Fixed. Though I'm not sure we want to be doing this really. I'm torn.
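
For reference, the arithmetic in question can be sketched as follows. The 
function name is made up for this example, and it assumes the step applies 
only to the lowercase ASCII letters a to z, which sit exactly 0x0020 above 
their uppercase counterparts.

```python
def fold_doctype_char(ch):
    """Uppercase a lowercase ASCII letter by subtracting 0x0020.

    Hypothetical helper; any other character is returned unchanged.
    """
    if "a" <= ch <= "z":
        # 'a' is U+0061 and 'A' is U+0041, so the difference is 0x0020.
        return chr(ord(ch) - 0x0020)
    return ch


def fold_doctype_name(name):
    """Apply the per-character step to a whole name."""
    return "".join(fold_doctype_char(ch) for ch in name)
```

Adding 0x0020 instead would push lowercase letters out of the letter range 
altogether, which is why the direction matters.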


> * Why is it marked as being in error at that stage?  It doesn't seem to
>   be necessary because of the last step in the DOCTYPE name state that
>   says:
>   "If the name of the DOCTYPE token is exactly the four letters "HTML",
>    then mark the token as being correct. Otherwise, mark it as being in
>    error."

It's mostly just for the case of an EOF during the DOCTYPE name state.
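
A rough sketch of why the early marking matters, under simplifying 
assumptions: the function name and the dict-based token are invented for 
this example, whitespace handling is left out, and ">" is treated as the 
only thing that ends the name. The point is only that the final check 
never runs if the input stops first.

```python
def doctype_name_state(chars):
    """Consume DOCTYPE name characters and return the resulting token.

    Hypothetical, simplified model: the token starts out marked as being
    in error, and only the check at '>' can mark it correct.
    """
    token = {"name": "", "error": True}  # marked in error on creation
    for ch in chars:
        if ch == ">":
            # Last step of the name state: correct only if exactly "HTML".
            token["error"] = token["name"] != "HTML"
            return token
        # Fold lowercase ASCII letters to uppercase while accumulating.
        token["name"] += ch.upper() if "a" <= ch <= "z" else ch
    # EOF: the check above never ran, so the initial error marking stands.
    return token
```

With input "html>" the token ends up correct; with "html" followed by EOF 
it keeps the initial error marking even though the name matches.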

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 31 January 2006 12:24:38 UTC