- From: Ian Hickson <ian@hixie.ch>
- Date: Sat, 13 Dec 2008 01:15:16 +0000 (UTC)
On Fri, 12 Dec 2008, Jonas Sicking wrote:
>
> Currently tokenizing the following string (starting at Data state)
> "<!--foo" results in a parse error when hitting the 'f'.

As far as I can tell, it does not. Assuming we're in a normal state of
affairs, here are the states we visit and the parse errors we emit as we
are consuming the characters:

 1. Data state: consume "<" and switch to tag open state.
 2. Tag open state: consume "!" and switch to markup declaration open
    state.
 3. Markup declaration open state: see "--" next, consume those
    characters, and switch to comment start state.
 4. Comment start state: consume anything else ("f"), switch to comment
    state.
 5. Comment state: consume anything else ("o"), stay in comment state.
 6. Comment state: consume anything else ("o"), stay in comment state.
 7. Comment state: parse error, consume EOF, return to data state.
 8. Data state: end.

No errors are emitted until the end of the string, as expected. Am I
missing something?

> It seems like the error is in the "Comment start dash state" (section
> 8.2.4.19). It should switch to 'comment state' when a '-' is consumed,
> which is not what it currently does.

Given "<!--foo" you should never hit the comment start dash state.

> One more thing I forgot to mention. Several of the states regarding
> comments refer to outputting 'the comment token' and 'the comment
> tokens data'. However there is no mention that I could find for when
> the comment token is created. Maybe this isn't an error but a general
> pattern?

The comment token is created in the first paragraph of the markup
declaration open state.

HTH,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 12 December 2008 17:15:16 UTC
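
The eight-step walk in the message above can be reproduced with a small trace function. This is only a rough sketch, not the specification's algorithm: it covers just the states the message names, and everything else raises NotImplementedError. The function name `trace_tokenizer`, the `EOF` sentinel, and the shape of the returned values are assumptions made for illustration. The handling of EOF in the comment start state is likewise an assumption modelled on the comment state.

```python
EOF = None  # hypothetical sentinel for end of input


def trace_tokenizer(text):
    """Walk `text` through a small subset of the tokenizer states.

    Returns (steps, errors, comments). `steps` is a list of
    (state, consumed) pairs, `errors` lists the states in which a parse
    error was recorded, and `comments` holds emitted comment token data.
    """
    state, pos = "data", 0
    steps, errors, comments = [], [], []
    comment = None  # data of the comment token currently being built

    while True:
        ch = text[pos] if pos < len(text) else EOF
        consumed, next_state = ch, state

        if state == "data":
            if ch == "<":
                next_state = "tag open"
        elif state == "tag open":
            if ch != "!":
                raise NotImplementedError("only '<!' is sketched here")
            next_state = "markup declaration open"
        elif state == "markup declaration open":
            if not text.startswith("--", pos):
                raise NotImplementedError("only comments are sketched here")
            consumed = "--"
            comment = ""  # the comment token is created in this state
            next_state = "comment start"
        elif state in ("comment start", "comment"):
            if ch is EOF:
                # Parse error: emit what we have and return to data state.
                errors.append(state)
                comments.append(comment)
                next_state = "data"
            elif ch == "-":
                if state == "comment start":
                    next_state = "comment start dash"
                else:
                    next_state = "comment end dash"
            else:
                comment += ch
                next_state = "comment"
        else:
            raise NotImplementedError(f"state {state!r} is not sketched here")

        steps.append((state, consumed))
        if state == "data" and ch is EOF:
            return steps, errors, comments
        if consumed is not EOF:
            pos += len(consumed)
        state = next_state


if __name__ == "__main__":
    steps, errors, comments = trace_tokenizer("<!--foo")
    for number, (state, consumed) in enumerate(steps, start=1):
        print(number, state, repr(consumed))
    print("parse errors recorded in:", errors)
```

For "<!--foo" the trace never enters the comment start dash state, and the only parse error is recorded when EOF is reached in the comment state.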