
Re: [whatwg] Reading a start tag in "text" insertion mode

From: Mohammad Al Houssami (Alumni) <mha53@mail.aub.edu>
Date: Fri, 16 Aug 2013 09:43:43 +0000
To: Ian Hickson <ian@hixie.ch>
Message-ID: <0F8BA5A0576A5F44B0D188C2628265DA410A783B@DB3PRD0610MB380.eurprd06.prod.outlook.com>
Cc: "whatwg@whatwg.org" <whatwg@whatwg.org>
So this is what I was missing. My implementation does not follow the spec 100%. I built the tokenizer completely first and have only now started on tree construction. The tokenizer passes all of its tokens to the tree builder, so the two stages are separate for now. I did this to keep the complexity down. The plan was to find a way for tree construction to go back to the tokenizer once some progress had been made. So basically I can't handle this situation at the moment.
Thanks for clearing that up, Ian. :)

-----Original Message-----
From: Ian Hickson [mailto:ian@hixie.ch] 
Sent: Friday, August 16, 2013 4:24 AM
To: Mohammad Al Houssami (Alumni)
Cc: whatwg@whatwg.org
Subject: Re: [whatwg] Reading a start tag in "text" insertion mode

On Thu, 15 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
> I am building a parser incrementally, by sets of elements (and not all
> at the same time), so while debugging I noticed that the "text"
> insertion mode does not have an "anything else" branch. Let's assume my
> input is the following: <title><head>. The title start tag will lead us
> to the "text" insertion mode. And then what should happen? The
> specification doesn't deal with this case, as nothing says what should
> happen here... I think I am missing something?

The generic RCDATA element parsing algorithm puts the tokenizer into the RCDATA state, from which the only possible tokens are text tokens, end tag tokens, and end-of-file tokens. These are the same tokens that the "text" mode handles.

So you parse a <title> start tag token, you go into "text" mode, then you get six character tokens, which get inserted into the <title> element, then you get an EOF token, and you unwind the parser and end.
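
As a rough illustration of the point above (a simplified sketch, not the spec's algorithm and not anyone's actual implementation), the Python below mimics a tokenizer that has already been switched into the RCDATA state. The function name tokenize_rcdata and the tuple-shaped tokens are hypothetical. It also assumes, for brevity, that character references are ignored and that the end tag appears exactly as "</name>" with no whitespace or attributes.

```python
def tokenize_rcdata(text, end_tag_name):
    """Yield tokens for input read while in a simplified RCDATA state.

    Hypothetical helper: only character, end tag, and EOF tokens can
    ever come out, which is why the "text" insertion mode needs no
    branch for start tags.
    """
    name = end_tag_name.lower()
    closing = "</" + name + ">"
    i = 0
    while i < len(text):
        # Only the matching end tag is recognised as markup here.
        if text[i:i + len(closing)].lower() == closing:
            yield ("end_tag", name)
            return  # a real tokenizer would continue in the data state
        # Everything else, including "<" and ">", is a character token.
        yield ("character", text[i])
        i += 1
    yield ("eof", None)


# The input that follows the <title> start tag in the example: "<head>"
tokens = list(tokenize_rcdata("<head>", "title"))
for token in tokens:
    print(token)
```

Run on the remaining input "<head>", this yields six character tokens followed by an EOF token and never a start tag token. A tree builder that receives all tokens from a tokenizer run ahead of time in the data state would see a start tag here instead, which is the mismatch described in this thread.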

What token are you getting that isn't handled?

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 16 August 2013 09:44:13 UTC
