Re: [whatwg] Parsing the string <html>

From: Ian Hickson <ian@hixie.ch>
Date: Sat, 3 Aug 2013 02:20:41 +0000 (UTC)
To: "Mohammad Al Houssami (Alumni)" <mha53@mail.aub.edu>, "Tab Atkins Jr." <jackalmage@gmail.com>
Message-ID: <alpine.DEB.2.00.1308030216550.27623@ps20323.dreamhostps.com>
Cc: "whatwg@whatwg.org" <whatwg@whatwg.org>
Ian wrote:
> On Fri, 2 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
> > 
> > When parsing the string <html>, the document should supposedly have an
> > html root with head and body children (this is what Live DOM Viewer
> > shows, at least), but according to the spec (if I'm not wrong) we only
> > get the document with the html element, and the stack of open elements
> > will have the html, head, and body elements in it.
> 
> The "<html>" start tag token causes you to jump from the "initial" 
> insertion mode to the "before html" insertion mode, and then the <html> 
> element is created and you jump to "before head".
> 
> You then hit the "end of file" token, and that causes the <head> element 
> to be generated, and switches you to "in head", where <head> is popped 
> and you switch to "after head", where you insert a <body> element and 
> switch to "in body", at which point you stop parsing.
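
To make that walk concrete, here is a rough sketch of the mode transitions 
for this one input. It is not the spec's algorithm: the TRANSITIONS table, 
the token strings and the trace() function are all names made up for 
illustration, and only the handful of (mode, token) pairs that this input 
actually hits are covered.

```python
# Simplified, hypothetical model of the tree construction stage.
# Each entry maps (insertion mode, token) to
# (next mode, whether the token is consumed, action performed).
# A token that is not consumed is reprocessed in the next mode.
TRANSITIONS = {
    ("initial", "<html>"): ("before html", False, None),
    ("before html", "<html>"): ("before head", True, "insert html"),
    ("before head", "EOF"): ("in head", False, "insert head"),
    ("in head", "EOF"): ("after head", False, "pop head"),
    ("after head", "EOF"): ("in body", False, "insert body"),
}


def trace(tokens):
    """Return a log of actions and mode switches for the given tokens."""
    mode = "initial"
    log = []
    pending = list(tokens)
    while pending:
        token = pending[0]
        if (mode, token) == ("in body", "EOF"):
            # End of file in "in body" is where parsing stops.
            log.append("stop parsing")
            break
        next_mode, consumed, action = TRANSITIONS[(mode, token)]
        if action is not None:
            log.append(action)
        log.append("switch to " + next_mode)
        mode = next_mode
        if consumed:
            pending.pop(0)
    return log


if __name__ == "__main__":
    for entry in trace(["<html>", "EOF"]):
        print(entry)
```

Note that the end-of-file token is never "consumed" until the very end; it 
is simply reprocessed in each new mode, which is why a single token ends 
up generating both <head> and <body>.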

On Sat, 3 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
>
> That is totally correct. But are the head and body elements added to the 
> document? So basically, when we stop parsing, the document should only 
> have the html element. Is that correct?

On Fri, 2 Aug 2013, Tab Atkins Jr. wrote:
> 
> No, the spec clearly says "Insert an HTML element..." for those as you 
> trace through the parsing.

As Tab says, when the elements are generated they are also immediately 
inserted into the document. For example, where it says:

# Insert an HTML element for a "body" start tag token with no attributes.

...in the "after head" mode, "Insert an HTML element" is a hyperlink to 
the definition of that algorithm earlier in the spec, which says:

# 1. Let the adjusted insertion location be the appropriate place for 
#    inserting a node.

...which itself basically just boils down to "inside current node, after 
its last child (if any)", followed by:

# 2. Create an element for the token in the HTML namespace, with the 
#    intended parent being the element in which the adjusted insertion 
#    location finds itself.

...followed by (skipping bits irrelevant to this case):

# 4. If it is possible to insert an element at the adjusted insertion 
#    location, then insert the newly created element at the adjusted 
#    insertion location.

...which appends the <body> element to the <html> element (after the 
<head> element, which goes through the same process earlier). When you 
append a node to another, they end up in the same Document.
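
As a minimal sketch of the steps quoted above (an illustration under 
simplifying assumptions, not the spec's actual algorithm): the Node and 
Document classes, insert_html_element() and parse_html_only() below are 
hypothetical names, the "appropriate place for inserting a node" is 
reduced to "after the current node's last child", and foster parenting, 
namespaces and attributes are ignored entirely.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []
        self.node_document = None  # the Document this node belongs to


class Document(Node):
    def __init__(self):
        super().__init__("#document")
        self.node_document = self


def insert_html_element(stack, tag):
    """Simplified 'insert an HTML element' for the given tag name."""
    # Step 1: the insertion location is inside the current node
    # (the last entry on the stack of open elements), after its last child.
    current = stack[-1]
    # Step 2: create the element with the current node as intended parent,
    # so it belongs to the same document as that parent.
    element = Node(tag)
    element.node_document = current.node_document
    # Step 4: insert the element at the insertion location.
    current.children.append(element)
    # The element then becomes the new current node.
    stack.append(element)
    return element


def parse_html_only():
    """Build the tree that results from parsing the string '<html>'."""
    doc = Document()
    # "before html": create the html element and append it to the Document.
    html = Node("html")
    html.node_document = doc
    doc.children.append(html)
    stack = [html]
    insert_html_element(stack, "head")  # "before head", on end of file
    stack.pop()                         # "in head" pops the head element
    insert_html_element(stack, "body")  # "after head", on end of file
    return doc, stack


if __name__ == "__main__":
    doc, stack = parse_html_only()
    print([child.name for child in doc.children[0].children])
    print([node.name for node in stack])
```

In this sketch both <head> and <body> end up as children of <html> and 
share its document, while the stack of open elements holds only html and 
body when parsing stops, since head was popped along the way.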

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Saturday, 3 August 2013 02:21:06 UTC