[whatwg] Tag Soup: Blocks-in-inlines from Simon Pieters on 2006-01-25 (public-whatwg-archive@w3.org from January 2006)

From: Simon Pieters <zcorpan@hotmail.com>
Date: Wed, 25 Jan 2006 12:50:30 +0000
Message-ID: <BAY109-F147E0ECD2DC32B51B12302B4120@phx.gbl>

Hi,

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
>However, there may be a 5th option available.  Consider this, using the 
>following markup samples from the article.
>
>1.
><em><p>X</em>Y</p>
>
>BODY
>   + P
>     + EM
>       + #text: X
>     + #text: Y

Why would you drop the first EM? Why should this be parsed any differently
from sample 4? I think it should look like this instead:

BODY
   + EM
   + P
     + EM
       + #text: X
     + #text: Y

>2.
><em><p>XY</p></em>
>
>BODY
>   + P
>     + EM
>       + #text: X
>       + #text: Y

Again, I think that there should be an empty EM before the P. Why are there 
two text nodes?

BODY
   + EM
   + P
     + EM
       + #text: XY

>3.
><em><p>X</p><p>Y</p></em>
>
>BODY
>   + P
>     + EM
>       + #text: X
>   + P
>     + EM
>       + #text: Y

BODY
   + EM
   + P
     + EM
       + #text: X
   + P
     + EM
       + #text: Y

>4.
><em>X<p>Y</em>Z</p>
>
>BODY
>   + EM
>     + #text: X
>   + P
>     + EM
>       + #text: Y
>     + #text: Z

Agreed.

I don't think there's much advantage in differentiating between
"well-formed" and "malformed" markup. They should be parsed the same way to
keep things simple and predictable. Thus, <em><p>XY</p></em> should be parsed as:

BODY
   + EM
   + P
     + EM
       + #text: XY

...IMHO.
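
As a rough illustration only, the uniform rule argued for above (a P always becomes a child of BODY, and any EM still open at that point stays in the tree and is reopened inside the P) could be sketched in Python as follows. This is a toy under stated assumptions, not a real tag-soup parser: it knows only EM and P, ignores attributes and every other element, and the names Node, parse, dump, BLOCKS and INLINES are invented for the example.

```python
import re

BLOCKS = {"p"}      # block-level elements handled by this sketch (assumption)
INLINES = {"em"}    # inline elements handled by this sketch (assumption)


class Node:
    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child


def parse(markup):
    body = Node("BODY")
    container = body   # innermost open block: BODY or a P
    active = []        # inline names whose end tag has not been seen yet
    chain = []         # inline nodes currently open inside `container`

    def reopen():
        # Recreate, inside the current container, any active inline
        # that is not yet open there.
        for name in active[len(chain):]:
            parent = chain[-1] if chain else container
            chain.append(parent.add(Node(name.upper())))

    for slash, name, text in re.findall(r"<(/?)(\w+)>|([^<]+)", markup):
        name = name.lower()
        if text:
            reopen()
            (chain[-1] if chain else container).add(Node("#text", text))
        elif not slash and name in INLINES:
            reopen()
            active.append(name)
            reopen()
        elif not slash and name in BLOCKS:
            # The block goes under BODY; inlines left behind stay in the
            # tree (possibly empty) and are reopened inside the block.
            container = body.add(Node(name.upper()))
            chain = []
            reopen()
        elif slash and name in INLINES and name in active:
            i = len(active) - 1 - active[::-1].index(name)
            del active[i]
            del chain[i:]
        elif slash and name in BLOCKS and container is not body:
            container = body
            chain = []
    return body


def dump(node, depth=0):
    # Render the tree in the notation used in this thread.
    label = node.name if node.text is None else f"{node.name}: {node.text}"
    line = label if depth == 0 else " " * (2 * depth + 1) + "+ " + label
    return "\n".join([line] + [dump(c, depth + 1) for c in node.children])


if __name__ == "__main__":
    for sample in ("<em><p>X</em>Y</p>", "<em><p>XY</p></em>",
                   "<em><p>X</p><p>Y</p></em>", "<em>X<p>Y</em>Z</p>"):
        print(sample)
        print(dump(parse(sample)))
        print()
```

Running it on the four samples prints trees with the same shape as the ones proposed in this message, including the empty EM before the P in samples 1 to 3. The same code path handles the "well-formed" and "malformed" inputs.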

Regards,
Simon Pieters

Received on Wednesday, 25 January 2006 04:50:30 UTC