
Re: Parsing problem with misnested tags

From: Philip TAYLOR (Ret'd) <P.Taylor@Rhul.Ac.Uk>
Date: Wed, 12 Nov 2008 19:18:09 +0000
Message-ID: <491B2BF1.3030007@Rhul.Ac.Uk>
To: Philip Taylor <pjt47@cam.ac.uk>
CC: HTML WG <public-html@w3.org>

I am having difficulty wrapping my head around this:

Philip Taylor wrote:
> HTML5 (or at least html5lib and validator.nu) currently parses
>   A<code><pre>B</code></pre>C
> into
>   |     "A"
>   |     <code>
>   |       <pre>
>   |         "B"
>   |       "C"

I assume (as you haven't shown them explicitly) that
there are no implied </...>s anywhere in that parse tree.
If so, I would certainly support your tacit
assertion that this behaviour is wrong.  A closure
for an outer element must surely close all inner
elements, whether or not the specification requires
that they be explicitly closed, as a normal part of
the parser's error recovery procedure.
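
The rule described above (an end tag for an outer element also closes
everything still open inside it) can be sketched in a few lines of Python.
This is only a toy tree builder under stated assumptions: it is not the
HTML5 parsing algorithm, nor what html5lib or validator.nu implement, and
the names TOKEN, parse and dump are hypothetical. It understands only bare
tags without attributes, and it silently ignores a stray end tag.

```python
import re

# Matches a bare start tag, a bare end tag, or a run of text (no attributes).
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)")

def parse(markup):
    """Toy tree builder: an end tag closes every element opened inside it."""
    root = {"name": "#root", "children": []}
    stack = [root]  # open elements, innermost last
    for closing, name, text in TOKEN.findall(markup):
        if text:
            stack[-1]["children"].append(text)
        elif not closing:
            node = {"name": name, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            # Look for the nearest open element with this name (never the root).
            for i in range(len(stack) - 1, 0, -1):
                if stack[i]["name"] == name:
                    del stack[i:]  # implicitly closes all inner elements too
                    break
            # No match found: a stray end tag, ignored in this sketch.
    return root

def dump(node, depth=0):
    """Return the tree as indented lines, one per element or text node."""
    lines = []
    for child in node["children"]:
        if isinstance(child, str):
            lines.append("  " * depth + '"' + child + '"')
        else:
            lines.append("  " * depth + "<" + child["name"] + ">")
            lines.extend(dump(child, depth + 1))
    return lines

print("\n".join(dump(parse("A<code><pre>B</code></pre>C"))))
```

Under this rule the </code> closes the <pre> as well, the later </pre> has
nothing left to close, and the "C" lands outside the <code>, as a sibling
of "A".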

However ...

> In particular, the "C" is inside the <code>. In browsers (IE6, FF3, 
> O9.6, S3.0) the "C" is outside the <code> instead.
> This significantly breaks 
> http://blogs.sun.com/bblfish/entry/rest_apis_must_be_hypertext 

I'm not sure what semantics you are ascribing to "significantly"
here: are you saying that http://blogs.sun.com/ is such a significant
site that even if it outputs crap code (which it clearly does),
browsers should bend over backwards to accommodate that crap code,
or are you making no value judgement concerning http://blogs.sun.com/
but instead saying that html5lib and validator.nu both make a major
error in their handling of its aberrant output?

(the other) Philip TAYLOR
Received on Wednesday, 12 November 2008 19:18:52 UTC
