Re: Parsing problem with misnested tags

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 2 Dec 2008 04:02:56 +0000 (UTC)
To: Philip Taylor <pjt47@cam.ac.uk>
Cc: HTML WG <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0812020400400.17414@hixie.dreamhostps.com>

On Tue, 11 Nov 2008, Philip Taylor wrote:
> 
> HTML5 (or at least html5lib and validator.nu) currently parses
> 
>   A<code><pre>B</code></pre>C
> 
> into
> 
>   |     "A"
>   |     <code>
>   |       <pre>
>   |         "B"
>   |       "C"
> 
> In particular, the "C" is inside the <code>. In browsers (IE6, FF3, O9.6,
> S3.0) the "C" is outside the <code> instead.
> 
> This significantly breaks
> http://blogs.sun.com/bblfish/entry/rest_apis_must_be_hypertext which currently
> says
> 
>   <p>... we have the following JSON representation for a Person.</p>
>   <code><pre>
>   {  ... }
>   </code></pre>
>   <p>Note that ...</p>
> 
> since it results in the last paragraph having a monospace font when it 
> shouldn't.

It was intentional that <code> wasn't in the list of formatting elements; 
the goal is to keep the number of tags that get the special misnesting 
handling to an absolute minimum (primarily for performance reasons, but 
also to reduce the amount of badness on the Web).

Since <code> is causing problems, I've added it to the list of formatting 
elements. However, the same problem exists for _any_ element that isn't 
mentioned explicitly in the parser.
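
To illustrate why any element not mentioned in the parser behaves this 
way, here is a rough Python sketch of the generic end tag handling. It is 
a toy model under stated assumptions, not the actual tree builder: the 
SPECIAL set is an illustrative subset, the names parse() and dump() are 
made up for this example, and it deliberately does not implement the 
handling that formatting elements get.

```python
# Toy model of generic end-tag handling. Not the real HTML5 tree builder.
# SPECIAL is an illustrative subset of block-like elements (assumption).
SPECIAL = {"pre", "p", "div"}


def parse(tokens):
    """tokens: list of ("text", s), ("start", name) or ("end", name)."""
    root = {"name": "#root", "children": []}
    stack = [root]  # stack of open elements; stack[-1] is the current node
    for kind, value in tokens:
        if kind == "text":
            stack[-1]["children"].append(value)
        elif kind == "start":
            node = {"name": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            # End tag: walk up from the current node looking for a match.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i]["name"] == value:
                    del stack[i:]  # pop up to and including the match
                    break
                if stack[i]["name"] in SPECIAL:
                    break  # a block element is in the way: ignore the tag
    return root


def dump(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = []
    for child in node["children"]:
        if isinstance(child, str):
            lines.append("  " * depth + '"' + child + '"')
        else:
            lines.append("  " * depth + "<" + child["name"] + ">")
            lines.extend(dump(child, depth + 1))
    return lines


# Philip's example: A<code><pre>B</code></pre>C
tokens = [
    ("text", "A"), ("start", "code"), ("start", "pre"), ("text", "B"),
    ("end", "code"), ("end", "pre"), ("text", "C"),
]
print("\n".join(dump(parse(tokens))))
```

In this model the </code> is dropped because <pre> sits between it and the 
open <code>, so the <code> stays open and "C" lands inside it. Swapping 
"code" for any other name the sketch does not know about gives the same 
shape.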

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 2 December 2008 04:03:32 UTC
