Re: During HTML parsing, are *all* named character references replaced by their corresponding glyph? from Jukka K. Korpela on 2013-06-24 (public-html@w3.org from June 2013)

From: Jukka K. Korpela <jukka.k.korpela@kolumbus.fi>
Date: Mon, 24 Jun 2013 22:57:19 +0300
To: HTMLWG WG <public-html@w3.org>
CC: David Carlisle <davidc@nag.co.uk>, Šime Vidas <sime.vidas@gmail.com>
Message-ID: <51C8A49F.5050007@kolumbus.fi>

2013-06-24 14:34, Michael[tm] Smith wrote:

> David Carlisle <davidc@nag.co.uk>, 2013-06-24 10:14 +0100:
>
>> On 24/06/2013 07:04, Michael[tm] Smith wrote:
>>> True for all other elements but for <script>&amp;</script> and
>>> <style>&amp;</style>
>> How could you forget <xmp>&amp;</xmp> :-)
> The same way I also forgot <iframe>&amp;</iframe> &
> <noembed>&amp;</noembed> & <noframes>&amp;</noframes> &
> <noscript>&amp;</noscript> :)
>
I might be missing the intended meaning of the smileys, but it seems to
me that script, style, and xmp elements have special parsing rules
whereas iframe, noembed, noframes, and noscript don’t. Thus, only of
the first group can you say that &amp; is not converted to & in parsing
and therefore not replaced by the corresponding character (not glyph).

However, in those contexts &amp; is not a character reference (or, to 
use old HTML terminology, an entity reference) but just a sequence of 
characters. The special parsing rules are an exception to interpreting 
certain strings as “named character references”. Thus, it would still be 
correct to say that *all* such references are replaced by corresponding 
characters.
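
As a rough illustration of the distinction, here is a minimal sketch using
Python's standard html.parser module. That module is not a conforming HTML5
parser, so this only shows the general behaviour for script and style; the
class name TextCollector and the helper text_of are made up for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the character data the parser reports for a fragment."""

    def __init__(self):
        # convert_charrefs=True makes the parser replace character
        # references in ordinary text before calling handle_data().
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_of(fragment):
    """Return all text content reported for an HTML fragment (hypothetical helper)."""
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.chunks)


# In normal content, "&amp;" is a character reference and is replaced.
print(text_of("<p>&amp;</p>"))
# Inside script and style, the same five characters are plain text.
print(text_of("<script>&amp;</script>"))
print(text_of("<style>&amp;</style>"))
```

In the first case the parser sees a reference and reports the single
character; in the other two there is no reference to replace, only the
characters "&", "a", "m", "p", ";".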

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/

Received on Monday, 24 June 2013 19:57:45 UTC