During HTML parsing, are *all* named character references replaced by their corresponding glyph? from Šime Vidas on 2013-06-22 (public-html@w3.org from June 2013)

From: Šime Vidas <sime.vidas@gmail.com>
Date: Sun, 23 Jun 2013 00:09:15 +0200
To: public-html@w3.org
Message-ID: <CAF=ZmuzK6oUD+-P5CnLszLrJVJ=ZJaB=Z_CnxyvkjF7sbayuqA@mail.gmail.com>

(I apologize if this is OT for this mailing list.)

From what I understand, named character references, e.g. &amp;, only exist
in HTML source code, and once the source code is parsed into the DOM, *all*
named character references are replaced by their corresponding glyphs. There
is no exception to this rule.

For instance, this source code:

<span>&amp;</span>

will produce a DOM element (of type "span") which contains a single Text
node whose text value is "&". So, not the character reference &amp; but
the actual "&" literal character.
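
As a rough way to check this outside a browser, here is a minimal sketch
using Python's standard html.parser module. Note the assumptions: this is
not a spec-compliant HTML5 parser and it builds no DOM, it only reports
the text data it sees. The names TextCollector and text_content are
hypothetical helpers made up for this illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers the text data reported by the parser."""

    def __init__(self):
        # convert_charrefs=True asks the parser to decode character
        # references before the text is handed to handle_data().
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called with the already-decoded text found between tags.
        self.chunks.append(data)


def text_content(source):
    """Return the concatenated text data found in an HTML fragment."""
    parser = TextCollector()
    parser.feed(source)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


print(text_content("<span>&amp;</span>"))  # prints: &
```

The text handed to handle_data() is the single character "&", with no
trace of the five-character reference from the source, which matches the
behaviour I describe above.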

Could you confirm that the above is correct?

-- @simevidas

Received on Saturday, 22 June 2013 22:09:41 UTC