HTML 5 removed "numeric character reference" term - why?

HTML 5 redefines "character entity reference" in such a way as to unify the 
term with what used to be known in SGML and XML as numeric character 
references, reflecting a common conflation made by people who don't know what 
"entity" means.

To retain alignment with XML and 30 years of document processing, I suggest 
not inventing an HTML 5-specific definition of entity, and instead continuing 
to use the correct and distinct terms:

"Character entity references" are a class of "entity references" -- references 
to named entities (a fairly well-defined concept in SGML and XML) -- where the 
entities consist of single characters.

"Numeric character references", in contrast, are references to characters that 
use explicit numeric code points rather than referring to named abstractions.
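
To make the distinction concrete, here is a rough sketch in Python that tells 
the two syntactic forms apart. The names ENTITY_REF, NUMERIC_REF and classify 
are my own illustrations, not taken from any spec, and the name pattern is 
deliberately simplified to ASCII letters and digits (XML names allow more 
characters than that).

```python
import re
from html import unescape

# Named form: "&" name ";" -- a reference to a declared entity.
# Simplifying assumption: names are ASCII letters and digits only.
ENTITY_REF = re.compile(r"&[A-Za-z][A-Za-z0-9]*;")

# Numeric form: "&#" decimal ";" or "&#x" hex ";" -- a reference to a
# code point, with no entity declaration involved.
NUMERIC_REF = re.compile(r"&#(?:[0-9]+|[xX][0-9A-Fa-f]+);")


def classify(ref: str) -> str:
    """Return which kind of reference `ref` is, judged by syntax alone."""
    if NUMERIC_REF.fullmatch(ref):
        return "numeric character reference"
    if ENTITY_REF.fullmatch(ref):
        return "entity reference"
    return "not a reference"


if __name__ == "__main__":
    # All three spellings resolve to the same character, "&", but only
    # the first goes through a named entity.
    for ref in ("&amp;", "&#38;", "&#x26;"):
        print(ref, "->", classify(ref), "->", unescape(ref))
```

The syntactic check says nothing about whether a named entity is actually 
declared, or whether it expands to a single character; that is exactly the 
part that depends on the entity concept and not on the reference syntax.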

If the change to just call them all "character entity references" was 
deliberate, why not put a note of explanation in the spec?

Received on Monday, 18 June 2007 22:12:52 UTC