Re: [author-guide] Character Entity References Chart

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Tue, 22 Jul 2008 04:32:25 +0200
Message-ID: <488546B9.3080408@lachy.id.au>
To: Karl Dubost <karl@w3.org>
Cc: public-html WG <public-html@w3.org>

Karl Dubost wrote:
> On 21 Jul 2008, at 22:32, Lachlan Hunt wrote:
>> XML requires validating parsers to be used in order to use entity 
>> references other than those 5 predefined by XML.
> no.
> The exact text of the XML specification is as follows:
>     Note that non-validating processors are not obligated
>     to read and process entity declarations occurring
>     in parameter entities or in the external subset;
>     for such documents, the rule that an entity must be declared
>     is a well-formedness constraint only if standalone='yes'.

This basically says that non-validating parsers can read them if they 
like, but in practice they don't and won't do so.  Browsers work around 
this issue for XHTML 1.x and MathML by cheating a little bit: they 
recognise the FPI in the DOCTYPE and load the appropriate named entity 
declarations.  For example, Firefox achieves this by parsing 
xhtml11.dtd and mathml.dtd located in resource:///res/dtd/, which only 
contain the ENTITY declarations.
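
To illustrate the non-validating behaviour outside a browser, here is a 
minimal sketch using Python's expat-based xml.etree.ElementTree.  This 
is an assumption-laden stand-in, not a browser parser: the helper name 
parse_text and the file name custom.dtd are hypothetical, and custom.dtd 
does not need to exist because the parser never fetches it.

```python
import xml.etree.ElementTree as ET


def parse_text(doc):
    """Return the root element's text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


# Entity declared in the internal subset: a non-validating parser must read it.
internal = '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>'

# Entity expected from an external subset (hypothetical custom.dtd):
# the parser does not load the DTD, so the reference cannot be resolved.
external = '<!DOCTYPE p SYSTEM "custom.dtd"><p>a&nbsp;b</p>'

internal_result = parse_text(internal)
external_result = parse_text(external)

print(repr(internal_result))
print(repr(external_result))
```

The first document resolves the reference; the second is rejected, 
which is the situation a custom external DTD runs into.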

>> Without an official DTD, no other references can be reliably used in 
>> XHTML 5.  Even if you provide your own custom DTD and DOCTYPE, most 
>> browsers don't use validating parsers and so won't be able to 
>> dereference the entity references.
> That is a false assumption. Browsers *can*, if they implement it, 
> dereference the *named* entity references.

Just because they theoretically could doesn't make my statement false, 
because browsers *don't* implement it like that, and they won't.

> Your work, Lachlan, could be the start of a nice test suite to 
> specifically do an implementation report for each user agent and see 
> which named entities are supported in text/html AND application/xhtml+xml.

Here is a test for every named character reference in HTML5 for 
text/html only.  This includes the non-conforming legacy character 
references without the trailing semi-colon as well.


I will try to make a separate set of tests later to make sure that none 
of the non-legacy entity references work without the semi-colon.

It is not worth testing these in application/xhtml+xml because there is 
no XHTML5 DTD, and so only the predefined ones will work.
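
As a rough sketch of what "only the predefined ones" means, the 
following uses Python's xml.etree.ElementTree as a stand-in for a 
non-validating XML parser with no DTD available.  The helper name 
try_parse is hypothetical, and browsers may report the error 
differently.

```python
import xml.etree.ElementTree as ET


def try_parse(doc):
    """Return the root element's text, or None on a well-formedness error."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


# The five references predefined by XML resolve without any DTD.
predefined = try_parse('<p>&amp;&lt;&gt;&quot;&apos;</p>')

# An HTML named reference such as &nbsp; is undeclared here, so the parse fails.
html_only = try_parse('<p>&nbsp;</p>')

print(repr(predefined))
print(repr(html_only))
```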

Lachlan Hunt - Opera Software
Received on Tuesday, 22 July 2008 02:36:35 UTC
