
Re: XHTML character entity support

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 03 Nov 2009 14:44:14 +0000
Cc: "Shelley Powers" <shelley.just@gmail.com>, "Geoffrey Sneddon" <gsneddon@opera.com>, "John Cowan" <cowan@ccil.org>, "public-xml-core-wg@w3.org" <public-xml-core-wg@w3.org>, "public-html@w3.org" <public-html@w3.org>
Message-Id: <CF84BB09-1717-49CA-9763-059168283DB3@iki.fi>
To: Simon Pieters <simonp@opera.com>

On Nov 3, 2009, at 16:32, Simon Pieters wrote:

>> Do you have a reference in the XML
>> specification that provides support for your contention?
>
> "If the entity is external, and the processor is not attempting to
> validate the XML document, the processor MAY, but need not, include
> the entity's replacement text. If a non-validating processor does
> not include the replacement text, it MUST inform the application
> that it recognized, but did not read, the entity."

And Opera then renders "&", the entity name and ";" in response to the
XML Processor informing it about the skipped entity.

> The point is then reiterated twice:
>
> "Note that non-validating processors are not obligated to read and
> process entity declarations occurring in parameter entities or in
> the external subset; for such documents, the rule that an entity
> must be declared is a well-formedness constraint only if
> standalone='yes'."
>
> "Certain well-formedness errors, specifically those that require
> reading external entities, may fail to be detected by a non-
> validating processor. Examples include the constraints entitled
> Entity Declared, ..."

What happens in Gecko is that the entity resolver feeds expat a
zero-length stream. Hence, expat *thinks* it has read the external
entity rather than skipped it. Therefore, it halts due to the "Entity
Declared" WFC, since it cannot claim that it did not process the
external entities.

Both the XML Processor in Opera and the XML Processor in Gecko do the
right thing per XML. The XML Processor in Gecko has been fooled into
processing a zero-length stream. The XML Processor in Opera knows it
has skipped an external entity.

One might argue that Gecko's entity resolver is bogus, but the XML
Processor isn't.

> On Tue, 03 Nov 2009 15:02:04 +0100, Shelley Powers
> <shelley.just@gmail.com> wrote:
>
>> Oops, again. Opera does generate an XML parsing failure when it comes
>> across an undefined entity when using the XHTML5 doctype.
>
> There is no "XHTML5 doctype". Any or no doctype can be used in
> XHTML5 and the spec does not give a preference.
>
> The rule only applies for entities that are declared in the external
> subset.

When no external entity is referenced, even an XML parser that skips
external entities knows it has skipped none. Therefore, it has to
report the reference to an undeclared entity and cannot appeal to
having skipped an external entity.
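
A minimal sketch of the distinction, assuming Python's
xml.parsers.expat binding as a stand-in non-validating XML Processor.
This is not the code Gecko or Opera run; the parse() helper and
hypothetical.dtd are made-up names for illustration. No external
entity handler is installed, so the external subset is never read:

```python
import xml.parsers.expat as expat


def parse(doc):
    """Return ("ok", skipped_refs) or ("error", expat_message)."""
    skipped = []
    parser = expat.ParserCreate()

    def on_default(data):
        # With no other handlers set, expat passes unhandled constructs
        # here as raw text. A reference it neither expands nor rejects
        # shows up as "&name;".
        if data.startswith("&") and data.endswith(";"):
            skipped.append(data)

    parser.DefaultHandler = on_default
    try:
        parser.Parse(doc, True)
    except expat.ExpatError as exc:
        return ("error", expat.ErrorString(exc.code))
    return ("ok", skipped)


# No doctype at all: nothing can have been skipped.
NO_DOCTYPE = '<p>&nbsp;</p>'

# External subset referenced but never read (hypothetical.dtd is a
# placeholder system identifier; it is not fetched).
EXTERNAL = '<!DOCTYPE p SYSTEM "hypothetical.dtd"><p>&nbsp;</p>'

# Same external subset, but the document declares standalone="yes".
STANDALONE = ('<?xml version="1.0" standalone="yes"?>'
              '<!DOCTYPE p SYSTEM "hypothetical.dtd"><p>&nbsp;</p>')

for label, doc in [("no doctype", NO_DOCTYPE),
                   ("external subset", EXTERNAL),
                   ("standalone=yes", STANDALONE)]:
    print(label, parse(doc))
```

With the expat that ships with CPython, the first and third documents
should stop with an "undefined entity" error, while the second should
parse and hand the application the raw "&nbsp;" reference, which is
the information a browser would need in order to render the reference
literally.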

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Tuesday, 3 November 2009 20:07:12 UTC