Re: HTML/XML TF Report republished (use of entity references in XHTML)

From: David Carlisle <davidc@nag.co.uk>
Date: Thu, 22 Dec 2011 19:22:00 +0000
Message-ID: <4EF38358.1060902@nag.co.uk>
To: public-html-xml@w3.org
Thanks Norm,

One issue that has come to light since that was drafted is recorded in
bug 13409

https://www.w3.org/Bugs/Public/show_bug.cgi?id=13409

which was closed as wontfix, but which I have just re-opened.

I wonder if members of the TF may like to comment on that. Basically, as
currently defined it is impossible to have a valid XML file that uses
entity references and html5 features.

To use an entity reference you need to specify an old xhtml1 or MathML2
PublicID, see:

http://www.w3.org/TR/2011/WD-html5-20110525/the-xhtml-syntax.html#parsing-xhtml-documents

(unchanged in later drafts)

However, if you specify one of those public identifiers and process the file
in an XML toolchain with anything approaching a standard catalog setup, then
either the file will not be well formed (if, say, you use &rightarrow; but
specify an XHTML 1 DTD) or it will be well formed but suffer data corruption
(if you use &rang; but specify any of the listed DTDs, you will get the old
value, which is not in Normalization Form C, rather than the value specified
by the current spec). Plus of course, if you use any html5 feature such as
canvas, the file will be invalid (although that's arguably less important).
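
As a rough illustration of the two failure modes, here is a small Python
sketch. It assumes that the standard library's HTML 4 entity table
(html.entities.name2codepoint) is a fair stand-in for the old XHTML 1
definitions and that html.entities.html5 matches the current spec; the
helper name is_nfc is made up for this example.

```python
import unicodedata
from html.entities import html5, name2codepoint

# Assumption: the HTML 4 table stands in for the old XHTML 1 entity sets,
# and the html5 table stands in for the current spec's definitions.
OLD_RANG = chr(name2codepoint["rang"])
NEW_RANG = html5["rang;"]

def is_nfc(text):
    """Return True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text

# The two tables disagree on the character that "rang" expands to.
print("old rang: U+%04X, in NFC: %s" % (ord(OLD_RANG), is_nfc(OLD_RANG)))
print("new rang: U+%04X, in NFC: %s" % (ord(NEW_RANG), is_nfc(NEW_RANG)))

# "rightarrow" exists only in the newer table, so a DTD limited to the old
# set leaves the reference undefined.
print("rightarrow in old table:", "rightarrow" in name2codepoint)
print("rightarrow in new table:", "rightarrow;" in html5)
```

This only compares the entity tables; it does not run a catalog-based
toolchain, so it shows where the values differ rather than reproducing the
exact processing described above.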

If you specify a DTD that does provide the same entity definitions as
would be used by the HTML parser, say

"-//W3C//ENTITIES HTML MathML Set//EN//XML"
"http://www.w3.org/2003/entities/2007/htmlmathml-f.ent"

then the XML parser, as specified in the HTML spec, will not load the
external DTD subset, so any use of entities is a fatal error.
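
The same kind of failure can be seen with any XML parser that does not fetch
the external subset. The sketch below uses Python's xml.etree.ElementTree
purely as a convenient example of such a parser, not as a model of what a
browser does; the document string and the helper name parse_error are
invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical test document: the DOCTYPE names the entity set quoted above,
# but ElementTree's parser never retrieves the external subset.
DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//ENTITIES HTML MathML Set//EN//XML" '
    '"http://www.w3.org/2003/entities/2007/htmlmathml-f.ent">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><p>a &rightarrow; b</p></body></html>'
)

def parse_error(text):
    """Return the parser's error message, or None if text parses cleanly."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None

# The entity is declared only in the unloaded external subset, so the
# reference cannot be resolved and parsing stops with an error.
print(parse_error(DOC))
```

Replacing the entity reference with the literal character or a numeric
character reference lets the same document parse, which is the usual
workaround when the external subset cannot be relied on.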



David
Received on Thursday, 22 December 2011 19:22:33 GMT