[Bug 13409] Defining Entity references for characters in XHTML. from bugzilla@jessica.w3.org on 2012-10-16 (public-html-bugzilla@w3.org from October 2012)

From: <bugzilla@jessica.w3.org>
Date: Tue, 16 Oct 2012 20:55:24 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-13409-2486-KhNuM02aNJ@http.www.w3.org/Bugs/Public/>
https://www.w3.org/Bugs/Public/show_bug.cgi?id=13409

--- Comment #24 from David Carlisle <davidc@nag.co.uk> ---
(In reply to comment #23)
> Test case:
> 
> http://intertwingly.net/tmp/bug13409.xhtml
> 
> Current status is that the latest release of Opera will resolve the &sect;
> entity reference, and that the latest releases of IE, Firefox, Safari, and
> Chrome will not.

So clearly the failing ones are following the spec as written currently.

I can't tell from here whether Opera is just loading the entities from
somewhere or (as I suspect) falling back to HTML parsing if XML parsing fails.
Either way, that isn't conforming to the current draft.

That document (despite my comment that you quote below) is, however, not a
particularly good example of the issue at hand: it is not a well-formed XML
document and would give a fatal parse error if given to a standard XML parser,
as the entity is not defined.
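
To illustrate the well-formedness point, here is a minimal sketch using
Python's standard xml.etree.ElementTree (an expat-based, non-validating
parser). The sample strings and the helper name is_well_formed are
hypothetical stand-ins, not the contents of the test case above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if a stock XML parser accepts the text, False on a fatal error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples, not the actual test document.
# No DOCTYPE at all, so &sect; is an undeclared entity.
UNDECLARED = '<p xmlns="http://www.w3.org/1999/xhtml">&sect; 1</p>'
# A numeric character reference needs no declaration.
NUMERIC = '<p xmlns="http://www.w3.org/1999/xhtml">&#167; 1</p>'
# Declaring the entity in the internal subset makes the reference legal.
DECLARED = '<!DOCTYPE p [<!ENTITY sect "&#167;">]><p>&sect; 1</p>'

if __name__ == "__main__":
    for label, sample in [("undeclared", UNDECLARED),
                          ("numeric", NUMERIC),
                          ("declared", DECLARED)]:
        print(label, is_well_formed(sample))
```

Only the first sample is rejected; the other two carry everything the parser
needs inside the document itself.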

A better example is the MathML2 spec given in comment 3, which is a well-formed
XML document and was parsed as such by all relevant browsers at the time (IE,
Firefox, Netscape), those being the only ones of that era with any MathML
support. IE loaded the DTD specified, and Gecko at the time loaded its own
entity definitions given a <!DOCTYPE of that form. The browsers changed to
match an HTML5 draft, thus _breaking_ existing content.


> 
> Given this, do we have any reason to believe that if the spec were to change
> in the manner described by this bug report that this behavior will meet the
> published exit criteria for the HTML WG:
> 
> http://dev.w3.org/html5/decision-policy/public-permissive-exit-criteria.html

Given that there is already a list of PUBLIC identifiers that trigger entity
loading, adding one more is essentially a trivial implementation change. I have
no reason to suppose it wouldn't be implemented if specified, but if it goes to
the wider WG, implementers (rather than just editors) would have the chance to
say.
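
A rough sketch of why the change is small, assuming an implementation keeps
the trigger list as a simple set lookup. The set below is only an
illustrative subset of the identifiers in the draft (the spec section on
parsing XHTML documents is authoritative), and the names
ENTITY_TRIGGER_PUBLIC_IDS and triggers_entity_loading are hypothetical.

```python
# Illustrative subset of the PUBLIC identifiers that trigger entity loading;
# the draft's own list is authoritative and longer.
ENTITY_TRIGGER_PUBLIC_IDS = frozenset({
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.1//EN",
    "-//W3C//DTD MathML 2.0//EN",
})

# The identifier suggested in this comment as one possible addition.
PROPOSED_PUBLIC_ID = "-//W3C//ENTITIES Combined Set//EN//XML"


def triggers_entity_loading(public_id, trigger_ids=ENTITY_TRIGGER_PUBLIC_IDS):
    """Return True if a DOCTYPE with this public identifier should preload entities."""
    return public_id is not None and public_id in trigger_ids


# The proposed change amounts to one more member in the set.
EXTENDED_TRIGGER_IDS = ENTITY_TRIGGER_PUBLIC_IDS | {PROPOSED_PUBLIC_ID}

if __name__ == "__main__":
    print(triggers_entity_loading(PROPOSED_PUBLIC_ID))
    print(triggers_entity_loading(PROPOSED_PUBLIC_ID, EXTENDED_TRIGGER_IDS))
```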

> 
> The reason why I ask is that adding it now just to remove it later is not a
> good use of our collective time.
> 
> --------------------------------------
> 
> Chair hat off:
> 
> re: "it would be preferable if the html5 entity definitions were also loaded
> for the standard HTML5 doctype declaration <!DOCTYPE html> or all doctypes"
> 
> doctypes like application/xhtml+xml or application/xml should be
> processable via standard XML toolchains.  This would not be the case if
> the proposal were adopted.  Example:
> 

"Processable" in that context would mean specifying an XML catalog that
supplies a default DTD. That isn't the default behaviour, but most XML parsers
can be configured to use a catalog. However, despite my earlier preference, I
don't want to argue for that version now (as I suspect that it is harder to get
consensus for it). I just argue that an additional PUBLIC identifier be added
to the list at

http://dev.w3.org/html5/spec/single-page.html#parsing-xhtml-documents

The public identifier should identify a DTD that defines the HTML5 entity set
(none of the existing ones in the list do that). The PUBLIC identifier

-//W3C//ENTITIES Combined Set//EN//XML

used on 

http://www.w3.org/2003/entities/2007/w3centities-f.ent

would be one choice but any PUBLIC identifier that does not already identify an
incompatible set of definitions would do.
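
As a sketch of what such a document might look like to a consumer that
preloads the HTML5 entity set, the following uses Python's standard
xml.etree.ElementTree with its entity table filled from html.entities.html5.
This is an assumption-laden stand-in, not how any browser implements it: the
sample document and the helper parse_with_html5_entities are hypothetical,
nothing is fetched from the system identifier, and the parser here does not
itself check the public identifier (the preloaded table is simply consulted
for references that the DOCTYPE's external subset could have declared).

```python
import xml.etree.ElementTree as ET
from html.entities import html5

PROPOSED_PUBLIC_ID = "-//W3C//ENTITIES Combined Set//EN//XML"
SYSTEM_ID = "http://www.w3.org/2003/entities/2007/w3centities-f.ent"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample document using the suggested identifier.
DOC = (
    f'<!DOCTYPE html PUBLIC "{PROPOSED_PUBLIC_ID}" "{SYSTEM_ID}">'
    f'<html xmlns="{XHTML_NS}"><body><p>&sect; 1</p></body></html>'
)


def parse_with_html5_entities(text):
    """Parse text with the HTML5 named entities preloaded into the parser."""
    parser = ET.XMLParser()
    for name, value in html5.items():
        # html5 keys look like "sect;"; the parser's table wants bare names.
        if name.endswith(";"):
            parser.entity[name[:-1]] = value
    parser.feed(text)
    return parser.close()


if __name__ == "__main__":
    root = parse_with_html5_entities(DOC)
    print(root.find(f".//{{{XHTML_NS}}}p").text)
```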

Received on Tuesday, 16 October 2012 20:55:27 UTC