Re: XHTML character entity support

From: John Cowan <cowan@ccil.org>
Date: Tue, 3 Nov 2009 14:44:51 -0500
To: Boris Zbarsky <bzbarsky@MIT.EDU>
Cc: Shelley Powers <shelley.just@gmail.com>, Henri Sivonen <hsivonen@iki.fi>, Simon Pieters <simonp@opera.com>, Geoffrey Sneddon <gsneddon@opera.com>, John Cowan <cowan@ccil.org>, "public-xml-core-wg@w3.org" <public-xml-core-wg@w3.org>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <20091103194451.GA14232@mercury.ccil.org>
Boris Zbarsky scripsit:

> As I understand it, XML core allows several different parser behaviors 
> here (ranging from "report a well-formedness error" to "load the DTD and 
> expand entities using it" to "use a local catalog" to "just show the 
> unexpanded entities as text").  I could be wrong in this understanding, 
> of course; please correct me if I am.

Well, it's not conformant for an XML parser to just report the entity
reference as text, but it can report it as an unknown entity reference,
and the parser's caller can then convert it to text.

(TagSoup, my SAX HTML parser, does return unknown entities as text,
but it deliberately doesn't conform to the XML recommendation.)
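
To illustrate the split described above between the parser reporting an unknown entity reference and the caller turning it into text, here is a minimal sketch using Python's expat binding. The function name and sample document are illustrative assumptions, not something from this thread. It relies on expat calling SkippedEntityHandler for an undeclared entity when the document names an external DTD that the parser did not read.

```python
import xml.parsers.expat


def text_with_unknown_entities(document):
    """Return the character data of `document`, keeping skipped
    entity references as literal "&name;" text (the caller's choice)."""
    pieces = []
    parser = xml.parsers.expat.ParserCreate()

    # Ordinary character data is collected as-is.
    parser.CharacterDataHandler = pieces.append

    def skipped(name, is_parameter_entity):
        # The parser only *reports* the reference; converting it
        # back to text is a decision made here, by the caller.
        if not is_parameter_entity:
            pieces.append("&" + name + ";")

    parser.SkippedEntityHandler = skipped
    parser.Parse(document, True)
    return "".join(pieces)


# The DOCTYPE names an external DTD that is never fetched, so the
# declaration of &nbsp; is unknown to the parser.
SAMPLE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<p>a&nbsp;b</p>"
)

if __name__ == "__main__":
    print(text_with_unknown_entities(SAMPLE))
```

Without the DOCTYPE, the same reference is a well-formedness error and expat raises ExpatError instead of calling the handler.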

> If I understand correctly, browsers have by and large chosen a 
> particular behavior: using a local catalog for particular DTDs.  The 
> suggestion is to define that those particular DTDs should use a specific 
> local catalog and what that local catalog is.
> 
> Since the DTDs involved are the various XHTML DTDs, it seems that this 
> group might be the one tasked with such definition.

Indeed.

> >Besides, the point is moot: XHTML5 does not have a DTD, so only the five
> >predefined entities work with it.
> 
> Agreed; this discussion isn't about XHTML5 per se.

But that does not have to be so, if the HTML5 group decides otherwise.
A DTD could be provided, and if it had a standard public or system
identifier, standard XML catalog software could be used to cache the DTD
(or indeed certain DTDs could be hardwired).  This would be particular
to XHTML software, not something general to all XML parsers.
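
A rough sketch of the "hardwired" variant follows, again in Python with expat. The table HARDWIRED_ENTITIES and the function expand_with_hardwired_dtd are hypothetical names, and the table lists only three XHTML 1.0 entities for illustration. The idea is to key a built-in entity table on the DOCTYPE's public identifier rather than fetching the DTD.

```python
import xml.parsers.expat

# Hypothetical hardwired catalog: public identifier -> entity table.
# A real one would carry the full entity sets of each DTD.
HARDWIRED_ENTITIES = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": {
        "nbsp": "\u00a0",
        "copy": "\u00a9",
        "eacute": "\u00e9",
    },
}


def expand_with_hardwired_dtd(document):
    """Return the character data of `document`, expanding entities
    from the table selected by the DOCTYPE's public identifier."""
    pieces = []
    entities = {}
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = pieces.append

    def start_doctype(name, system_id, public_id, has_internal_subset):
        # Select the built-in table; nothing is fetched from the network.
        entities.update(HARDWIRED_ENTITIES.get(public_id, {}))

    def skipped(name, is_parameter_entity):
        if is_parameter_entity:
            return
        if name not in entities:
            raise ValueError("unknown entity: " + name)
        pieces.append(entities[name])

    parser.StartDoctypeDeclHandler = start_doctype
    parser.SkippedEntityHandler = skipped
    parser.Parse(document, True)
    return "".join(pieces)


XHTML_DOC = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    "<p>caf&eacute;&nbsp;&copy;</p>"
)

if __name__ == "__main__":
    print(repr(expand_with_hardwired_dtd(XHTML_DOC)))
```

Because the lookup is driven by the public identifier, a document with a different DOCTYPE gets no special treatment, which keeps the behavior specific to XHTML rather than general to all XML.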

-- 
Clear?  Huh!  Why a four-year-old child         John Cowan
could understand this report.  Run out          cowan@ccil.org
and find me a four-year-old child.  I           http://www.ccil.org/~cowan
can't make head or tail out of it.
        --Rufus T. Firefly on government reports
Received on Tuesday, 3 November 2009 19:45:39 GMT
