- From: David Carlisle <davidc@nag.co.uk>
- Date: Wed, 11 Nov 2009 12:47:40 GMT
- To: public-html@w3.org
- Cc: public-xml-core-wg@w3.org
Henri wrote:

> One can use <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML
> 2.0//EN" "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd"> and
> trade compat with WebKit and Opera into the ability to use the MathML
> entities in shipped Gecko. (Here's a point where interop between
> browsers is lacking, BTW.)

Using an xhtml1+mathml2 DTD with an xhtml5+mathml3 document would work
in a browser, but it would be confusing and fragile, and it would break
in other XML pipelines. Presumably compatibility with XML workflows
would be a major reason for using the XML serialisation of html5, so
saying that you have to use a doctype that makes the document invalid
would seem pretty odd.

> I'd expect it to map the public ids listed at
> http://mxr.mozilla.org/mozilla-central/source/parser/htmlparser/src/nsExpatDriver.cpp#287
> to a bogo-DTD that defines either the XHTML 1.0 entities or the
> *latest* MathML entity set (depending on which one of the two DTD
> files is named in nsExpatDriver.cpp), and I'd expect it to map other
> public ids and lone system ids to the empty stream.

Personally I think that the spec should not mandate any particular
entity resolver. It is a fact of life with XML entities that some
systems will report errors and some will read the DTD and use the
definitions. Authors worried about that can use character data or
numeric references instead (which is a good idea in any case).

However, I think the html5 spec could suggest (perhaps even as an RFC
"should" requirement) that a system using a non-validating parser and
the XML representation served as application/xhtml+xml (act as if it)
uses a catalog that supplies a default DTD if the document has none, and
uses an entity resolver such that any external DTD is mapped to that
same default DTD, which could be essentially

  Public identifier: -//W3C//ENTITIES Combined Set//EN//XML
  System identifier: http://www.w3.org/2003/entities/2007/w3centities-f.ent

which just looks like:

  <!ENTITY Aacgr "&#x00386;" ><!--GREEK CAPITAL LETTER ALPHA WITH TONOS -->
  <!ENTITY aacgr "&#x003AC;" ><!--GREEK SMALL LETTER ALPHA WITH TONOS -->
  <!ENTITY Aacute "&#x000C1;" ><!--LATIN CAPITAL LETTER A WITH ACUTE -->
  <!ENTITY aacute "&#x000E1;" ><!--LATIN SMALL LETTER A WITH ACUTE -->
  <!ENTITY Abreve "&#x00102;" ><!--LATIN CAPITAL LETTER A WITH BREVE -->
  <!ENTITY abreve "&#x00103;" ><!--LATIN SMALL LETTER A WITH BREVE -->
  <!ENTITY ac "&#x0223E;" ><!--INVERTED LAZY S -->
  ....

a sorted list of all the entities.

Actually that file is a bit bigger than the xhtml+mathml set proposed
for html5, as it contains some ISO entity sets not normally included,
but if it was thought useful a similar sorted list could be produced
which just had the html5 entities. The format used there is mainly for
human consumption. If there was any possibility of systems really
fetching this over the web, it could of course be compressed a lot by
dropping all the white space and comments, and by using character data
rather than numeric references for the replacements.
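
As a rough illustration of the suggested behaviour (a default DTD when
the document has none, and every external DTD mapped to that same
default), here is a minimal sketch using Python's pyexpat bindings. It
is only an assumption about how a non-validating consumer might do this,
not a description of any browser. The names parse_text, resolve and
DEFAULT_DTD are made up for the example, and DEFAULT_DTD is a
three-entry stand-in for the full sorted entity list.

```python
from xml.parsers import expat

# Stand-in for the default entity set (assumption: a real system would
# ship the full sorted list; only three entries are declared here).
DEFAULT_DTD = b"""\
<!ENTITY Aacute "&#x000C1;">
<!ENTITY aacute "&#x000E1;">
<!ENTITY ac "&#x0223E;">
"""


def parse_text(document):
    """Return the character data of `document`, resolving named entities
    against DEFAULT_DTD whatever DOCTYPE the document declares (or none)."""
    chunks = []  # character data, with entity references expanded

    parser = expat.ParserCreate()
    # Read the external DTD subset even though we are not validating.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    # Ask for a DTD even when the document has no DOCTYPE at all.
    parser.UseForeignDTD(True)

    def resolve(context, base, system_id, public_id):
        # Catalog step: ignore the identifiers the document supplied and
        # always load the same default declarations.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(DEFAULT_DTD, True)
        return 1  # non-zero tells expat to carry on parsing

    parser.ExternalEntityRefHandler = resolve
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)
```

With this in place, parse_text('<p>&Aacute;vila</p>') resolves the
entity even though the document has no DOCTYPE, and a document carrying
the XHTML 1.1 plus MathML 2.0 doctype is resolved against the same
default set without the declared DTD ever being fetched.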
David

PS While I have the attention of the HTML and XML Core WGs, just a
heads-up that we hope to ask for the XML entities draft to go to last
call next week, and we would again appreciate any reviews of the spec
that the working groups, or individuals within those groups, could
give. The current editors' draft is always available at

  http://www.w3.org/2003/entities/2007doc/overview.html
Received on Wednesday, 11 November 2009 12:48:20 UTC