- From: Maciej Stachowiak <mjs@apple.com>
- Date: Sun, 01 Nov 2009 13:15:15 -0800
- To: Shelley Powers <shelley.just@gmail.com>
- Cc: Boris Zbarsky <bzbarsky@mit.edu>, Alexey Proskuryakov <ap@webkit.org>, HTML WG <public-html@w3.org>
- Message-id: <BE70AEF8-AAF3-4D3C-A066-B1BB9D8E6EC0@apple.com>
On Nov 1, 2009, at 6:13 AM, Shelley Powers wrote:

>
> This isn't a case of "breaking" the web: the specifications are clear
> in how named entities are handled. There are five predefined entities
> for XML, and several for HTML4 based on the HTML4 DTD. The addition of
> new named entities in XML is based on the use of DTDs, whether
> external or internal. There are 253 in total for XHTML based on DTDs,
> but only five of these are available to XML parsers that don't read
> external DTDs. XML Parsers do not have to read the external DTD.

Clarity of the specifications doesn't mean you can do what they say without breaking the web. The specifications say it's your choice whether to support entities from the XHTML DTD or not, but in practice content relies on browsers doing so (in part because DTD-based validators said it was ok). So there's no real choice.

> If we change the document to allow additional named entities into
> XHTML5, existing XML parsers that read DTDs (validating parsers) will
> end up throwing errors when encountering an XHTML5 document that has
> anything other than the five predefined entities. They will have to be
> edited to "special case" XHTML5, just because XHTML5 is no longer well
> formed XML.

The above wouldn't apply to documents with no doctype declaration, only ones with an XHTML 1.0 DTD. I believe I explained this in another message. (However, use of undeclared entities does not make an XML document fail to be well-formed).

> There was never an *issue of consistency before, because even though
> the browsers are not validating parsers, the doctypes they hard coded
> do have support for named entities, and therefore they are 'emulating'
> validating parsers. There is no inconsistent result between the true
> validating parser, and the faux validating parser (at least in this
> context).
[...]
>
> But there is no DTD for HTML5[1]. Not even the XHTML version. Either
> we'll have inconsistent results (and errors) if people use named
> entities, or every validating XML parser and parser library in the
> world that potentially will need to parse XHTML5 will need to be
> modified to adapt to the W3C's implementing a policy to deliberately
> create malformed XML.

This makes me think you have a different understanding of the request than I do. Here is the rule I think should be specified:

* Rule A: "XML documents that start with the XHTML 1.0 doctype or XHTML 1.1 doctype should always be parsed with the XHTML 1.x set of entities by an HTML5 UA, even if it is not otherwise a validating XML processor."

You seem to be arguing against a rule like this:

* Rule B: "XML documents that have no doctype declaration should always be parsed with the XHTML 1.0 set of entities by an HTML5 UA, even though they are not declared anywhere."

I don't believe anyone is arguing in favor of Rule B (though I could be wrong). Do you have a problem with Rule A?

Regards,
Maciej
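
A minimal sketch of the difference between Rule A and Rule B, assuming a UA picks its entity table from the doctype's public identifier. The function name `entities_for`, the regex-based doctype detection, and the use of Python's `html.entities` table as a stand-in for the XHTML 1.x entity set are all illustrative assumptions, not anything specified in this thread or in any browser.

```python
import re
from html.entities import name2codepoint

# Public identifiers of the XHTML 1.0 and XHTML 1.1 DTDs.
XHTML_1X_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
    "-//W3C//DTD XHTML 1.1//EN",
}

# The five entities every XML processor must recognise.
XML_PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "apos": "'", "quot": '"'}

# Stand-in for the XHTML 1.x entity set: the HTML 4 names plus the XML
# predefined ones (assumption: this approximates what the DTDs declare).
XHTML_1X_ENTITIES = {name: chr(cp) for name, cp in name2codepoint.items()}
XHTML_1X_ENTITIES.update(XML_PREDEFINED)

# Simplified doctype detection; a real parser would tokenise the prolog
# rather than pattern-match the raw text.
DOCTYPE_PUBLIC_ID = re.compile(
    r"<!DOCTYPE\s+\S+\s+PUBLIC\s+([\"'])(.*?)\1", re.S
)


def entities_for(document: str) -> dict:
    """Rule A: XHTML 1.x doctype -> XHTML 1.x entities, without reading the DTD.

    With no doctype (the Rule B case) only the five predefined entities apply.
    """
    match = DOCTYPE_PUBLIC_ID.search(document)
    if match and match.group(2) in XHTML_1X_PUBLIC_IDS:
        return XHTML_1X_ENTITIES
    return XML_PREDEFINED
```

Under this sketch, `&nbsp;` resolves in a document that starts with an XHTML 1.0 or 1.1 doctype, while the same reference in a document with no doctype declaration stays undeclared, which is the behaviour Rule A asks for and Rule B would change.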
Received on Sunday, 1 November 2009 21:15:50 UTC