- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Fri, 18 Jan 2013 04:01:55 +0100
- To: whatwg@lists.whatwg.org
- Cc: David Carlisle <davidc@nag.co.uk>
David Carlisle on Fri, 18 Jan 2013 00:03:12 +0000:
> To: Ian Hickson <ian@hixie.ch>
> On 17/01/2013 23:31, Ian Hickson wrote:
>> On Thu, 17 Jan 2013, David Carlisle wrote:

>>>>> that documents will be interpreted differently by an XHTML
>>>>> user agent and a standard XML toolchain.
>>>>
>>>> I do not understand what this means. Can you give an example?

Though not XML, the trouble Anolis had with putting out the correct
glyph values for the ⟩ and ⟨ entities was caused by a part of Anolis
that interpreted those entities in the old, HTML5-*in*compatible way.
This in turn resulted in the wrong character when the entities were
converted to normal characters before being output to the HTML5 spec:

https://www.w3.org/Bugs/Public/show_bug.cgi?id=14430

This was a surprisingly long-lasting bug. (And perhaps not fully solved
yet …) It had probably existed since HTML5 included named entities in
the spec. And, as the reporter of the bug, I was asked time and again
whether the bug had been fixed or not ...

In this case, Anolis output "polyglot" character references, since it
converted the named references to numeric references. (Please ignore
HTML5's current shortcut:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=20702) But since the bug
actually was in Anolis’ list of named character references, the
conversion nevertheless misrepresented the named entities.

>>> There is more to compatibility than compatibility between the
>>> browsers. For XHTML there needs to be compatibility between
>>> Browsers and XML tools (otherwise why use XML at all, I know you
>>> would rather people didn't but so long as the spec allows then to
>>> it should not mandate a situation that makes document corruption so
>>> likely).
>>
>> There is no such mandate. The spec merely provides a catalogue of
>> public identifiers and their modern meaning. Nothing stops XML users
>> from using any other identifier, in particular SYSTEM identifiers.
>> The spec discourages people from using DTDs in general, because of
>> precisely the kinds of issues that are being discussed here, but the
>> XML spec allows it, and that's what controls this at the end of the
>> day (especially in the case of software that isn't using the HTML
>> spec's catalogue).
>>
> As I note above there are many existing systems using the Public
> identifiers of XHTML1 to refer to the XHTML1 DTD and using validating
> parsers. They can not simply switch in a catalog that makes their
> existing document collections invalid. So they can not make documents
> using the XHTML1 public identifier load a DTD other than XHTML1 DTD.

1) If the legacy XHTML DTDs are so risky, shouldn't the spec explicitly
   warn against using them when authoring XHTML5 documents?

2) David, have you considered the possibility of linking this named
   entity magic to the legacy-compat variant of the HTML5 doctype?
   http://www.w3.org/TR/html5/syntax.html#doctype-legacy-string

   The advantage of doing so would be that nothing new needs to be
   introduced. The disadvantage (but perhaps advantage in Ian's eyes)
   ;-) would be the name of this doctype variant - "legacy".
-- 
leif halvard silli
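
Illustration of the entity mismatch described above: a minimal sketch, not Anolis's actual code. It assumes Python's standard html.entities module, whose name2codepoint table carries the HTML 4 (XHTML 1 DTD) values and whose html5 table carries the HTML5 values. The helper name to_numeric is hypothetical. Converting a named reference to a numeric one is only "polyglot"-safe if the lookup table matches the spec the document targets:

```python
from html.entities import html5, name2codepoint

NAMES = ("lang", "rang")

# Legacy table: the values defined by HTML 4 and the XHTML 1 DTDs.
legacy = {name: chr(name2codepoint[name]) for name in NAMES}

# HTML5 table: keys in this dict carry the trailing semicolon.
modern = {name: html5[name + ";"] for name in NAMES}


def to_numeric(name, table):
    """Hypothetical helper: turn a named reference into a numeric one."""
    return "&#x{:X};".format(ord(table[name]))


for name in NAMES:
    # The two tables disagree, so the numeric output differs.
    print("&{};".format(name), to_numeric(name, legacy), to_numeric(name, modern))
```

With the legacy table, the two names map to U+2329 and U+232A; with the HTML5 table, they map to U+27E8 and U+27E9. A tool that converts names using the first table while targeting HTML5 therefore emits well-formed numeric references to the wrong characters.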
Received on Friday, 18 January 2013 03:02:23 UTC