- From: Ian Hickson <ian@hixie.ch>
- Date: Sun, 22 Aug 2004 11:04:37 +0000 (UTC)
I agree with everything in Henri's e-mail explaining why spec-mandated
DOCTYPE-triggered mode switching makes no sense.

On Sun, 1 Aug 2004, Henri Sivonen wrote:
>
> On Aug 1, 2004, at 06:10, Matthew Thomas wrote:
>
> > On 31 Jul, 2004, at 11:58 PM, Henri Sivonen wrote:
> > > ...
> > > First of all, the solution needs to apply to XHTML as well as HTML. If we
> > > still assume XML is to be taken seriously (and not as tag soup), doctype
> > > sniffing on the XML side is totally, utterly bogus.
> >
> > That's a presumptive definition of "seriously".
>
> The presumption is that if lower-level spec defines two things that are
> equivalent, a higher-level spec should not try to give different meanings to
> the two things. So I'm being presumptuous only in the sense that I think
> layered spec design general best practice should be followed.
>
> In formal terms, if two XML documents have the same canonical form and an app
> treats them differently (and the difference is not due to opting not to
> process external entities), the app is broken, IMHO.
>
> In practical terms, if two XML documents cause the same content to be reported
> (qnames ignored) to SAX2 ContentHandler and an app treats the documents
> differently, the app is broken, IMHO.
>
> A spec that would explicitly or implicitly require an implementation to be
> broken is itself broken.

Indeed. And DOCTYPEs are basically optional -- in XML, everything has to
be based off namespaces. It should also be possible to construct a DOM
tree "by hand" using the DOM and have the exact same rendering as if the
DOM tree was obtained by parsing a document.

> > In the long run, it *may* be the case that treating XHTML as tag soup is the
> > only "serious" way of doing it.
>
> WHAT WG should not try to push things to that direction.

Indeed. And I disagree that treating XHTML as tag soup would ever be the
right way to do it -- if that is what people want, they should use HTML,
which is already at that stage.
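
Henri's SAX2 test quoted above can be tried directly. What follows is a minimal sketch, assuming Python's standard `xml.sax` module (an expat-based, non-validating processor) stands in for a SAX2 implementation. The `Recorder` class, the `reported_content` function, and the two sample documents are made up for illustration and come from no spec or browser.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges, feature_namespaces

XHTML_NS = "http://www.w3.org/1999/xhtml"
BODY = (
    f'<html xmlns="{XHTML_NS}"><head><title>t</title></head>'
    "<body><p>Hello</p></body></html>"
)

# Two hypothetical documents: identical content, one with a DOCTYPE, one without.
WITH_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">' + BODY
)
WITHOUT_DOCTYPE = BODY


class Recorder(ContentHandler):
    """Records what is reported to the content handler, ignoring qnames."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) pair; qname is deliberately dropped.
        self.events.append(("start", name, dict(attrs.items())))

    def endElementNS(self, name, qname):
        self.events.append(("end", name))

    def characters(self, content):
        # Merge adjacent text chunks so parser buffering does not affect the result.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))


def reported_content(document):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    # Do not resolve external entities, so the external DTD is never fetched.
    parser.setFeature(feature_external_ges, False)
    recorder = Recorder()
    parser.setContentHandler(recorder)
    parser.parse(io.BytesIO(document.encode("utf-8")))
    return recorder.events
```

Under these assumptions, `reported_content(WITH_DOCTYPE)` and `reported_content(WITHOUT_DOCTYPE)` should compare equal: the public identifier never reaches the content handler, so an application built on this interface has nothing to sniff.
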
XML has well-defined parsing rules. There's no reason not to follow them.

> > > The reason why it is bogus is that including a DTD by reference and
> > > pasting it inline are supposed to be equivalent for validating XML
> > > processor and in the latter case you don't see a public identifier for the
> > > DTD. Hence, using the public identifier for any purpose other than
> > > locating the DTD is just plain wrong. Of course, sane real-world XHTML
> > > user agents use non-validating XML processors which makes the inclusion of
> > > the doctype declaration rather pointless.
> >
> > So do any real-world XHTML UAs handle a DTD pasted inline, or is this just a
> > theoretical argument?
>
> Mozilla processes the internal DTD subset, but that was not my point.

UAs must, per XML, handle internal subsets.

> My point was that if you have
> #include "foo.h"
> you should not bind any black magic to the name foo.h, because it should be
> permissible to paste the contents of foo.h inline or copy the contents of
> foo.h to bar.h and say
> #include "bar.h"

Indeed.

> However, considering that as a Web author you cannot trust that everyone
> parsing your pages uses an XML processor that resolves external entities,
> including a doctype in XML intended for the Web is mostly pointless and often
> done out of a cargo cultish habit.

Hear hear. This is one of the many reasons that WHATWG specs actually
subtly discourage the use of DOCTYPEs.

> > > ...
> > > Now, similar argumentation does not work on the HTML side if we agree not
> > > to pretend that real SGML is being processed. Doctype sniffing is a tag
> > > soup solution to a tag soup problem.
> >
> > That's an extrapolation from a single data point. The only use of doctype
> > sniffing *so far* has been to handle quirky style/layout expectations of old
> > pages (and in the case of table style inheritance, they wouldn't even need
> > to be tag-soup pages).
> > In the long run, doctype sniffing may become a
> > general-purpose method of changing *any* undesired behavior (whether
> > de-facto or de-jure) of old syntax in new spec versions.
>
> Doctype sniffing was devised after the HTML 4 and CSS2 specs had been written
> as a heuristic to distinguish legacy documents from documents whose authors
> might expect conforming behavior.
>
> The circumstances and requirements that led to doctype sniffing were different
> from the circumstances and requirements for specs that have not yet been
> finalized. With WF2 there is no need to come up with an extension to an old
> heuristic. Now that the issue has been raised in the speccing phase we can
> have a more explicit incantation. For example: <meta
> name="mpt-approved-radio-buttons" content="true"> or <meta
> name="what-wg-behavior" content="do-the-right-thing">

Exactly. DOCTYPE sniffing was always meant to be a heuristic; a way of
detecting whether the page author had written the page before or after
browsers started seriously looking at spec compliance.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Sunday, 22 August 2004 04:04:37 UTC