- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Fri, 22 Feb 2013 12:09:47 +0200
- To: "www-tag@w3.org List" <www-tag@w3.org>
On Fri, Feb 22, 2013 at 9:22 AM, Larry Masinter <masinter@adobe.com> wrote:
> The implementors of Appendix C failed to implement it correctly.

Who are you referring to?

> Documents delivered as text/html should be parsed as HTML.

This is what browsers do and have done since the episode from 2000 that
I recounted.

> Documents delivered as application/xhtml+xml should be parsed as XHTML/XML.

This is what browsers do.

>> In December of 2000,

s/December/summer/ as noted earlier.

>> before the release of Netscape 6, Gecko had an
>> HTML parser mode called the Strict DTD. The "DTD" wasn't an SGML DTD.
>> Instead, it was a C++ class that implemented the containment rules
>> declared in the SGML DTD. Strict DTD threw away markup that violated
>> the HTML 4 Strict containment rules but didn't stop parsing upon
>> error.
>
> This doesn't make sense. Why would they do such a thing?

Supporting standards was in vogue and the big thing at Netscape
relative to Microsoft at that time. I guess the "Strict DTD" was Rick
Gessner's interpretation of supporting HTML4 and SGML. After all,
HTML4 isn't clear on what should happen.

> And what does
> it have to do with XML anyway?

The "Strict DTD" was used for XHTML-as-text/html for a short period of
time in 2000. (It didn't make it to the Netscape 6 release, though.)

>> See:
>> https://groups.google.com/d/topic/netscape.public.mozilla.layout/7sdgGdjjZfU/discussion
>> (The entire thread is an interesting read with the benefit of
>> hindsight. You can see I was still an XHTML believer at that time.)
>>
>> The thread resulted in a telecon, where, among other things, it was decided:
>> "- Parse XHTML delivered as text/html using the XML content sink with
>> an HTML document. (Instead of using the Strict DTD, which we do
>> today.)"
>
> This was a serious mistake. Text delivered as text/html should be
> parsed as HTML.

Right. The decision to parse XHTML-as-text/html as XML didn't last for long.
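(As an aside, the dispatch rule both sides agree on here can be sketched in a few lines of Python. This is an illustrative toy, not anything from Gecko; the function name is hypothetical. The point it shows is that the parser is chosen from the declared media type alone, and that the XML path is draconian while the HTML path recovers from errors.)

```python
# Illustrative sketch: choose the parser from the Content-Type label,
# never by sniffing the payload.
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


def parse_for_media_type(media_type: str, payload: str):
    """Dispatch on the declared media type, as browsers do."""
    if media_type == "application/xhtml+xml":
        # XML parsing: raises on any well-formedness error.
        return ("xml", ET.fromstring(payload))

    # text/html: error-tolerant HTML parsing; never raises on bad markup.
    class TagCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.tags = []

        def handle_starttag(self, tag, attrs):
            self.tags.append(tag)

    collector = TagCollector()
    collector.feed(payload)
    return ("html", collector.tags)
```

A well-formed ("polyglot-looking") fragment parses under both regimes, but ill-formed markup such as `<p>unclosed` only survives the text/html path; under `application/xhtml+xml` the same bytes raise a parse error.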
>> That decision lasted for less than a month. IIRC, it was already too
>> late to parse even the front page of O'Reilly's xml.com as XML.
>
> A publisher reacting to a widely distributed but mistaken browser
> implementation isn't evidence of anything.

What O'Reilly did arose as the natural consequence of the behavior you
advocate: parsing text/html as HTML. It was evidence that it's not
practical to parse text/html as XML.

>> And so it has been ever since. Appendix C content wasn't transitioning
>> anywhere.
>
> This wasn't the fault of Appendix C but of confusion about how to apply it.

So if your position is that text/html must be parsed as HTML, how could
Appendix C have transitioned to XML parsing if confusion had been absent?

> I think polyglot is useful, but only if people don't try to second-guess what
> is the publisher's responsibility to label content with a content-type that
> is appropriate for parsing the content.

I agree with you that text/html should be parsed as HTML, but I don't
see how polyglot is useful if one parses text/html as HTML with a
conforming HTML parser.

--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Friday, 22 February 2013 10:10:15 UTC