- From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
- Date: Sun, 03 Sep 2006 11:26:20 +1000
- To: John Boyer <boyerj@ca.ibm.com>
- CC: public-appformats@w3.org, www-forms@w3.org
John Boyer wrote:
> Responses to a few people:
>
> JB:
> 2) Why do you say "text/html is not XML"?
>
> Lachlan:
> Um. Because it's not! See earlier in the thread where it was mentioned
> that XHTML documents served as text/html are not treated as XML, but
> rather as any other erroneous HTML document, in tag-soup parsers.
>
> JB: Exclamation is not explanation. XHTML served as text/html are not
> treated as XML because your current code makes no effort to attempt
> that first. In my earliest posts on this subject, I said that an
> application should lead with the attempt to parse XML, then follow
> with recovery strategies, or that it could try HTML first until it
> found "new features" then switch to an attempt to use XML.

I get the feeling you're basing this and other arguments on the fallacy
that text/html can be treated as XML because RFC 2854 (or any other
spec) doesn't explicitly define it as not being XML, and because XHTML
1.0 can be *compatible* with HTML4 browsers.

It's exactly like saying text/plain can be treated as HTML because it
isn't explicitly defined as not being HTML, and HTML source code can be
sent as text/plain. Unfortunately, IE does exactly that, and I'm sure
you're aware of the mess that has caused!

Content sniffing for (X)HTML is not defined anywhere, and it is not
endorsed by the HTML WG, who have previously stated that XHTML served
as text/html should be treated as HTML; and major browser vendors have
already decided that they cannot and will not implement such a feature
now.

Any solution developed *must not* ignore the major desktop browser
vendors. To do so would only further divide the two camps and result in
another specification from you that will be ignored by both browser
vendors and authors, making it effectively useless in the real world.

The solution must also be compatible with the current state of the web.
If mainstream browsers did actually attempt what you suggest, parsing
real-world text/html content as XML and switching to HTML at the first
well-formedness error, then in approximately 99.9% of all cases (if not
more!) the browser would simply be wasting time with an XML parser when
it's just going to end up using the tag-soup parser anyway.

Given that, and the other technical reasons given by Anne and Henri,
it's time to give up the idea that text/html content can be treated as
XML in the real world and move on.

> The explanation for why not to do it this way has so far been "Cuz we
> don't wanna!"

No, my arguments have been based on technical reasons, specs and
evidence of real-world authoring habits.

> On the technical side, Mark B has already shown it works, and Raman
> described an even smoother technique that would allow an even more
> graceful degradation.

Unfortunately, the "it works for me" argument they've given simply
doesn't hold up. It is you, and the others in your camp, who have been
arguing "Cuz we can" and presenting evidence that relies on *undefined*
handling of XML in text/html.

> Anne: Partially because a pretty large community (as I perceive it
> anyway) considers that to be harmful. I also don't really see the
> point in doing failure recovery when parsing XHTML, except perhaps
> for debugging...
>
> JB: Declaration isn't explanation either. Why do you consider it
> harmful?

One simply has to look at real-world evidence of XHTML served as
text/html to see how many fatal mistakes are made by millions of
authors, mistakes which would make any real switch to XML incredibly
painful. Such mistakes include (among others) well-formedness errors,
character encoding issues, and scripts and stylesheets relying on HTML
handling.
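To illustrate, here is a rough Python sketch (the function and class
names are mine and purely hypothetical, not any browser's actual code)
of the "lead with XML, fall back on error" strategy applied to a
typical snippet: a single unescaped ampersand, among the most common
authoring mistakes, kills the XML attempt outright, and the tag-soup
parser has to take over anyway.

  # Hypothetical sketch only, not any browser's real implementation:
  # attempt a strict XML parse first, fall back to a forgiving
  # tag-soup parser at the first well-formedness error.
  import xml.etree.ElementTree as ET
  from html.parser import HTMLParser

  class TagSoupCollector(HTMLParser):
      # Forgiving parser: records events rather than rejecting input.
      def __init__(self):
          super().__init__()
          self.events = []
      def handle_starttag(self, tag, attrs):
          self.events.append(("start", tag, attrs))
      def handle_data(self, data):
          self.events.append(("data", data))

  def parse_text_html(markup):
      try:
          return ("xml", ET.fromstring(markup))  # the "lead with XML" attempt
      except ET.ParseError:
          soup = TagSoupCollector()              # the pass that actually
          soup.feed(markup)                      # succeeds on typical content
          return ("tag-soup", soup.events)

  # An unescaped '&' in a URL: harmless in tag soup, fatal to XML.
  print(parse_text_html('<a href="view?id=1&page=2">next</a>')[0])  # tag-soup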
> The problem here is that sometimes folks are advocating for relaxed
> graceful degradation and at other times rigid adherence to rules...

Graceful degradation and adherence to the rules are not mutually
exclusive goals. There is no problem here.

> ...that have little justification other than preventing a useful
> migration from happening over time.

There have been plenty of reasons given, and none of them have anything
to do with preventing migration to XML.

> Elliotte Harold: In a typical browser, yes. However I routinely
> download such pages with non-browser tools based on XML parsers; and
> there the results are quite different. In these contexts, the
> XML-nature of these pages is very useful to me.

See above. We cannot ignore typical browsers when developing a
solution. Also, what you do with a file once you've downloaded it for
offline use is up to you. There are no interoperability concerns with
that, and it is not relevant to this discussion.
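As an aside, that offline use case only works because those particular
pages happen to be well-formed. A minimal sketch of the kind of
non-browser processing Elliotte describes (the file name is
hypothetical, and a single well-formedness error would abort it
immediately):

  # Sketch of offline processing of a saved, well-formed XHTML page
  # with a plain XML parser; "saved-page.xhtml" is a hypothetical file.
  import xml.etree.ElementTree as ET

  XHTML = "{http://www.w3.org/1999/xhtml}"
  tree = ET.parse("saved-page.xhtml")
  for link in tree.iter(XHTML + "a"):  # namespace-qualified lookup
      print(link.get("href"), (link.text or "").strip())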
-- 
Lachlan Hunt
http://lachy.id.au/

Received on Sunday, 3 September 2006 01:26:35 UTC