Re: XHTML and MIME from John Boyer on 2006-09-01 (public-appformats@w3.org from September 2006)

From: John Boyer <boyerj@ca.ibm.com>
Date: Fri, 1 Sep 2006 14:36:15 -0400
To: "L. David Baron" <dbaron@dbaron.org>
Cc: public-appformats@w3.org, public-appformats-request@w3.org, www-forms@w3.org
Message-ID: <OFAF671069.2342B5D3-ON882571DC.00638D94-882571DC.00663BB6@ca.ibm.com>
Responses to a few people:

JB: > 2) Why do you say "text/html is not XML"?

Lachlan:
Um.  Because it's not!  See earlier in the thread where it was mentioned 
that XHTML documents served as text/html are not treated as XML, but 
rather as any other erroneous HTML document, in tag-soup parsers.

JB: Exclamation is not explanation.  XHTML documents served as text/html 
are not treated as XML because your current code makes no attempt to do 
that first.  In my earliest posts on this subject, I said that an 
application should lead with the attempt to parse XML, then follow with 
recovery strategies, or that it could try HTML first until it found "new 
features" and then switch to an attempt to use XML.  The explanation for 
why not to do it this way has so far been "Cuz we don't wanna!"  On the 
technical side, Mark B has already shown it works, and Raman described an 
even smoother technique that would allow an even more graceful degradation.

As for being "erroneous" HTML documents here, I really thought we were 
talking about *graceful* *degradation* for legacy UAs.  As long as there 
is a modicum of grace, what happened to the idea that newer content could 
actually do some level of degrading?  The goal here is not to try to 
optimize the error cases to the point of perfection.  Moreover, with the 
appendix C guidelines for XHTML combined with making the important 
ease-of-authoring changes to XForms that *are* what we need to harvest 
from WF2, it becomes increasingly difficult to find things that aren't 
working well enough to be considered graceful degradation.

Anne: Partially because a pretty large community (as I perceive it anyway) 
considers that to be harmful. I also don't really see the point in doing 
failure recovery when parsing XHTML, except perhaps for debugging...

JB: Declaration isn't explanation either.  Why do you consider it harmful? 
The problem here is that sometimes folks are advocating for relaxed 
graceful degradation and at other times rigid adherence to rules that have 
little justification other than preventing a useful migration from 
happening over time.  Also, the failure recovery is precisely to allow for 
graceful degradation in the context of this transition period. 

Elliotte Harold: In a typical browser, yes. However I routinely download 
such pages with non-browser-tools based on XML parsers; and there the 
results are quite different. In these contexts, the XML-nature of these 
pages is very useful to me.

JB: +1, precisely my point about being able to grow the web over time in 
new and interesting ways. The enticement to XML well-formedness helps 
bring about new capabilities.  Others on this list have argued that 
slowness of adopting XML implies no demand.  No, the slowness is because 
people are inherently lazy and won't upgrade without a reason.  Give them 
a reason and they upgrade to XML.  Use new features to encourage the move 
to XML, and then you can give more reasons because the XML is there. 

L. David Baron: Quotes XML spec to say that once an error is detected, the 
XML processor must stop normal processing...

JB:  Please don't confuse an application with an XML processor. 
Applications consume XML processors and can do whatever the heck they want 
to with the result.  The XML processor MUST NOT continue normal processing 
because that is the guarantee that the application will be able to detect 
and respond to error scenarios.  For example, if an application applies an 
XML processor to a data stream, and the XML processor halts with a 
well-formedness error, the application can detect that and apply recovery 
strategies, like the one Raman suggested of attempting to convert the tag 
soup to well-formed XML.  Or, because we're talking about updated UAs and 
not legacy UAs, the updated UA could just report the error because the 
document developer exercising the new feature can *reasonably* be expected 
to try it out in an updated UA, not just a legacy UA.
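
The application-versus-processor split described above can be sketched 
roughly as follows.  This is only an illustration under stated 
assumptions, not any UA's actual code: `load_document`, `TagCollector` 
and the `strict` flag are hypothetical names, Python's standard 
`xml.etree.ElementTree` stands in for the XML processor, and the lenient 
`html.parser` stands in for whatever tag-soup recovery an updated UA 
might choose.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient fallback: records start tags without requiring well-formedness."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def load_document(text, strict=False):
    """Lead with the XML processor; decide at the application level what to
    do when it halts on a well-formedness error.

    Returns a (mode, tags) pair, where mode is "xml" or "recovered".
    With strict=True the error is reported instead of recovered from.
    """
    try:
        # The XML processor stops at the first fatal error and raises.
        root = ET.fromstring(text)
    except ET.ParseError as err:
        if strict:
            # Option 1: the updated UA simply reports the error.
            raise ValueError(f"not well-formed XML: {err}") from err
        # Option 2: the application applies a recovery strategy of its own
        # (here, a hypothetical stand-in: re-read the input leniently).
        collector = TagCollector()
        collector.feed(text)
        collector.close()
        return "recovered", collector.tags
    # Well-formed: use the XML result (namespace prefixes stripped from tags).
    return "xml", [el.tag.rsplit("}", 1)[-1] for el in root.iter()]
```

With this sketch, `load_document("<p>hi<br/></p>")` takes the XML path, 
while `load_document("<p>hi<br></p>")` trips the processor's fatal error 
and lands in the recovery branch; passing `strict=True` reports the error 
instead.  Either way the XML processor itself never continues past the 
error; the choice is made by the application that called it.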

Again, it's all about being reasonable, not rigid.

John M. Boyer, Ph.D.
Senior Product Architect/Research Scientist
Co-Chair, W3C XForms Working Group
Workplace, Portal and Collaboration Software
IBM Victoria Software Lab
E-Mail: boyerj@ca.ibm.com  http://www.ibm.com/software/

Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer





"L. David Baron" <dbaron@dbaron.org> 
Sent by: public-appformats-request@w3.org
09/01/2006 11:00 AM

To: public-appformats@w3.org, www-forms@w3.org
Subject: Re: XHTML and MIME






On Friday 2006-09-01 10:03 -0700, T.V Raman wrote:
> It's always been a mystery to me as to why   people advocating
> tag-soup continuation assert "we can build a DOM from tag-soup"
> but then immediately insist on never doing failure recovery when
> parsing  xhtml.

Because one set of undocumented non-interoperable tag soup parsing rules
is more than enough?

If there were a spec defining exactly what errors were corrected and
how, then it would be much more reasonable.  Of course, there already is
such a spec -- XML 1.0 -- and it says [1]:

# Once a fatal error is detected, however, the processor MUST NOT
# continue normal processing (i.e., it MUST NOT continue to pass
# character data and information about the document's logical structure
# to the application in the normal way).

-David

[1] http://www.w3.org/TR/2006/REC-xml-20060816/#dt-fatal

-- 
L. David Baron                                <URL: http://dbaron.org/ >
           Technical Lead, Layout & CSS, Mozilla Corporation
Received on Friday, 1 September 2006 18:36:37 UTC