Re: errors and failures (was Re: Minutes of 27 May 2002 TAG teleconf) from Chris Lilley on 2002-05-29 (www-tag@w3.org from May 2002)

From: Chris Lilley <chris@w3.org>
Date: Wed, 29 May 2002 18:14:44 +0200
To: www-tag@w3.org, "Simon St.Laurent" <simonstl@simonstl.com>
Message-ID: <18611676078.20020529181444@w3.org>
On Tuesday, May 28, 2002, 5:36:49 PM, Simon wrote:

SSL> Perhaps I'm jumping the gun on this one, but there does appear to be a
SSL> substantial set of architectural issues surrounding error-handling on
SSL> the Web.

Agreed.

SSL> I find the TAG's uneasiness in addressing this issue deeply
SSL> troubling.

No, it's an unease in addressing something which is not crisply stated
as an overall architectural issue. Error handling is rather a
case-by-case situation; what the TAG needs is a good problem statement
that is general rather than specific, and where we can issue a finding
that is universally architecturally applicable.

It's hard to see how any of the three possible solutions [1] could be
provided with a straight face if the general topic is "error recovery"
on all systems and all specs and all cases.

SSL>  Error-handling is often discussed as one of the best
SSL> features of the Web, permitting imperfect communications to take place
SSL> in a clearly-understood set of contexts.

Um. Depends on what the handling is.

SSL> HTTP's 404 Error -

That's a status message, not an error.  Nor does it count as error recovery.

SSL> and others like it - were initially viewed with
SSL> disdain by closed-system hypertext people who were used to much more
SSL> tightly-controlled universes where they could prevent such errors. 
SSL> Nowadays I think we all recognize that humans are capable of dealing
SSL> with a 404 Not Found Error,

Yes, because it is an informative status message and not error recovery.

SSL> On the HTML side, however, the lax approach to structure which initially
SSL> made it easy to get pages onto the Web is now strangling us

Well put.

SSL> as
SSL> developers try to do more with the Web.  Dynamic HTML and its brethren
SSL> were an early sign to one group of Web developers that more careful
SSL> coding was necessary.  Browser-war madness where vendors tried to keep
SSL> up with each other's idiosyncratic corner-case handling was another
SSL> consequence.

And probably started the trend where content development tools gave up
on providing their own preview or wysiwyg pane and instead embedded
one of said browsers as their preview, because the developer cost for
reverse engineering the corner cases (which are the majority) is
simply staggering. So authoring tools punt this work to the content
developers, who suffer in consequence.


SSL>  Perhaps the saddest consequence of all is the large number
SSL> of tools for working with HTML (as HTML, not just text) which still spit
SSL> out poorly-formed and not valid HTML with the understanding that it's
SSL> the browser's job to cope.

Yes.

SSL> XML seemed to signal a change in this approach,

The effects of this being immediately apparent in SVG authoring tools.
I am not aware of a single one that generates non-well-formed content.
And if it did, the viewers would immediately complain and halt on such
content. So far, no complaints and much praise for this approach from
the SVG content creation community.
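
As a rough illustration of that report-and-halt behaviour, here is a
minimal Python sketch. It assumes the standard library XML parser as a
stand-in for an SVG viewer; the function name and the two sample
strings are hypothetical, not taken from any real tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents: the second leaves <rect> unclosed.
GOOD = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="10" height="10"/></svg>'
BAD = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="10" height="10"></svg>'


def check_well_formed(source: str) -> tuple[bool, str]:
    """Return (True, "") if source parses, else (False, parser message)."""
    try:
        ET.fromstring(source)
    except ET.ParseError as err:
        # Report the error; make no attempt to guess what was meant.
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    for name, doc in (("good", GOOD), ("bad", BAD)):
        ok, message = check_well_formed(doc)
        print(name, "well-formed" if ok else f"rejected: {message}")
```

The caller gets the parser's message back and no partial tree, which is
the behaviour the authoring tools end up relying on.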

SSL> taking much of HTTP's
SSL> "it's okay to show errors to users" instead of HTML's "do your best no
SSL> matter how potent the stench".


Grin.

SSL>  The XHTML specification seems to follow
SSL> the XML route, but implementations are still catching up.

The XHTML 1.0 and 1.1 specifications have the fatal error that they
still try to be compatible with the legacy browsers. Thus, they lose
both ways. On the one hand, they can't use all of XML because the
browsers will fall over on simple things like entity declarations in
an internal subset. And on the other hand, the single largest corpus
of stuff that looks like XML but isn't well formed is allegedly
"XHTML" pages.
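
For concreteness, here is a small sketch of the kind of "simple thing"
meant: an entity declared in an internal subset. It assumes Python's
standard library parser as an example of a conforming XML processor;
the entity name, its value, and the helper function are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical document: the DOCTYPE carries an internal subset that
# declares one general entity, which the content then references.
DOC = """<?xml version="1.0"?>
<!DOCTYPE html [
  <!ENTITY site "Example Site">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>&site;</title></head>
  <body><p>Welcome to &site;.</p></body>
</html>"""


def title_text(source: str) -> str:
    """Parse the document and return the expanded text of its <title>."""
    root = ET.fromstring(source)
    title = root.find("x:head/x:title", XHTML_NS)
    return title.text if title is not None and title.text else ""


if __name__ == "__main__":
    print(title_text(DOC))
```

An XML processor expands the entity reference while parsing; a legacy
HTML parser fed the same bytes has no such step.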

The registration of application/xhtml+xml for XHTML 2.0 offers a
realistic way forward out of this mess.

SSL> Is error-handling an architectural issue for the Web?  I believe it
SSL> clearly is.

So please state it in the form of a question to which the TAG can
issue a finding. Are you talking of error reporting or error recovery,
for example?

SSL>  404 Not Found is a critical component of HTTP processing, a
SSL> key element of the Web's use of hypertext.  It doesn't always work, and
SSL> it's okay to admit it.  Web site developers have learned to cope with
SSL> this problem and haven't been screaming that browsers should just fix
SSL> it.

SSL> On the content side, it seems like it's (past) time to take a similar
SSL> approach.  Applications which purport to use XML are required by the XML
SSL> 1.0 spec to fail when presented with malformed XML.  (Validity is
SSL> treated separately.)  Should that foundation also be optional?

No. The error in XHTML 1.x was to allow XML content to be treated with
non-XML tools and somehow expect this to be useful. On the other hand,
given that XML, unlike HTML, has

- no universal fragment identifier syntax
- no universal linking and embedding syntax

it's difficult to argue how that error could have been avoided. Maybe
if (simple, 1.0) versions of XLink and XPointer had been available in,
say, 1999 things would have been different. Currently, with generic
my-own-namespace XML and a PI to a CSS stylesheet, it's possible using
a halfway decent implementation to get a static picture of a web page
where links in and out do not work. This is hardly a tempting target
for content developers.

SSL>  If it
SSL> is, the consequences for the Web architecture are pretty far-reaching,
SSL> removing much promise for machine-processing of information.  If that
SSL> foundation is optional for XHTML,

It's not optional, but there are issues when the content and the
processors are allowed to use different models.

SSL> should it be optional for SOAP?  RDF?

SSL> I'd strongly recommend that the W3C take a much harder-line on error
SSL> reporting for content than it did up to HTML 4.0, and that it bring
SSL> error-handling for content into the same general framework as
SSL> error-handling for transfer.

Sure. Already doing that, see for example
http://www.w3.org/TR/SVG/implnote.html#ErrorProcessing
http://www.w3.org/TR/SVG/conform.html

It's hard to generalize from the specific case (which is best handled
by the specific Working Group) to the general architectural case (best
handled by the TAG). But I invite Simon or Rob or anyone else who feels
motivated to try, to restate the issue in a way that the TAG can
sensibly expect to issue a finding in a meaningful timeframe.

SSL>  I'm sure that some of the membership will
SSL> be deeply unhappy, but the Web as a whole will be better able to advance
SSL> into new areas.

Well said.


[1]
1) Never recover from errors. Report them and die, always.
2) Tell the user, then recover from errors like this: ....
3) Recover from errors or not, and do it however you please,
   and don't document how you do it and don't tell the user.
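
The three options can be sketched as policies applied to one parse
step. This is a minimal Python illustration only: the enum, the
function, and in particular the recovery action (substituting an empty
placeholder element) are hypothetical, since option 2 above leaves the
actual recovery unspecified.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Policy(Enum):
    HALT = 1            # option 1: report the error and stop
    REPORT_RECOVER = 2  # option 2: tell the user, then recover
    SILENT = 3          # option 3: recover without telling anyone


def process(source: str, policy: Policy, log: list[str]) -> ET.Element:
    """Parse source, handling a well-formedness error per the policy."""
    try:
        return ET.fromstring(source)
    except ET.ParseError as err:
        if policy is Policy.HALT:
            raise  # the caller sees the parser's error and nothing else
        if policy is Policy.REPORT_RECOVER:
            log.append(f"recovered from: {err}")
        # Hypothetical recovery, chosen only for illustration:
        # replace the broken document with an empty placeholder.
        return ET.Element("placeholder")


if __name__ == "__main__":
    messages: list[str] = []
    result = process("<a><b></a>", Policy.REPORT_RECOVER, messages)
    print(result.tag, messages)
```

Under option 3 the caller cannot distinguish the placeholder from real
content, because nothing is logged and nothing is raised.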

-- 
 Chris                            mailto:chris@w3.org
Received on Wednesday, 29 May 2002 12:15:23 UTC