Re: Sudden death: request for missing input from David Durand on 1997-05-01 (w3c-sgml-wg@w3.org from May 1997)

From: David Durand <dgd@cs.bu.edu>
Date: Thu, 1 May 1997 14:51:26 -0500
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <v03007800af8ea220b0c4@[205.181.197.90]>
>Neither <foo bar=baz> or <a><b></a> are XML; XML is simple enough that
>there is a good probability that these, rather than author error, are the
>result of a broken communication link or output filter.  The proper
>response to such breakage is prompt termination without extreme prejudice
>but with a clear error signal.
Neither of these is a likely error with TCP. You can get premature
termination of TCP connections, but not dropped data in the middle of a
stream.

It could be a broken script, but in that case it's merely another case of
author error (the script author and not the document author).

>I do not want us lurching over the slippery slope where every little
>formerly-lightweight piece of useful XML client code is loaded with
>bloated guess-what-the-author-really-meant heuristics.  Empirical
>evidence would suggest that the danger of this is real.

The only convincing cases where a draconian strategy is essential (EFT,
general use of XML as a program->program communication mode) are all
cases where a custom client is required. Such a client can _always_ refuse
to accept malformed data. I might be able to hand-type input to an SMTP
server, but if I make a typo it's just rejected.
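
A minimal sketch of such a custom client, assuming Python and its standard
xml.etree.ElementTree parser (neither is prescribed anywhere in this thread,
and the name accept_message is hypothetical). It shows that strictness can be
a choice made by the consuming program rather than a rule imposed on every
XML processor:

```python
import xml.etree.ElementTree as ET


def accept_message(payload: str) -> ET.Element:
    """Parse a program-to-program XML message, refusing anything malformed.

    Hypothetical helper: the strictness lives in this client, not in a
    general-purpose browser or editor.
    """
    try:
        return ET.fromstring(payload)
    except ET.ParseError as err:
        # No repair is attempted; the sender gets a clear rejection,
        # much as an SMTP server rejects a mistyped command.
        raise ValueError(f"rejected malformed XML: {err}") from err
```

Both examples quoted at the top of this message, <foo bar=baz> and
<a><b></a>, make accept_message raise ValueError instead of returning a
tree.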

>
>Someone a few messages back proposed a policy where a browser
>has to maintain a continuous this-document-is-dogshit display
>in the presence of non-well-formed instance.  I can't at the
>moment see how to engineer the spec to achieve such a constraint;
>if we could, I suppose this would be tolerable.  At a *very bare
>minimum*, we must remove the phrase "at user option" from the
>definition of "reportable error" in section 1.3.  I can see no
>scenario in which it is ever desirable to suppress a
>well-formedness error message.

I can't recall anyone who has argued against the draconian strategy who has
not granted that deletion. It should be _required_ that parsers report
detected errors of well-formedness. It can even be _recommended_ that error
_correction_ shall only be available at user option (treating XML-consuming
applications as users).
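
One way to picture that split, as a rough sketch only: reporting is
unconditional, while correction happens only when the caller explicitly
supplies it. The function load_document and its report and recover
parameters are hypothetical names, and Python's xml.etree.ElementTree stands
in for whatever parser an application actually uses.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def load_document(
    text: str,
    report: Callable[[str], None],
    recover: Optional[Callable[[str], str]] = None,
) -> Optional[ET.Element]:
    """Parse text; always report well-formedness errors, repair only on request."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        # Mandatory: the error is reported whether or not recovery follows.
        report(f"not well-formed: {err}")
        if recover is None:
            # No correction was requested, so nothing is processed.
            return None
        try:
            # Optional: the caller opted in by passing a repair function.
            return ET.fromstring(recover(text))
        except ET.ParseError as err2:
            report(f"recovery failed: {err2}")
            return None
```

A strict application calls load_document(text, log) and gets None for bad
input; a forgiving browser passes its own recover function, but its user
still sees the report.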

It is sad, but true, that requiring bad error handling (which is the
draconian strategy) will make editors, import filters, and simple
displaying browsers look bad. We're not going to lure developers by
requiring them to make themselves look bad. We can require them to make
propagators of incorrect XML "look bad" without forcing them to hurt the
consumers of bad data _when it is not essential_ to hurt those consumers.

>We went to a lot of work to make well-formedness easy.  It is a very
>low bar to get over... much easier than producing valid HTML.  I
>cannot for the life of me see why so many people here are willing to
>tolerate gross error, and run the risk of another race-to-the-bottom a
>la HTML, when the standard required to achieve reliable interoperability
>is so easy to explain and to achieve.

  The race to the bottom can be prevented in several ways. One is that XML
is simply more useful if it's correct -- and people can always fall back to
HTML if they don't care.

   Why is specifying mandatory error notification harder to enforce than
specifying mandatory refusal to process erroneous documents?

    -- David (who doesn't want to race to the bottom, but feels that trying
to balance on a pedestal may be too difficult...)

_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Thursday, 1 May 1997 16:52:01 UTC