Re: Error handling: yes, I did mean it

Summary:  we cannot in practice require that XML processors ignore or
discard data following the first detected error; as a result we should
not try to do so, even if doing so were a good idea (which it is not).

Tim has suggested, and a number of people have supported the idea, that
after detecting a violation of a well-formedness constraint an XML
processor be required to stop sending information to the downstream
application.  A number of people have already argued against this idea,
using arguments I agree with and won't repeat.  Here I just want to
point out that (with a single exception) neither Tim nor anyone else has
made any argument that, even if taken at face value, would lead to the
conclusion that this is a good idea.

The sole exception is Tim's rhetorical question "can any application
hope to do anything useful with ill-formed data?" to which the only
realistic answer is 'Yes, of course, many applications do hope to do
useful things with ill-formed data, and some of them are right.' James
gave this answer, and no one has attempted any refutation, so I won't
say any more about it.

Otherwise, the arguments of the Draconian camp are all centered around
the unquestioned observations that
  - there are applications where ill-formed data is useless or worse
than useless, and where ill-formedness must be detected
  - by their unwillingness to issue error messages, and their
determination to provide attractive displays even of badly ill-formed
documents, HTML browser makers have made their own lives very difficult

Neither of these observations supports a blanket ban on error recovery by
XML processors.

Tim and others have, in the meantime, conceded that some applications
can usefully attempt error recovery, and hope to salvage the Draconian
Rule by suggesting that such applications should use programs which
aren't 'XML procesors' in the strict sense.  This amounts to saying
"implementors can pick and choose which parts of XML to implement, and
can keep themselves blameless even when flouting basic requirements of
the spec, if only they call themselves XML Handlers or some other name
instead of 'XML processors'".  I cannot think of a worse approach to
the problem of ensuring uniform error reporting by XML software.

Whether it's possible to prevent vendors from attempting to compete on
the basis of the quality of their error recovery, I don't know.  I doubt
it.  I also don't see why it's necessary to prevent it.  It's not the
error recovery in HTML browsers that has led HTML to its current state,
but the *silence* of that error recovery.  We complain that most authors
validate by looking at the document onscreen -- what else do we want?  I
do that myself, in SGML.  Yes, I do check the return code from the
parser, but I also check to see that everything looks all right -- if it
doesn't, the validity of the document is deceptively hiding errors in
the tagging or in the style sheet.  The only thing wrong with checking
by visual inspection is that in most HTML browsers it's not a sufficient
check.  An author who does want to find errors can't do so with the
software at hand, because the browser won't report them.

So I agree with whoever it was who said that the real problem is the
absence of an error-reporting mode in HTML browsers.

If this is true, then what we need to do is to ensure that XML
processors *always* allow the user to request error reports, even if the
software recovers from the errors in question.  That way, the user who
says "program X displays my data all right, why don't you?" can be told
"look, even program X says your document is ill-formed: look at it with
error-checking turned on!"

As it happens, the xml-lang spec already requires this.  I don't think
it can realistically or usefully require more, except perhaps that it
should also explicitly require that the processor notify any down-stream
app, as well as the 'end' user if any.  I don't think it should require
less.

If we want the culture of XML usage to differ from that of HTML, we need
to ensure that implementors pay attention to the requirement that they
report reportable errors unless the user says not to.  We can do that by
complaining unmercifully about any implementation that doesn't provide
error reporting, and by pointing out -- correctly -- that it's not a
conforming implementation of XML.

-C. M. Sperberg-McQueen

Received on Monday, 28 April 1997 05:05:07 UTC