Re: Betting our lives on error handling

Len Bullard <cbullard@hiwaay.net> writes:
> On the other hand, when a human is on the receiving end and knows 
> SGML, it is an easy class of errors to debug.  Just like C, after the 
> first bug or so, most of it goes away and it comes down to a simple 
> class of syntax anomalies usually due to sloppy input in the 
> ASCII editor.  Most SGML editors I've used when a competent 
> DTD designer created the schema are remarkably good at keeping 
> the markup at the level of precision required by the next handling 
> application. 

[No executive summary. These are simply some Friday,
before-I-give-up-and-go-home-for-the-weekend ramblings. Ignore if you're
still caffeine-burning busy...]

I'm now going to ask some rhetorical questions that I hope indicate that
the whole issue of application error handling is beyond the scope of a
syntax specification for a markup meta-language.

The whole issue of good DTD design in XML brings up two classes of errors
related to XML documents with DTDs. What happens when:

  Q1. the error occurs ten lines into the DTD, but the instance is WF?
   
  Q2. the error occurs in the document instance, after the DTD 
      has passed muster?

which brings up another question:      

  Q3. if the answer is to treat the instance in Q1 as WF, how does the
      processor communicate to the application the necessary information
      to ignore the DTD and only display the instance? Or do we shove the
      DTD in the user's face too? Note that absent the DTD, we probably
      can't apply the style sheet, at least reliably. Raw text of DTD
      in user's face. Sour looks. Or they don't get the DTD, if the
      application (minus the XML processor) can differentiate the two.
      My impression of an XML application is that it isn't doing any
      parsing itself, so it isn't going to know where the DTD ends and
      the instance begins.

The application (assuming this is being passed over an API) must now:

  A1. receive part of a DTD that makes sense, followed by streaming
      text of a broken DTD, followed by the instance, and must figure
      out where the instance begins. Can we still determine that the
      instance is WF if the DTD is munged? Is the DTD displayed to the
      user? The whole DTD? The understood part of the DTD? What kind of
      error notification could indicate to a user (who doesn't know what
      a DTD is) why the problem occurred?
      
  A2. Does the application now ignore the DTD, since the processor has
      managed to handle it just fine? That is, we have a streaming input
      that has passed the DTD information across the API, and now an error
      has occurred in the document content. Upon encountering the error,
      does having the DTD or not make any difference on what the user
      sees? Maybe the user would have been able to make sense of the 
      raw, broken text if given access to the DTD.
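
To make the boundary question in Q3 and A1 concrete, here is a minimal
Python sketch, assuming a processor that reports the start and end of the
document type declaration as events. It uses the standard library's
xml.parsers.expat bindings, which postdate this discussion and do not
validate. The name classify_error and its three return strings are
hypothetical, chosen only for illustration. It shows only that an
application could learn which side of the DTD boundary an error fell on;
it says nothing about what the user should then see.

```python
import xml.parsers.expat


def classify_error(document: str) -> str:
    """Return 'ok', 'dtd-error', or 'instance-error' (hypothetical labels)."""
    # True only between the start and end of the DOCTYPE declaration.
    state = {"in_dtd": False}

    def start_doctype(name, system_id, public_id, has_internal_subset):
        state["in_dtd"] = True

    def end_doctype():
        state["in_dtd"] = False

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = start_doctype
    parser.EndDoctypeDeclHandler = end_doctype
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        # Anything that fails outside the DOCTYPE is lumped together
        # as an instance error in this sketch.
        return "dtd-error" if state["in_dtd"] else "instance-error"
    return "ok"


good = "<!DOCTYPE a [<!ELEMENT a (#PCDATA)>]><a>hi</a>"
bad_dtd = "<!DOCTYPE a [<!ELEMENT a (#PCDATA>]><a>hi</a>"
bad_instance = "<!DOCTYPE a [<!ELEMENT a (#PCDATA)>]><a>hi</b>"

for doc in (good, bad_dtd, bad_instance):
    print(classify_error(doc))
```

Even with that much information in hand, every question above about what
to display, and to whom, is still open.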

As I said, I'm not asking for answers, only posing questions whose answers
I think are better left out of a syntax specification. If we begin
to define behaviors for these types of errors, where do we stop? 

> Error handling requirements are consumer-based.

Absolutely, and should be left to application developers who know the
needs of their specific implementation. I agree with Todd: who knows
what implementations will be using XML five years from now? Nuclear
reactor or air traffic control documentation? Sure, why not? And if not,
why not?

Have a good weekend all.

Murray

...........................................................................
Murray Altheim, SGML Grease Monkey                    <altheim@eng.sun.com>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, 2550 Garcia Ave., MS UMPK17-102, Menlo Park, CA 94043 USA
         "Give a monkey the tools and he'll build a typewriter."

Received on Friday, 9 May 1997 21:38:57 UTC