- From: C M Sperberg-McQueen <cmsmcq@uic.edu>
- Date: Mon, 7 Dec 1998 12:58:39 -0600
- To: db@Eng.Sun.COM
- CC: xml-editor@w3.org, cmsmcq@uic.edu
>Date: Wed, 02 Dec 1998 14:12:46 -0800
>From: David Brownell <db@Eng.Sun.COM>

Thanks for your note. The following reply should be taken as
reflecting my personal views, not those of any WG, editorial team,
organization, project, or institution.

>I've noticed when testing against the spec that the definition of
>an "error" in section 1 is particularly useless when it comes to
>defining common behaviors:
>
>    Error: a violation of the rules of this specification;
>    results are undefined. Conforming software may detect
>    and report an error and may recover from it.
>
>The problem is that this definition promotes wide variations in
>handling errors: it permits parsers either to ignore the "error"
>entirely, or to treat it as a fatal error.

Correct; we expect parser developers to compete with each other in
part by providing the error behavior best suited to a particular
field of application. In some contexts that will mean dogged
perseverance, and in others it may mean crash-and-burn-quick, to
avoid burning cycles unnecessarily.

>Minimally, I'd suggest that this be tightened to require error
>reporting "at user option" (as for validity errors).

See below.

>It'd also be advantageous to preclude treating errors as "fatal"
>unless such treatment is specifically allowed in the spec. For
>example, in 4.3.3 it might be appropriate to permit processors
>to optionally report fatal errors when the encoding declaration
>is sufficiently broken.

I do not believe that errors in the encoding declaration (or in
external encoding information) are always detectable, or
distinguishable from other errors. If data arrives in ISO 8859-7
but is tagged as ISO 8859-1, it will be very difficult for any
software or hardware system to detect the error. If it arrives
tagged as some variant of ISO 8859, but in fact it's EBCDIC, how
will the system distinguish the actual error (in the encoding
declaration) from other errors (corrupt data, errors in the use of
delimiters, inclusion of illegal characters) which might produce
similar results?

>Why was this definition made so weak and fuzzy? Was it just that
>there wasn't much implementation experience on which to draw? If
>so, I think that there's plenty of experience now!

There was some diversity of opinion in the WG, and I believe that
diversity is reflected among the editors as well, but at least some
members of the WG believed that at least some things which are (or
should be) defined as errors in the spec are not necessarily
detectable, or are detectable only at the cost of unacceptably
limiting the possible implementation strategies. So we distinguished
errors which an implementation is required to detect from errors
which it is not required to detect, but which will generally produce
unexpected and probably undesired (i.e. incorrect) results.

Removing that distinction would have the drawback, in some cases, of
requiring software to do things which software is logically
incapable of doing -- or else of defining undetectable errors as
non-errors (e.g. saying that when an ASCII file contains an EBCDIC
encoding declaration, it is "not an error").

-C. M. Sperberg-McQueen
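
P.S. To make the "at user option" point concrete: in a SAX-style
interface, the choice of error behavior already rests with the
application, through the error handler it registers. A minimal
sketch in Java; the policy shown is one possibility among many, not
a recommendation:

    import org.xml.sax.ErrorHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

    // One possible policy: report everything, keep going after
    // recoverable errors, and let well-formedness violations
    // terminate the parse.
    public class ReportingErrorHandler implements ErrorHandler {
        public void warning(SAXParseException e) throws SAXException {
            System.err.println("warning: " + e.getMessage());
        }
        public void error(SAXParseException e) throws SAXException {
            System.err.println("error (recovering): " + e.getMessage());
        }
        public void fatalError(SAXParseException e) throws SAXException {
            throw e;    // give up on well-formedness violations
        }
    }

An application bent on dogged perseverance logs in error() and
carries on; one that prefers crash-and-burn-quick throws there as
well. Either is conforming, which is exactly the latitude the
definition was meant to leave.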
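
P.P.S. On detectability: the auto-detection described in Appendix F
of the Recommendation can tell an EBCDIC entity from an ASCII-family
one, because the four bytes encoding "<?xm" differ, but no comparable
trick can tell ISO 8859-1 from ISO 8859-7, since every byte sequence
is legal in both. A hypothetical illustration (the class and method
names are mine, not from any parser):

    // Guess the encoding family of an XML entity from its first
    // four bytes, following Appendix F of XML 1.0. (The UTF-16
    // byte-order-mark cases are omitted for brevity.)
    public class EncodingSniffer {
        public static String sniff(byte[] b) {
            if (b.length < 4)
                return "too short to tell";
            // "<?xm" in ASCII-family encodings: 3C 3F 78 6D
            if (b[0] == 0x3C && b[1] == 0x3F
                    && b[2] == 0x78 && b[3] == 0x6D)
                return "ASCII family (UTF-8, ISO 8859-x, ...)";
            // "<?xm" in EBCDIC: 4C 6F A7 94
            if ((b[0] & 0xFF) == 0x4C && (b[1] & 0xFF) == 0x6F
                    && (b[2] & 0xFF) == 0xA7 && (b[3] & 0xFF) == 0x94)
                return "EBCDIC family";
            // Nothing at this level can separate 8859-1 from
            // 8859-7: every byte sequence is legal in both, so a
            // wrong label is simply undetectable here.
            return "unknown; trust the label and hope";
        }
    }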
Received on Monday, 7 December 1998 13:59:12 UTC