Re: Error definition

> My view? It is an error.
> The input.txt file is 'in error'
> Hopefully the author of the processor will say @line 25 etc.

One reason we’re struggling with this topic is, perhaps, that we have
differing views about how a processor might be built.

Let’s look at what a Java compiler does, as a concrete example.
If you feed this program into a Java compiler:

package com.nwalsh;

public class MyProgram {
    public static void main(String[] args) {
        System.out.println("Hello, world")
    }
}

it will dutifully report

/tmp/com/nwalsh/MyProgram.java:5: error: ';' expected
        System.out.println("Hello, world")
                                          ^

This, I assume, would satisfy Dave’s view that the program is “in
error”.

But there’s more going on here than just what the user sees when viewing
the entire compiler as a black box.

Internally, there’s some code for reading files off disk. If that code
failed, if the file didn’t exist or had permissions that prevented the
process from reading it, that would be an exceptional circumstance. The
code would be unable to function and an error would be raised.

Assuming the code was successfully read, it would next be handed to a
parser that would attempt to turn the characters of the program into an
abstract syntax tree (AST). The parts of the compiler that come next,
the optimizer, the byte code generator, etc. don’t want to deal with
words like “public” or “{” delimiters. They want an abstraction that’s
cleared away the syntactic cruft.

The parser is going to report, “Sorry, mate, I couldn’t build an AST. I
got as far as about the end of line five before I reached a point where
I couldn’t match the input. You know what, I could have kept going if
there’d been a “;” there.”

Critically for this discussion, observe that the parser didn’t encounter
any kind of exceptional circumstance. It wasn’t prevented in any way from
completing its function. No error has occurred. The input doesn’t match
the grammar for Java, but that’s a common and completely expected
result. (If you don’t think that’s common, just watch me writing Java.)
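
One way to picture that distinction in code is the rough sketch below.
Every name in it (ParseResult, ToyProcessor) is hypothetical, and the
one-rule “grammar” is a stand-in invented for illustration, not how any
real compiler works. Reading the file throws when it cannot do its job;
parsing always completes and simply returns one of two ordinary results.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: every name here is invented for illustration.
final class ParseResult {
    final boolean matched;
    final int line;         // 1-based line where matching stopped, 0 if matched
    final String expected;  // what would have let the parser go on, or null

    private ParseResult(boolean matched, int line, String expected) {
        this.matched = matched;
        this.line = line;
        this.expected = expected;
    }

    static ParseResult match() {
        return new ParseResult(true, 0, null);
    }

    static ParseResult noMatch(int line, String expected) {
        return new ParseResult(false, line, expected);
    }
}

final class ToyProcessor {
    // Exceptional circumstance: if the file is missing or unreadable, this
    // method cannot perform its function at all, so it throws.
    static String read(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Toy "grammar" (an assumption for this sketch): a line ending in ')'
    // must be followed by ';'. Failing to match is not exceptional; the
    // parser finishes its work and reports what it found.
    static ParseResult parse(String source) {
        String[] lines = source.split("\n");
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].trim().endsWith(")")) {
                return ParseResult.noMatch(i + 1, ";");
            }
        }
        return ParseResult.match();
    }
}
```

Whatever called parse() is free to turn a no-match result into an
“error: ';' expected” message for the user. That is a decision about
reporting, separate from whether the parser itself failed.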

We aren’t going to make the meaning of the word “error” any clearer or
more precise by trying to make it do double duty. Asserting that failing
to find a parse is “an error” reduces the value of the word “error” as a
technical term.

The *user* can still be told that an error occurred. That’s fine.

But the ixml CG is mostly focused on the part of a larger program that
turns input grammars into vxml. Given a valid ixml grammar, discovering
that the input doesn’t match the grammar simply isn’t “an error”. It’s a
common and completely expected result.

One could imagine a Java compiler that would stick the semicolon in and
then hand the input back to the parser to try again. It might get
further this time, until a different grammar-matching dead end, or even
until it succeeds.
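
As a rough sketch of that idea, using a toy one-rule check in place of a
real parser (the class and method names are hypothetical, and no real
Java compiler is claimed to behave this way):

```java
// Hypothetical sketch: the names and the one-rule "grammar" are invented
// for illustration only.
final class RepairingParser {
    // Toy stand-in for a parser: returns the 1-based number of the first
    // line that ends in ')' with no ';' after it, or 0 if every line is fine.
    static int firstDeadEnd(String[] lines) {
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].trim().endsWith(")")) {
                return i + 1;
            }
        }
        return 0;
    }

    // Insert the missing ';' at each dead end and hand the input back to
    // the parser, until it matches. Returns how many repairs were needed.
    static int repairUntilMatch(String[] lines) {
        int repairs = 0;
        int line;
        while ((line = firstDeadEnd(lines)) != 0) {
            lines[line - 1] = lines[line - 1] + ";";
            repairs++;
        }
        return repairs;
    }
}
```

Nothing in that loop is handling an exception. Each dead end is just a
value the parser returned, which is what makes retrying so natural.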

That’s just completely different from an I/O error reading the file.

These kinds of subtle distinctions are useful for making specification
language clear and precise. If you lump everything that could possibly
go wrong under the term “error”, then “error” becomes less meaningful.

Again, this has nothing directly to do with what the user is told by the
program they were running.

To take a completely different example, consider

    System.out.println(Long.MAX_VALUE + 1);

If the answer -9223372036854775808 surprises you, you might say “that’s
an error!” But it isn’t. It simply isn’t. It’s a consequence of how
two’s complement numbers are stored in 64-bit blocks and what happens
when numeric overflow occurs.
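
For contrast, Java also provides Math.addExact, which does treat
overflow as exceptional and throws ArithmeticException. A small sketch
(the class and method names are mine, purely for illustration):

```java
// Illustrative sketch: OverflowSketch and its method names are invented;
// Math.addExact is the standard library call.
final class OverflowSketch {
    // Plain '+' on long wraps around silently on overflow.
    static long wrappingIncrement(long n) {
        return n + 1;
    }

    // Math.addExact throws ArithmeticException if the result overflows.
    static long checkedIncrement(long n) {
        return Math.addExact(n, 1L);
    }
}
```

Same inputs, same arithmetic; whether overflow counts as “an error” is
decided by the definition of the operation, not by the reader’s
surprise.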

Hoping that helps.

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Monday, 7 February 2022 07:43:39 UTC