- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Fri, 04 Feb 2022 06:35:29 -0700
- To: Dave Pawson <dave.pawson@gmail.com>
- Cc: Norm Tovey-Walsh <norm@saxonica.com>, ixml <public-ixml@w3.org>
Dave Pawson writes:

> @MSM - do you mean 'in all circumstances'?
> As Norm says, I've met a confirmed error, should I continue (could I
> even continue parsing)
> to the end?

There is a formal and an informal answer.

Informally, I would say: no, of course a parser need not waste time and cycles and memory on i/o of characters that cannot make a difference to the result. Once a parser knows that the input is not a sentence in the language defined by the grammar, it is of course free to return its result, which is that the input is not a sentence in the grammar (or: failed to parse, if you prefer).

The question formulated in issue #24 has only ever been about whether a 'successful parse' (however we define that) has to consume / cover / parse the entire input or not. I do not understand why the answer "yes" should suggest to anyone that the answer has any relevance to the case of non-matching input.

More formally: the current spec (I paraphrase, from memory) describes a conforming ixml processor as being presented with an input grammar and an input string (in some form) and doing one of three things:

- returning an XML representation of a parse tree resulting from parsing the string against the grammar
- informing the user that there is no such parse tree
- failing for some other reason

There have been proposals to remove the third item, but I don't believe we have yet discussed them so I hope it's still there.

Nothing in this description, and nothing in the more detailed description of how ixml and ixml processors work, provides a definition for what it means to "consume input". So the formal answer to the question "do you mean that a processor must consume all the input even in case the input string is not a sentence in the language?" is the question "how would you even know?"

> Doesn't sound like a sensible option from the outside? Would a user
> be interested? In many
> cases the first error compounds later ones etc?
It is quite true that in the output from parsers which attempt to recover from parsing errors, a first error is frequently followed by a flood of other errors (often because the recovery was imperfect). It is also true that the 'first error' -- that is, the location where the parser was first aware that there would be no parse tree -- is sometimes some distance away from the point at which the user finds the typo.

But the ixml spec says nothing about what kind of error recovery a processor should perform, or what kind of diagnostic information a processor is to provide in case the input does not parse. I think that is the correct thing for the spec to say.

> Parse to the end of the input string... unless errors are found? Is
> that a reasonable caveat?

Any process that correctly detects that the string is not a sentence in the language is producing a correct result. If "parsing to the end of the input string" means successively placing each character of the input in a register for examination and doing so even when the correct result of the calculation is already clear, then yes, but it's an observation not a caveat.

Michael

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Friday, 4 February 2022 13:35:52 UTC