Re: How is ambiguity defined?

> On 5 Jan 2022, at 12:43 PM, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> wrote:
> 
> 
> Approach C:  EBNF (derivation by expressions)
> 
> We could say there are obviously two ways to derive the sentence in
> the EBNF:
> 
>  1 S
>  2 'a'*
>  3 ''
> 
> and another is 
> 
>  1 S
>  2 'b'*
>  3 ''

These should probably be different:

    1 S
    2 'a'* | 'b'*
    3 'a'*
    4 ''

and similarly for the other one.
    
These derivations above appear to rely on a rule that we can replace 
an expression E in a sentential form with another expression which 
recognizes some non-empty subset of L(E).  But given that rule, 
since L('a'* | 'b'*) includes the empty string, we could simply write:

    1 S
    2 'a'* | 'b'*
    3 ''

Rewriting them in this way seems to show that derivations in this 
style really don’t resemble parse trees the way that derivations using a 
BNF grammar do.
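
A rough Python sketch of the replacement rule described above may make
the point concrete.  It assumes that each EBNF expression can be stood
in for by a Python regular expression and that languages are compared
only on strings up to a small length; the names language and
is_valid_step are hypothetical and come from no ixml specification.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=3):
    """Strings over alphabet, up to max_len long, that pattern matches in full."""
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return {w for w in words if re.fullmatch(pattern, w)}

def is_valid_step(from_pattern, to_pattern):
    """The rule as described: to_pattern must recognize a non-empty
    subset of L(from_pattern).  Checked only up to max_len (an assumption)."""
    source, target = language(from_pattern), language(to_pattern)
    return bool(target) and target <= source

S = r"a*|b*"                     # stands in for 'a'* | 'b'*
print(is_valid_step(S, r"a*"))   # the step from line 2 to line 3 in the longer derivation
print(is_valid_step(S, r""))     # the direct step to the empty string
```

Under these assumptions both steps pass the check, so the rule by
itself does not force a derivation to pass through either alternative.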

One possible solution that seems relatively clean would be to say
that for purposes of ixml, ambiguity is production of two or more 
different XML outputs (different after canonicalization, if we have 
to specify).  Detecting that will not necessarily be easy or cheap, 
since multiple raw parse trees may turn into the same XML.  And
since it’s not easy or cheap, detecting ambiguity maybe needs to be
downgraded to a SHOULD or MAY.

It also ignores the fact that an ambiguous grammar in which all the
ambiguities involve whitespace I really don’t care about will still cause
unnecessary work for a parser, so I probably do want to hear about 
those ambiguities.
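
The output-based definition can be sketched in Python as follows.  The
tree representation is an assumption made for illustration: a node is
a (name, mark, children) tuple, a mark of "-" means the node is
dropped from the output while its children are kept, and terminals are
plain strings.  The grammar S = A | B with nonterminals A and B is
likewise hypothetical, and canonicalization is reduced to string
comparison.

```python
def to_xml(node):
    """Serialize a raw parse tree; nodes marked '-' contribute only their children."""
    if isinstance(node, str):
        return node  # terminal text (no XML escaping in this sketch)
    name, mark, children = node
    inner = "".join(to_xml(child) for child in children)
    return inner if mark == "-" else f"<{name}>{inner}</{name}>"

def is_output_ambiguous(raw_trees):
    """True if the raw parse trees serialize to two or more different outputs."""
    return len({to_xml(tree) for tree in raw_trees}) > 1

# Two raw parses of the empty input against S = A | B, with A and B hidden.
hidden = [("S", "^", [("A", "-", [])]), ("S", "^", [("B", "-", [])])]
# The same two parses with A and B kept as elements.
visible = [("S", "^", [("A", "^", [])]), ("S", "^", [("B", "^", [])])]

print(is_output_ambiguous(hidden))   # both trees serialize to the same string
print(is_output_ambiguous(visible))  # the two serializations differ
```

Note that the check needs every raw parse tree before it can answer,
which is where the cost mentioned above comes from.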

So another possible solution would be to say that if the processor
detects more than one way to parse the input using the grammar,
it may report ambiguity.  Since parsers are allowed to rewrite the
grammar in any way that preserves the form of the XML output, the
result is that in any given case, some conforming parsers may detect
ambiguity where others do not.
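
A small Python sketch of how two conforming processors could disagree.
It assumes, purely for illustration, that one processor uses
S = 'a'* | 'b'* as written and another has rewritten it to
S = 'a'+ | 'b'*, which accepts the same strings; the alternatives are
modelled as Python regular expressions and the function names are
hypothetical.

```python
import re

# Each grammar is a list of alternatives for S, written as regexes.
ORIGINAL = [r"a*", r"b*"]    # S = 'a'* | 'b'*
REWRITTEN = [r"a+", r"b*"]   # S = 'a'+ | 'b'*  (same language)

def parse_count(alternatives, text):
    """Count the alternatives of S that match the whole of text."""
    return sum(1 for pattern in alternatives if re.fullmatch(pattern, text))

def reports_ambiguity(alternatives, text):
    """In this sketch a processor reports ambiguity when it sees more than one parse."""
    return parse_count(alternatives, text) > 1

print(reports_ambiguity(ORIGINAL, ""))   # the empty string matches both alternatives
print(reports_ambiguity(REWRITTEN, ""))  # only 'b'* matches the empty string here
```

Both grammars accept the empty string, but only the processor using
the original form sees two parses for it.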

I continue to dislike both of these solutions, and every other
solution I have thought of.

Michael

Received on Wednesday, 5 January 2022 20:14:52 UTC