Re: Is this ambiguous?

Norm Tovey-Walsh writes:

> Consider this grammar:

> list: word + -',' .
> word: c, v, c ; c, v, v .
> -c: ["bcdfghjklmnpqrstvwxyz"] .
> -v: ["aeiouy" ].

> and this input: “hey,bee”.

> By one reckoning, that’s ambiguous: “hey” can be either cvc or cvv
> because I’ve identified “y” as both a consonant and a vowel.

> But “c” and “v” are both elided from the output, so the generated XML
> is identical for both parses. By that reckoning, it isn’t ambiguous.

Nice example.

> I expect our intent is that it *isn’t* ambiguous…but I thought I’d
> check.

For what it's worth ...

If we as a group have an intent here, I don't know what it is.  Issue 26
asks how we wish to define "ambiguity", but the upshot of the discussion
starting from your message of 5 January [1] seems to me to have been
only that some members of the CG do not wish to define it; without a
definition of what counts as ambiguity, I don't think we can have a
coherent intent.

If I had to predict what the CG will end up doing, my money would be on
leaving the effective definition of ambiguity implementation-defined and
specifying that processors MAY report ambiguity, rather than MUST, or
possibly specifying that IF processors detect ambiguity they MUST report
it (modulo user option to suppress the ambiguity flag in the output),
but not requiring that they detect ambiguity whenever it exists.  (I
predict there will be CG members who would like to require the detection
of ambiguity, but without a crisp definition of ambiguity that
requirement lacks teeth.)

One difficulty is that we have already seen that different
implementations of ixml use different underlying parsing methods, even
when the implementors all say they are using Earley parsing:

- Implementations that parse using the ixml grammar directly and those
  which translate the ixml grammar to BNF for parsing are working with
  different grammars; their raw parse trees will differ and ambiguity in
  one doesn't always mean ambiguity in the other.

- Implementations which translate the ixml grammar to BNF will not
  necessarily use the same translations -- the obvious requirement is
  that the grammar be equivalent and allow the construction of the XML
  abstract syntax tree, and there is more than one BNF that meets those
  criteria.

- I think that some ways of recording parsing results may make it easy
  to see whether there is more than one XML AST for a given sentence,
  but I'm not sure that's true for every possible approach.
  
We don't want to constrain the internals of any implementation, and we
want interoperability, and we want ambiguity to be flagged.  I don't
think all three of those can be combined in their pure form; we are
going to have to weaken one or more of them.

My two cents.

Michael

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

[1] https://lists.w3.org/Archives/Public/public-ixml/2022Jan/0030.html

Received on Sunday, 13 February 2022 23:03:23 UTC