
Re: resolving ambiguity

From: Tom Hillman <tom@expertml.com>
Date: Wed, 14 Apr 2021 12:52:42 +0100
To: public-ixml@w3.org, "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Message-ID: <cc0d6c7f-7ef4-4dad-847d-21460ca33147@Spark>
In fairness, I suggested it nervously, and only as part of an approach to the input consumption discussion; I would prefer that we leave the choice open.

I think the answer to your "first rule wins" question is <S>(<a>a </a>)</S>. That is probably because recursing into each nonterminal before handling its siblings is second nature to me: it mirrors the document order of XML trees ;)
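To make that reading concrete, here is a minimal sketch in Python of one way to interpret "first rule wins": try the alternatives of each nonterminal in the order written, and recurse fully into a nonterminal before moving on to its right-hand siblings. It is a plain backtracking recursive descent, not Earley, and it would not terminate on a left-recursive grammar. Everything named here (Lit, GRAMMAR_1, GRAMMAR_2, HIDDEN, parse, parse_seq, all_parses) is hypothetical and only for illustration. Two further assumptions: "s?" is treated as "s; empty" with the non-empty alternative tried first, and "s" is treated as hidden so that the output matches the trees as written in the quoted message.

```python
from typing import Dict, Iterator, List, NamedTuple, Sequence, Set, Tuple, Union


class Lit(NamedTuple):
    """A literal terminal; plain strings in the grammar are nonterminal names."""
    text: str


Symbol = Union[Lit, str]
Grammar = Dict[str, List[List[Symbol]]]

# First grammar from the quoted message: alternatives are listed in rule order.
GRAMMAR_1: Grammar = {
    "S": [["Left", "Right"]],
    "Left": [[Lit("a"), Lit("b")], [Lit("a")]],
    "Right": [[Lit("b"), Lit("c")], [Lit("c")]],
}

# Second grammar. Assumptions: "s?" is encoded as "s; empty" (present first),
# and "#20+" is encoded as right recursion over a single space.
GRAMMAR_2: Grammar = {
    "S": [[Lit("("), "a", "s?", Lit(")")]],
    "a": [["s?", Lit("a"), "s?"]],
    "s?": [["s"], []],
    "s": [[Lit(" "), "s"], [Lit(" ")]],
}
HIDDEN: Set[str] = {"s", "s?"}  # assumption: no <s> elements in the output


def parse(grammar: Grammar, hidden: Set[str], symbol: Symbol,
          text: str, pos: int) -> Iterator[Tuple[str, int]]:
    """Yield (serialized tree, end position) for every way symbol matches at pos."""
    if isinstance(symbol, Lit):
        if text.startswith(symbol.text, pos):
            yield symbol.text, pos + len(symbol.text)
        return
    for alternative in grammar[symbol]:  # earlier alternatives are tried first
        for body, end in parse_seq(grammar, hidden, alternative, text, pos):
            yield (body if symbol in hidden else f"<{symbol}>{body}</{symbol}>"), end


def parse_seq(grammar: Grammar, hidden: Set[str], symbols: Sequence[Symbol],
              text: str, pos: int) -> Iterator[Tuple[str, int]]:
    """Match a sequence left to right, exhausting each symbol before its siblings."""
    if not symbols:
        yield "", pos
        return
    for head, mid in parse(grammar, hidden, symbols[0], text, pos):
        for tail, end in parse_seq(grammar, hidden, symbols[1:], text, mid):
            yield head + tail, end


def all_parses(grammar: Grammar, hidden: Set[str], start: str, text: str) -> List[str]:
    """All complete parses of text, in depth-first, first-alternative-first order."""
    return [tree for tree, end in parse(grammar, hidden, start, text, 0)
            if end == len(text)]


if __name__ == "__main__":
    for tree in all_parses(GRAMMAR_1, set(), "S", "abc"):
        print(tree)
    for tree in all_parses(GRAMMAR_2, HIDDEN, "S", "(a )"):
        print(tree)
```

Under those assumptions the first tree found for "abc" is the one with <Left>ab</Left>, and the first tree found for "(a )" is <S>(<a>a </a>)</S>; the other parse of each input comes second. That only shows what this particular ordering does. It does not show that such an ordering is always well defined, or that a general algorithm can be made to follow it.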

Tom

_________________
Tomos Hillman
eXpertML Ltd
+44 7793 242058
On 14 Apr 2021, 01:42 +0100, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>, wrote:
> On today’s call, Tom suggested that the spec might usefully say which tree should be returned, in case of ambiguity.
>
> I am nervous about this suggestion; I think that the choice of tree is better left open.
>
> "First rule wins" seems plausible enough when it applies, but I have not thought enough about it to know whether it always applies, whether it always provides a unique solution, or how to extend the Earley algorithm (or any of the other general algorithms) to keep track of such things in a way that would guarantee the selection of the correct tree. Consider this grammar:
>
> S: Left, Right.
> Left: "a", "b"; "a".
> Right: "b", "c"; "c".
>
> If I'm not mistaken, this has two parse trees:
>
> <S><Left>ab</Left><Right>c</Right></S>
> <S><Left>a</Left><Right>bc</Right></S>
>
> The first tree uses the first RHS for Left and the second RHS for Right. The second tree is the other way round. Does the "first rule wins" principle give an answer here? I suppose we can say that it's the choice of rules for 'Left' that matters, because 'Left' precedes 'Right'. And I guess that's what you meant.
>
> But consider:
>
> S: '(', a, s?, ')'.
> a: s?, 'a', s?.
> s: #20+.
>
> and the input string "(a )". Which of the parses is preferred by the first-rule-wins rule?
>
> <S>(<a>a </a>)</S>
> <S>(<a>a</a> )</S>
>
>
> Michael
>
> ********************************************
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> cmsmcq@blackmesatech.com
> http://www.blackmesatech.com
> ********************************************
>
>
Received on Wednesday, 14 April 2021 11:53:05 UTC
