Re: The semantics of the disambiguation constructs

I see two types of ambiguity, lexical and grammatical, but both show up as
an ambiguous parse in ixml.
If we can find one technical solution for both, marvellous, but I am
inclined to think that we need two different technical solutions to
address them.

Whitespace ambiguity and "eat the whole token, please" (do not try all
possible splits: i+nt, in+t, int) are the same kind of lexical ambiguity,
the kind normally handled by a greedy tokenizer.
My "not charset" implementation takes a rule such as:
  name = [L]+, ![L].
and prevents the Earley parser from completing the rule until there are no
more letters to accumulate.
I.e. it becomes greedy.
I think everyone would love to define whitespace as:
  s = [' ';#10]+, ![' ';#10].
I think this would resolve a lot of the whitespace ambiguity problems that
we experience.
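
To illustrate the effect of the lookahead, here is a minimal Python sketch.
It is not the Earley implementation described above: the function name
splits and its greedy flag are made up for this illustration, and
str.isalpha() is assumed to be a close enough stand-in for [L]. It lists
every way a run of letters can be cut into names, with and without the
"not charset" check.

```python
def splits(text, greedy):
    """Return every way to cut `text` into one or more names (runs of letters)."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if not head.isalpha():
            break  # a name may only contain letters
        # Negative lookahead (the "![L]" part): the name may only complete
        # when the next character is not a letter.
        if greedy and end < len(text) and text[end].isalpha():
            continue
        for rest in splits(text[end:], greedy):
            results.append([head] + rest)
    return results


print(splits("int", greedy=False))
# [['i', 'n', 't'], ['i', 'nt'], ['in', 't'], ['int']]
print(splits("int", greedy=True))
# [['int']]
```

Without the lookahead every split is a valid reading; with it only the
longest match survives, which is what a greedy tokenizer would give.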

Having a "not rule" does not make sense to me. I cannot see how to
implement it either.

Then we have grammar ambiguity, for example the dangling else problem.
And then we have a new grammar ambiguity that a tokenizer would otherwise
handle, i.e. reserved keywords translate into a unique token, whereas all
other strings translate into, for example, a variable_name token.
John dealt with this problem by creating the subtraction operator.

> The 'subtraction' operator I implemented (A¬B)  was perhaps more
> restricted and designed to solve a specific problem in 'reserved keywords'
> in the XPath grammars.
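
The general idea behind subtraction for reserved keywords can be sketched
in Python as below. This is only an illustration of the concept, not
John's implementation (which the quote says was more restricted): the
KEYWORDS set, the function name readings and its subtract flag are all
hypothetical.

```python
KEYWORDS = {"if", "then", "else"}  # hypothetical reserved words


def readings(word, subtract):
    """Return the token kinds that `word` can be read as."""
    kinds = []
    if word in KEYWORDS:
        kinds.append("keyword")
    # Without subtraction, any run of letters is also a variable_name.
    # With subtraction (variable_name = letters minus keywords), reserved
    # words are excluded from that reading.
    if word.isalpha() and not (subtract and word in KEYWORDS):
        kinds.append("variable_name")
    return kinds


print(readings("if", subtract=False))   # ['keyword', 'variable_name']
print(readings("if", subtract=True))    # ['keyword']
print(readings("iffy", subtract=True))  # ['variable_name']
```

Two readings for the same word is what surfaces as an ambiguous parse;
subtraction leaves exactly one.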

I instead implement a cost value for each rule. Normally all rules have
the same cost, and the resulting tree is a random choice among the
ambiguous matching rules.
But if a rule has a higher cost than the others, it will not be picked.
Thus you give the catch-all rule a higher cost.

John's example above would be written in my tool as:
A =< ....

Where the < directly after the = means this choice has a higher cost: pick
another matching rule with a lower cost if one exists.
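
The selection step can be sketched in Python as follows. This is a
minimal illustration under assumptions, not the code of my tool: the
function name pick, the (rule name, cost) pairs and the concrete cost
values 0 and 1 are invented for the example.

```python
def pick(matches):
    """Keep only the lowest-cost matches.

    `matches` is a list of (rule_name, cost) pairs for rules that all
    match the same input span.
    """
    lowest = min(cost for _, cost in matches)
    return [name for name, cost in matches if cost == lowest]


# The catch-all rule carries the higher cost, so the keyword rule wins.
print(pick([("keyword", 0), ("variable_name", 1)]))  # ['keyword']

# With equal costs both remain, i.e. the parse is still ambiguous.
print(pick([("keyword", 0), ("variable_name", 0)]))  # ['keyword', 'variable_name']
```

If the higher-cost rule is the only match, it is still picked, since it
is then also the lowest-cost one.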

So, two basic problems that give rise to ixml ambiguity warnings:
lexical (solved with greed)
grammar (solved with cost or subtraction)

//Fredrik

Received on Tuesday, 3 March 2026 12:27:33 UTC