[Bug 3737] [FT] EBNF snippets confusing

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3737





------- Comment #1 from doerre@de.ibm.com  2006-10-13 12:58 -------
I agree that we have a problem in the exposition of the grammar and description
of the constructs in Section 3, but I do not think that this is a problem of
the EBNF, nor that it could be fixed by "tweaking" the EBNF.
When XQuery-FT adopted the TeXQuery proposal, we had simple grammar rules where
you could explain the language constructs one-to-one with the grammar rules. To adapt
to the style of the XQuery grammar we had to change that. Now we have a grammar
that is LL(k) parsable and that also reflects the operator precedences in the
grammar rules, like XQuery base. Thus a parser can be built automatically from
the grammar without further ado.
What we failed to change is the way we describe the language constructs in
terms of the grammar rules. For instance, in the Spec we talk about FTAnd as if
it were an &&-expression, meaning an FTSelection that is composed of the
&& operator plus operand FTSelections. But that is not the case. FTAnd now has
this grammar rule:

FTAnd ::= FTMildnot ( "&&" FTMildnot )*

Hence, it is an abstract grammar symbol that can expand to an &&-expression,
but need not: with zero repetitions it expands to a single FTMildnot. Thus,
when explaining the && operation we should not confuse the &&-expression with
the FTAnd grammar symbol, as in 3.1.3:
"FTAnd finds matches that satisfy both of the selection criteria."
The same applies to many other places in Section 3. For instance, all the
proximities (FTOrder, FTDistance, FTWindow, FTContent, FTScope) are explained
as if the grammar symbol (for instance FTOrderedIndicator) represents a full
FTSelection involving that operator. But the grammar symbol in these cases only
expands to the operator itself ("ordered" in this case). 
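
To make the FTAnd point concrete, here is a minimal sketch of a
recursive-descent parser for that one rule. It is only an illustration, not
anything from the Spec: the tokenizer, the function names and the tuple-shaped
result nodes are all hypothetical, and FTMildnot is reduced to a single word.

```python
import re


def tokenize(text):
    # Hypothetical toy tokenizer: words and the "&&" operator only.
    return re.findall(r"&&|\w+", text)


def parse_ft_mildnot(tokens, pos):
    # Stand-in for FTMildnot: in this toy grammar it is just a single word.
    if pos >= len(tokens) or tokens[pos] == "&&":
        raise SyntaxError("expected a word at token %d" % pos)
    return ("word", tokens[pos]), pos + 1


def parse_ft_and(tokens, pos=0):
    # FTAnd ::= FTMildnot ( "&&" FTMildnot )*
    first, pos = parse_ft_mildnot(tokens, pos)
    operands = [first]
    while pos < len(tokens) and tokens[pos] == "&&":
        operand, pos = parse_ft_mildnot(tokens, pos + 1)
        operands.append(operand)
    if len(operands) == 1:
        # Zero repetitions: the FTAnd symbol matched, but the input
        # contains no &&-expression at all.
        return first, pos
    return ("and", operands), pos
```

Parsing "dog" goes through parse_ft_and and yields just the word node, while
parsing "dog && cat" yields an "and" node with two operands. Both inputs match
the grammar symbol FTAnd, but only the second one is an &&-expression.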

In the XQuery spec there is the same mismatch between grammar symbols and the
language constructs you would like to explain, but the editors there do a good
job of keeping those apart where necessary, e.g. they talk about an
"or-expression" to describe the expression involving the logical "or" operator
and do not confuse that with the grammar symbol "OrExpr". Also in that Spec the 
grammar of expressions is explained by first giving the top-level rules Expr
and ExprSingle, but then introducing the different kinds of expressions
bottom-up starting with all the PrimaryExpressions. There is also no obvious
relation between Expr and PrimaryExpr at first. I don't think this is a problem
for our Spec either. It is ok to start out by talking about FTSelections in
general and then gradually introduce the different kinds bottom-up. Of course,
it is not ideal from a pedagogical point of view that, when explaining "&&", we
use a grammar rule containing an FTMildnot that is totally unmotivated at that
point. But that is how the grammar is built, in order to encode the operator
precedences. The same is actually true for XQuery
base, e.g. when introducing Intersection the grammar rule mentions
InstanceOfExpr!

But we need to change our exposition so that it does not confuse the grammar
symbols of that particular LL(k) grammar with the general language constructs.

/Jochen

Received on Friday, 13 October 2006 12:58:23 UTC