- From: <bugzilla@wiggum.w3.org>
- Date: Fri, 14 Aug 2009 02:08:12 +0000
- To: public-qt-comments@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=7271
Summary: XQuery Full Text 1.0 grammar is not LL(k) because of
"case" and "with" use
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Full Text 1.0
AssignedTo: jim.melton@acm.org
ReportedBy: nikolay.ognyanov@gmail.com
QAContact: public-qt-comments@w3.org
Reuse of "case" in TypeSwitchExpr and FTMatchOption makes the Full Text
grammar non LL(k) for any fixed k. Consider for example the following two
fragments of a TypeswitchExpr :
case a return b ftcontains "c" case sensitive
case a return b ftcontains "c"
case sensitive return ExprSingle
For the grammar to be LL(k), it must be possible to choose correct alternative
for FTSelection with k symbols of lookahead when looking at the fragment
b ftcontains "c" case sensitive
and beyond. At first glance the lookahead "return" solves easily the problem.
Besides CaseClause though "return" also appears in FLWORExpr (and TransformExpr
if Update is added to the grammar) where a fragment
b ftcontains "c" case sensitive return ExprSingle
can appear in trailing position. Choice of correct alternative does depend on
whether parsing happens within typeswitch or FLWOR. Therefore it is only
possible to distinguish between the alternatives in consideration by looking
beyond "return ExprSingle" (e.g. for typeswitch default clause). This can not
be done with any fixed amount of lookahead due to the recursive nature of
ExprSingle.
Analysis of problems caused by "case sensitive" applies also to "case
insensitive".
If Full Text is combined in the same grammar with Update then grammar becomes
non LL(k) also due to reuse of "with" in ReplaceExpr and FTMatchOption. For
example based on syntax only, the following are valid expressions :
replace node a ftcontains "b" with wildcards
replace node a ftcontains "b" with wildcards with something
It is very easy here to decide between the 2 alternatives for parsing of the
fragment
a ftcontains "b" with wildcards
by fixed amount of lookahead (assuming that at most 1 FTWildCardOption per
FTPrimaryWithOptions is allowed). The logic needed however only applies if the
fragment is embedded in ReplaceExpr and fails otherwise. On the other hand the
logic which has to be applied outside ReplaceExpr fails within it. The problem
with LL(k) then is that no amount of lookahead can resolve whether the fragment
is within or out of ReplaceExpr.
Analysys of problems caused by "with wildcards" applies also to "with stemming"
and "with thesaurus". It does not apply to "with stop words" because "stop
words" by itself is not a valid expression.
Regards
Nikolay
--
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
Received on Friday, 14 August 2009 02:08:22 UTC