Re: XQuery grammar issues

Hi Bas.  Thanks for the really great feedback.  We very much appreciate it.

Just for your background information: The BNF is produced from a grammar
definition in XML.  From that we produce the BNF in the document, and also
produce JavaCC and YACC/LEX test parsers (though Unicode is a significant
problem with LEX, of course).  In the future we may publish the full
grammar XML, in addition to the BNF, though the WGs have not made a
decision on this yet.

> I suggest adding
> XmlComment and XmlProcessingInstruction and also CdataSection to the
> Constructor production.

That sounds reasonable to me at first glance.  We'll either do that, or
state explicitly why we don't want to do it.

> A precedence table is not sufficient for expressions more complicated
> than simple binary operators.

The grammar XML uses the explicit technique.  It was for stylistic reasons
that we used the implicit technique.  We'll take your comment about this
under advisement to decide if we want to change it, or add any missing
information that is not clear from the implicit technique.
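
To illustrate what I mean by the explicit technique, here is a minimal
sketch in Python.  It is not taken from the grammar XML; the function names
and the tiny token set are hypothetical, and it covers only "or", "and",
names and parentheses.  Each precedence level gets its own production, so
no separate precedence table is needed:

```python
import re

# Hypothetical token set: parentheses and simple names ("or"/"and" included).
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z_][A-Za-z0-9_]*)")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

# OrExpr ::= AndExpr ("or" AndExpr)*
def parse_or(tokens):
    left, tokens = parse_and(tokens)
    while tokens and tokens[0] == "or":
        right, tokens = parse_and(tokens[1:])
        left = ("or", left, right)
    return left, tokens

# AndExpr ::= Primary ("and" Primary)*
def parse_and(tokens):
    left, tokens = parse_primary(tokens)
    while tokens and tokens[0] == "and":
        right, tokens = parse_primary(tokens[1:])
        left = ("and", left, right)
    return left, tokens

# Primary ::= Name | "(" OrExpr ")"
def parse_primary(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        tree, rest = parse_or(rest)
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return tree, rest[1:]
    if head in (")", "or", "and"):
        raise SyntaxError(f"unexpected token {head!r}")
    return head, rest

def parse(text):
    tree, rest = parse_or(tokenize(text))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree
```

With this sketch, parse("a or b and c") groups the "and" more tightly than
the "or", purely because AndExpr sits below OrExpr in the productions.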

> Taking the precedence table literally, I
> cannot use an OrExpr in a WhereClause or even as the condition of an
> IfExpr without adding an extra pair of parentheses, making the OrExpr a
> PrimaryExpr.  So this is currently not allowed:
>
> if (foo or bar) then expr1 else expr2

I don't think there's a problem here, but I could well be mistaken.  I need
to go off and study this potential issue in more detail.  I will get back
to you with a detailed response in a few days.  (Both if and where
expressions work fine in the JavaCC test parser.)

> - The ElementContent production allows computed element/attribute
> constructors.  Surely this is a mistake.  It would introduce keywords in
> element content.

Yes, that's a bug.

> - You no longer explicitly mention whitespace in the grammar for XML
> constructs, particularly in ElementConstructor.

Yes, that was a stylistic decision.  We felt that the statement "For
readability, whitespace may be used in expressions even though not
explicitly allowed by the grammar" etc. covered this.  Why doesn't this
apply equally well to the ElementConstructor?

> If it is,
> this would allow the end tag </ foo>, which is not allowed in XML.  Is
> this intentional?

Really good point.  Perhaps we should state where whitespace is *not*
allowed?  (BTW, the whitespace in tokens is explicit in the grammar XML,
and the case you state above is covered.  No, < foo> and </ foo> are not
allowed.)
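
As a rough illustration of the kind of rule that would need stating, here
is a small Python sketch of XML's end-tag shape ('</' Name S? '>'), where
whitespace may precede the closing '>' but not follow the '</'.  The name
pattern is a simplified ASCII-only assumption, not the real QName
production, and END_TAG and is_end_tag are hypothetical names:

```python
import re

# Simplified, ASCII-only stand-in for an NCName (an assumption for brevity).
NCNAME = r"[A-Za-z_][A-Za-z0-9_.-]*"

# '</' QName S? '>' : no whitespace after '</' or around the ':'.
END_TAG = re.compile(rf"</{NCNAME}(?::{NCNAME})?[ \t\r\n]*>")

def is_end_tag(text):
    """Return True if text is a well-formed end tag under the sketch above."""
    return END_TAG.fullmatch(text) is not None
```

Under these assumptions is_end_tag accepts "</foo>" and "</foo >" but
rejects "</ foo>".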

> Also, I presume whitespace is allowed within tokens, such as in "cast"
> "as" (obvious) "child" "::" (because it is allowed in XPath 1.0), but
> not around the ':' in a QName, or between the "&" and the "amp;".

Right.

> In other words, this is completely inconsistent.

Is it the inconsistency you're worried about, or just that it's not specified?
I mean, yes, it is inconsistent, but the inconsistency has a reasonable
grounding in XML.

> - A.3 3rd bullet mentions whitespace after '/' and '//'.  It is
> completely unclear how to use this remark in the given lexical structure
> and grammar.  Unlike with '<', it does not create another token.  Also
> "// div foo" with "div" as an operator is meaningless and does not
> parse.

This paragraph should have been removed, I think.  It is a holdover from
some earlier stuff we were trying.  "/ div foo" would have to be expressed
as "(/) div foo", according to a decision we reached, specified in A.3.1.

> - The Ref and Colon tokens seem not to be used.

Thanks.  We'll remove them in the next draft.

> The SemiColon token
> does not even have a production.

Thanks.  That's a bug in the grammar-XML-to-BNF production that I hadn't
noticed.

> - In element content <?foo foo?> is lexed as ProcessingInstructionStart
> PITarget Char PITarget ProcessingInstructionEnd.  This is not allowed by
> the grammar.

Yes, this needs some work, I think.  Should it use the same state as
attribute content, so that you can use enclosed expressions?

> - A TagQName also allows an initial ':'.  I see no reason to allow this.
> Why not restrict it to NCName (":" NCName)?

There are some open issues as regards the initial ':'.   We'll see how
these resolve first before deciding on this.

> - I suggest specifying that end of line translation is done as in XML
> (it is now unclear).  I also suggest using the same translation in
> string literals.

Good point.  We'll take it under advisement.

> - The Char production should not specify [#x0020-#xFFFD], but
> [#x0020-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] as in XML.

Yes.
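
For reference, the suggested ranges can be sketched as a simple code point
check.  This is only an illustration in Python; CHAR_RANGES and is_char are
hypothetical names.  It uses exactly the three ranges quoted above (XML's
own Char production additionally admits #x9, #xA and #xD):

```python
# The three ranges from the suggestion above, as inclusive code point pairs.
CHAR_RANGES = (
    (0x0020, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)

def is_char(code_point):
    """Return True if code_point falls in one of the suggested ranges."""
    return any(lo <= code_point <= hi for lo, hi in CHAR_RANGES)
```

The difference from [#x0020-#xFFFD] is that the surrogate block
#xD800-#xDFFF is excluded and the supplementary planes are included.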

> - It is impossible to distinguish Multiply and Star.  They should be 1
> token.

This is an open issue that is mixed in with the reserved words issue and
specifying lexical states at the main expression level, like XPath does.

> - A.3.1 states "An operator that immediately follows a "/" or "//" when
> used as a root symbol, should not parse"  First, "//" cannot be used as
> a root symbol.  Second, this restriction is useless and unnecessarily
> restrictive, except for the * (multiply) operator.  Why disallow e.g. "/
> == ."?

The "//" is a bug that will be fixed.  But do you think we should disallow
"/ div ." but allow "/ == ."?  OTOH, "/ == ." might be a quite reasonable
thing to do, while "/ div ." is much more unlikely.  So we'll discuss this.

> - The transition table mentions an XQUERY_COMMENT state that is never
> entered.

Thanks.

> - Section 2.3.5 item 4 says that "." is short for "self::node()".  This
> contradicts 2.1.1.2, which says that "." is the context item.

Thanks.

> - Section 2.8.1 last sentence says "Two adjacent curly braces in an
> XQuery character string are interpreted as a single curly brace
> character."  This suggests that it also holds in a string literal, but I
> presume this is not the case.

You presume correctly.  We'll need to fix this.

> - There really should be a way to put special characters in a string
> literal, for instance using the same convention as XML.

Yes, known issue.  See
file:///E:/xqed/xquery.html#xquery-escaping-quotes-and-apostrophes.  It
just didn't make it in yet.

-scott

Received on Friday, 21 December 2001 11:58:44 UTC