- From: <scott_boag@us.ibm.com>
- Date: Sat, 6 Sep 2003 12:25:02 -0400
- To: "Kevin Jones" <kjones@actuate.com>
- Cc: public-qt-comments@w3.org, public-qt-comments-request@w3.org
- Message-ID: <OF3F7FD4CD.6731D4DD-ON85256D99.0052E051-85256D99.005A2938@lotus.com>
Hi Kevin. Thank you very much for the comments. Replies inline below.
public-qt-comments-request@w3.org wrote on 09/05/2003 05:38:02 PM:
>
> Hi Scott,
>
> Here is another round of issues that I found with regard to the 22
> August 2003 draft.
>
> 1) Seperator should be spelled Separator
The only place I found this is with the (now fixed) QuerySeparator, which
actually shouldn't be published (read below). Common (embarrassing)
spelling mistake of mine.
>
> 2) The QuerySeperator token (DEFAULT and OPERATOR states) is not
> defined in the spec
Bug. We only use the QuerySeparator internally for testing purposes... it
is not part of the language, so it should not occur in these lists.
>
> 3) QName "(" - Why is this token group needed? Isn't the
> disambiguation of QName "(" from QName "(:" handled by the longest
> match rule when the '(' is encountered?
Not using QName "(" as a single long token in the test parser causes a
choice conflict that would need to be solved by LL(2). Consider "foo" and
"foo ()". If both 'foo' words are QNames, the parser cannot decide which
branch to take. But note that what the grammar itself is saying is that
there is a choice issue here, and the implementation needs to solve it... it
doesn't say how it needs to solve it. I only picked one solution for the
test parser.
I'm not sure what you are saying about QName "(:". The grammar itself
should treat these as two tokens. As I said, there is a bug in the
current implementation... it needs to sniff ahead one character (at lex
time) and reject QName "(" in this case. This is the only place this
occurs in the grammar, so it's a drag, but I don't want to revisit the
comment syntax again. In any case, I'm not sure how the longest token
rule would help, except maybe to make a token for QName "(:" to catch this
case (but then I'm not sure what you do with it from there...).
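To make the two points above concrete, here is a minimal sketch of the idea, not the actual test parser. It assumes a hand-written lexer: a QName followed by "(" comes back as one long token, so the parser needs only one token of lookahead, and the lexer sniffs one character past the "(" to reject the long token when a comment "(:" follows. The names next_token, QNAME_LPAR, and the simplified QName pattern are hypothetical, chosen only for illustration.

```python
import re

# Simplified QName pattern (assumption: real NCName rules allow more characters).
QNAME = re.compile(r"[A-Za-z_][\w.\-]*(?::[A-Za-z_][\w.\-]*)?")
WS = re.compile(r"\s*")


def next_token(text, pos=0):
    """Return (kind, lexeme, new_pos) for the token starting at pos."""
    pos = WS.match(text, pos).end()  # skip leading whitespace
    m = QNAME.match(text, pos)
    if not m:
        # Anything else is out of scope for this sketch.
        return ("OTHER", text[pos:pos + 1], pos + 1)
    end = m.end()
    paren = WS.match(text, end).end()  # position of a possible "("
    # Long token: QName, optional whitespace, "(" ...
    # unless one more character of lookahead shows a comment opener "(:".
    if text.startswith("(", paren) and not text.startswith("(:", paren):
        return ("QNAME_LPAR", text[pos:paren + 1], paren + 1)
    return ("QNAME", m.group(), end)
```

With this sketch, "foo" and "foo ()" yield different first tokens (QNAME versus QNAME_LPAR), so the choice is settled at lex time, while "foo(: note :)" still yields a plain QNAME and leaves the comment to be handled separately.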
>
> 4) State changes are mixed in with token groups. How is this
> reconciled? Aren't token groups expected to be processed in the same
> state?
I don't think I understand your comment here. A group of tokens in the
state transition table is treated as a single unit. (Though, again, this
is only a way of documenting unambiguous behavior... an implementation can
do what it wants.)
>
> 5) The following is more thinking out loud than an issue.
>
> Q: Why are the following designated "named terminals" when they
> might be more easily represented as grammar productions?
> SchemaMode ::= "lax" | "strict" | "skip"
> SchemaGlobalTypeName ::= "type" "(" QName ")"
> SchemaGlobalContext ::= QName | SchemaGlobalTypeName
> SchemaContextStep ::= QName
> PITarget ::= NCName
> VarName ::= QName
> A: Because they initiate state changes.
>
> Possible solution: Leave as is or make them productions and add
> state change in the grammar
>
> Can you think of any other way that this could be handled to make
> the distinction between lexical analysis and parsing more clear.
I'm assuming you've read http://www.w3.org/TR/xquery/#parse-note-validate.
This whole area of the validate expression is thorny -- and I don't really
have any bright ideas beyond what I've done... I think it's good enough,
and other solutions turn into nightmares in their own right. The
definition of SchemaMode as a "named terminal", for instance, is because
of <"validate" SchemaMode> in the appendix version of the BNF, and
everything inside a grouping has to be a lexical-only construct.
>
> 6) Can the explicit whitespace designation be removed from the
> grammar and placed at the lexical level and/or handled by lexical
> states?
Well, I think the notation at the grammar production level is a good
thing. And it is specified more formally at the lexical level where the
states are listed for non-explicit whitespace in
http://www.w3.org/TR/xquery/#whitespace-rules. So I don't think there's
anything big that needs to be done there... though when I look at where
"S" falls in the lex tables, I think it is incorrect, and I'll have to
clean that up.
>
> regards,
>
> Kevin Jones
>
Thanks again for the comments!
-scott
Received on Saturday, 6 September 2003 12:26:10 UTC