- From: <scott_boag@us.ibm.com>
- Date: Sat, 6 Sep 2003 12:25:02 -0400
- To: "Kevin Jones" <kjones@actuate.com>
- Cc: public-qt-comments@w3.org, public-qt-comments-request@w3.org
- Message-ID: <OF3F7FD4CD.6731D4DD-ON85256D99.0052E051-85256D99.005A2938@lotus.com>
Hi Kevin. Thank you very much for the comments. Replies inline below.

public-qt-comments-request@w3.org wrote on 09/05/2003 05:38:02 PM:

> Hi Scott,
>
> Here is another round of issues that I found with regard to the 22
> August 2003 draft.
>
> 1) Seperator should be spelled Separator

The only place I found this is with the (now fixed) QuerySeparator,
which actually shouldn't be published (read below). Common
(embarrassing) spelling mistake of mine.

> 2) The QuerySeperator token (DEFAULT and OPERATOR states) is not
> defined in the spec

Bug. We only use the QuerySeparator internally for testing purposes...
it is not part of the language, so it should not occur in these lists.

> 3) QName "(" - Why is this token group needed? Isn't the
> disambiguation of QName "(" from QName "(:" handled by the longest
> match rule when the '(' is encountered?

Not using QName "(" as a single long token in the test parser causes a
choice conflict that would need to be solved by LL(2). Consider "foo"
and "foo ()". If both 'foo' words are QNames, the parser cannot decide
which branch to take. But note that what the grammar itself is saying
is that there is a choice issue here, and the implementation needs to
solve it... it doesn't say how it needs to solve it. I only picked one
solution for the test parser.

I'm not sure what you are saying about QName "(:". The grammar itself
should treat these as two tokens. As I said, there is a bug in the
current implementation... it needs to sniff ahead one character (at lex
time) and reject QName "(" in this case. This is the only place this
occurs in the grammar, so it's a drag, but I don't want to revisit the
comment syntax again. In any case, I'm not sure how the longest token
rule would help, except maybe to make a token for QName "(:" to catch
this case (but then I'm not sure what you do with it from there...).

> 4) State changes are mixed in with token groups. How is this
> reconciled? Aren't token groups expected to be processed in the
> same state?

I don't think I understand your comment here. A group of tokens in the
state transition table is treated as a single unit. (Though, again,
this is only a way of documenting unambiguous behavior... an
implementation can do what it wants.)

> 5) The following is more thinking out loud than an issue.
>
> Q: Why are the following designated "named terminals" when they
> might be more easily represented as grammar productions?
>
>   SchemaMode ::= "lax" | "strict" | "skip"
>   SchemaGlobalTypeName ::= "type" "(" QName ")"
>   SchemaGlobalContext ::= QName | SchemaGlobalTypeName
>   SchemaContextStep ::= QName
>   PITarget ::= NCName
>   VarName ::= QName
>
> A: Because they initiate state changes.
>
> Possible solution: Leave as is or make them productions and add
> state change in the grammar
>
> Can you think of any other way that this could be handled to make
> the distinction between lexical analysis and parsing more clear.

I'm assuming you've read
http://www.w3.org/TR/xquery/#parse-note-validate. This whole area of
the validate expression is thorny -- and I don't really have any bright
ideas beyond what I've done... I think it's good enough, and other
solutions turn into nightmares in their own right. SchemaMode, for
instance, is defined as a "named terminal" because of
<"validate" SchemaMode> in the appendix version of the BNF, and
everything inside a grouping has to be a lexical-only construct.

> 6) Can the explicit whitespace designation be removed from the
> grammar and placed at the lexical level and/or handled by lexical
> states?

Well, I think the notation at the grammar production level is a good
thing. And it is specified more formally at the lexical level, where
the states are listed for non-explicit whitespace in
http://www.w3.org/TR/xquery/#whitespace-rules. So I don't think there's
anything big that needs to be done there... though when I look at where
"S" falls in the lex tables, I think it is incorrect, and I'll have to
clean that up.

> regards,
>
> Kevin Jones

Thanks again for the comments!

-scott
Received on Saturday, 6 September 2003 12:26:10 UTC
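
Illustrative note (not part of the original message): the one-character
lookahead described in the reply to item 3 could be sketched roughly as
follows. This is only a minimal sketch under stated assumptions, not the
test parser's actual code. The function name lex_qname, the token kinds
QNAME and QNAME_LPAR, and the simplified QName pattern are all
hypothetical; the pattern is much narrower than the real QName
definition. Allowing whitespace between the QName and "(" is assumed
from the "foo ()" example in the message.

```python
import re

# Simplified, ASCII-leaning stand-in for QName (assumption: the real
# production is considerably broader than this pattern).
QNAME = re.compile(r"[A-Za-z_][\w.\-]*(?::[A-Za-z_][\w.\-]*)?")
WS = re.compile(r"\s*")


def lex_qname(text, pos=0):
    """Lex one token starting at pos.

    Returns (kind, lexeme, new_pos), or None if no QName starts at pos.
    kind is "QNAME_LPAR" for the combined QName "(" token and "QNAME"
    otherwise. Both kind names are hypothetical.
    """
    m = QNAME.match(text, pos)
    if m is None:
        return None
    end = m.end()
    # Skip optional whitespace to see whether "(" follows the QName.
    paren = WS.match(text, end).end()
    if paren < len(text) and text[paren] == "(":
        # Sniff one character past "(": if it is ":", the "(" opens a
        # comment, so the combined QName "(" token is rejected.
        if text[paren + 1:paren + 2] != ":":
            return ("QNAME_LPAR", text[pos:paren + 1], paren + 1)
    return ("QNAME", m.group(), end)
```

With this sketch, "foo ()" yields the single long token, while "foo" and
"foo(: comment :)" each yield a plain QName, leaving the "(:" to be
lexed separately as the start of a comment.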