Re: XQuery lexical stuff from Michael Dyck on 2002-02-26 (www-xml-query-comments@w3.org from February 2002)

From: Michael Dyck <michaeldyck@shaw.ca>
Date: Tue, 26 Feb 2002 00:49:09 -0800
To: scott_boag@us.ibm.com
Cc: www-xml-query-comments@w3.org
Message-id: <3C7B4C05.47EBD37@shaw.ca>

scott_boag@us.ibm.com wrote:
> 
> I am confused by your CFG replacement for the lexical states.  While
> there's no question it would be nice to get rid of lexical state
> recognition, I'm still not sure how your CFG fixes things with the
> tokenizer.

It doesn't (if I understand you correctly). It is (I think) exactly
equivalent (in terms of generated language) to the PDA defined in A.3.1.
Thus, it is just as "broken" as the PDA. One advantage of the CFG
over the PDA (in my opinion) is that it's easier to spot the mistakes
in the CFG.

But more than that, I wanted to indicate that the PDA is redundant:
leaving aside its mistakes, it doesn't tell you anything that you
couldn't derive from the full grammar (productions 1 through 216). This
is a lot easier to see when you look at the PDA's equivalent CFG than
the PDA itself.

Ultimately, I wanted to question the need for the XQuery spec to define
a tokenizer at all.

> For instance, isn't EndTagClose still ambiguous against Gt?

In the PDA, its equivalent CFG, and the full grammar, there is no
ambiguity involving those two symbols, because they occur in different
contexts. (Actually, the Gt symbol doesn't appear in the PDA, but that's
presumably an oversight.)

A conflict between the two symbols would arise only in a conventional
(DFA) tokenizer for XQuery, which (sensibly) nobody is proposing.
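
To make the point concrete, here is a minimal sketch in Python, not taken
from the spec. Only the names Gt and EndTagClose come from this thread; the
EndTagOpen and Name tokens, the function names, and the single in_end_tag
flag are illustrative assumptions standing in for the real lexical states.
It shows that a context-free token table has two rules for ">", while a
tokenizer that tracks context never has to choose between them.

```python
import re

# Hypothetical context-free token table: both symbols match the same literal.
CONTEXT_FREE_RULES = [("Gt", ">"), ("EndTagClose", ">")]


def conflicting_tokens(rules):
    """Return groups of token names that share one literal."""
    by_literal = {}
    for name, literal in rules:
        by_literal.setdefault(literal, []).append(name)
    return [names for names in by_literal.values() if len(names) > 1]


def tokenize(text):
    """Toy context-aware tokenizer (an assumption, far simpler than XQuery).

    '>' is EndTagClose only while an end tag is open; otherwise it is Gt.
    """
    tokens = []
    in_end_tag = False
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text.startswith("</", i):
            tokens.append(("EndTagOpen", "</"))
            in_end_tag = True
            i += 2
        elif text[i] == ">":
            if in_end_tag:
                tokens.append(("EndTagClose", ">"))
                in_end_tag = False
            else:
                tokens.append(("Gt", ">"))
            i += 1
        else:
            match = re.match(r"[A-Za-z0-9_]+", text[i:])
            if match is None:
                raise ValueError(f"unexpected character {text[i]!r} at {i}")
            tokens.append(("Name", match.group()))
            i += match.end()
    return tokens


if __name__ == "__main__":
    print(conflicting_tokens(CONTEXT_FREE_RULES))
    print(tokenize("a > b </x>"))
```

The first call reports the clash a conventional tokenizer would face; the
second assigns each ">" a single symbol because the two occur in different
contexts.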

> In any case, I just wanted you to know that your note hadn't been missed or
> dismissed by the WGs.  I may be in touch with you with more follow up
> questions.

Great! Thanks.

-Michael

Received on Tuesday, 26 February 2002 03:55:28 UTC