[Bug 1384] [XQuery] some editorial comments on A.2.1 Terminal Types

From: <bugzilla@wiggum.w3.org>
Date: Sat, 09 Jul 2005 10:23:12 +0000
To: public-qt-comments@w3.org
Message-Id: <E1DrCUK-0004Rp-8S@wiggum.w3.org>


------- Additional Comments From scott_boag@us.ibm.com  2005-07-09 10:23 -------
(In reply to comment #0)
> A.2.1 Terminal Types

> (intent)
>     Is it your intention that the characters of every legal query can be
>     partitioned into a sequence of terminals and intervening whitespace?


>     If so, you'll need to add the following as terminals:
>         Char


>         "(#" and "#)" (or else Pragma)


>         PITarget

Fixed.  These were all production problems.

>     On the other hand, if it's your intention to only include terminals that
>     could be next to ignorable whitespace, then there are a bunch that could 


>     Note also that some of the terminals derive forms containing other
>     terminals, which could complicate things:

Yes, perhaps.  May come back and revisit this.

> 'terminal'
>     This section waffles between two senses of 'terminal': a symbol in the
>     grammar, or a group of characters in a query. (E.g., "The XQuery grammar
>     defines 153 terminals." vs. "This query contains 1200 terminals.") Maybe
>     nobody will mind.

Michael Sperberg-McQueen and I have been working on a clearer definition of

<termdef term="terminal" id="terminal">A <term>terminal</term> is a symbol or
string or pattern that can appear
    in the right-hand side of a rule, but never appears on the
    left-hand side in the main grammar, although it may appear
    on the left-hand side of a rule in the grammar for terminals.</termdef>

We are also going back to dividing the grammar into a main grammar and a
section for terminals.  I'm still tweaking the details of this, but I'm hoping
it will clean up a bunch of these problems, at least to a point.

> "delimit"
>     The rest of the spec uses "delimit" to mean "mark the start and end", e.g.:
>         -- '(:' and ':)' are comment delimiters,
>         -- braces delimit an enclosed expression, and
>         -- a string literal can be delimited by apostrophes or quotation marks.
>     However, in this section, the intended usage appears to be, for example,
>     that in
>         x+1
>     the plus sign "delimits" x and 1. This is odd. It would be plainer to say
>     that it "separates" them.  (Note that in A.2.2.1, the examples use the
>     phrase "separated by DTs", not "delimited by DTs".)

I now use the term "separate" in the text, but I kept the category names
("delimiting terminal" and "non-delimiting terminal"), as I think they're
clear enough.

>     So I'd recommend changing "delimit/delimited" to "separate/separated".
>     However, I don't recommend changing "(non-)delimiting terminal" to
>     "(non-)separating terminal", as both seem like odd phrasing to me. 

Right, agreed.

> For
>     instance, in
>         x+1-y
>     it would seem reasonable to say that the 1 separates (or delimits, if you
>     must) the plus and minus. So to decree that the plus is "separating"
>     whereas the 1 is "non-separating" doesn't make sense.  Instead, I think
>     "adjoinable" and "non-adjoinable" might be better, or "punctuation-like"
>     and "word-like", or "closed" and "open", or just "class 1" and "class 2".

I think the names for the categories are clear enough.

> ----
> "A DT may delimit adjacent NDTs."
>     This is not a definition. The real definition is the list.

I think it's clear enough as a term definition.  I don't really think the list
belongs in the term definition, and I would rather not restructure this in a
radical way.

Hmm... I may revisit this.  Maybe Michael can come up with some alternative wording?

> (list of DTs)
>     Delete initial comma.


>     Delete "%%%".


>     Maybe change """ to '"'.

Not worth the extra work right now.  May revisit.

>     Not sure why you need both Comment *and* "(:" + ":)".

"(:" + ":)" have been removed from the list.

>     Put them in ASCII order? Some kind of order would be nice.

Yes, it would.  They're in grammar occurrence order now.  Unfortunately, it's a
bit of a technical challenge to sort them (the joys of recursive processing in
XSLT 1.0).  I may get to this at some point, but it's got to be low on the
priority list.
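
For illustration only, the requested ASCII ordering could also be produced by a
post-processing step outside the stylesheet. This is a minimal Python sketch,
not part of the spec build; the sample list below is hypothetical and merely
stands in for the real list of delimiting terminals.

```python
# Hypothetical sample of terminals in "grammar occurrence order";
# the real list in A.2.1 is longer and is an assumption here.
terminals_in_grammar_order = ["(", ")", "$", ",", "+", "-", "*"]

# Python compares strings by code point, which for ASCII-only
# terminals is the same as ASCII order.
terminals_in_ascii_order = sorted(terminals_in_grammar_order)

print(terminals_in_ascii_order)
# → ['$', '(', ')', '*', '+', ',', '-']
```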

>     "." and "-" are going to cause problems, given that they're valid
>     NCNameChars. E.g., if 'x' and '1' are two NDTs, and '-' is the DT that
>     will 'delimit' them, you get x-1, which doesn't work. (It's misrecognized
>     as a single NCName.)
>     Expressing the problem in a different way: A.2.2.1 says that whitespace is
>     only required between two NDTs, but '-' is not an NDT, so whitespace isn't
>     required between 'x' and '-'. Which is not what you want.
>     On the other hand, you can't make "-" an NDT, because then things like
>     100-x and (blah)-1 would become illegal.

Yes, need to sleep on this.  I think I need to make some sort of exception for
these cases.
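
To make the "x-1" problem concrete, here is a small longest-match tokenizer
sketch in Python. The three patterns are simplified, ASCII-only stand-ins for
NCName, IntegerLiteral, and a few symbol terminals; they are assumptions for
illustration, not the productions from the spec.

```python
import re

# Simplified ASCII-only approximations (assumptions, not the spec's rules).
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"   # "." and "-" are valid name characters
INTEGER = r"[0-9]+"
SYMBOL = r"[-+()]"
TOKEN = re.compile(rf"\s*(?:({NCNAME})|({INTEGER})|({SYMBOL}))")


def tokenize(query):
    """Split a query into tokens, taking the longest match at each position."""
    tokens = []
    pos = 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        # Exactly one of the three groups matched; lastindex identifies it.
        tokens.append(match.group(match.lastindex))
        pos = match.end()
    return tokens


print(tokenize("x-1"))       # → ['x-1']  (swallowed as a single name)
print(tokenize("x - 1"))     # → ['x', '-', '1']
print(tokenize("100-x"))     # → ['100', '-', 'x']
print(tokenize("(blah)-1"))  # → ['(', 'blah', ')', '-', '1']
```

The last two lines show why "-" cannot simply be reclassified as an NDT: after
a digit or a closing parenthesis it is recognized without any whitespace.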

> "NDTs generally start and end with alphabetic characters or digits."
>     This is almost a definition, but the "generally" makes it too vague.
>     Again, the real definition is the list.

Ok, redefined as <termdef id="non-delimiting-token" term="Non-delimiting
Terminal"><term>Non-delimiting terminals</term> must be separated by
a <termref def="delimiting-token">delimiting terminal</termref>.</termdef>

I know you still don't like this.  Take it as a placeholder for now; I may revisit.

> "Adjacent NDTs must be delimited by a DT."
>     This doesn't belong in a definition.


> (list of NDTs)
>     Change ValidationMode to just "lax", "strict".


>     Definitely put them in ASCII order.

Yes, as I said, has to take lower priority.

> ----
> "delimit adjacent"
>     Both "definitions" have a phrase along the lines of:
>         "a DT [may/must] delimit adjacent NDTs"
>     but this makes no sense -- if the DT is between the NDTs, then the NDTs are
>     not adjacent! Presumably you mean something like
>         "two NDTs may not be adjacent"
>     or
>         "two adjacent terminals may not both be NDTs"
>     or
>         "in every pair of adjacent terminals in a query, at least one of the
>         terminals must be a DT"
>     However, a reasonable response (to any of these) would be:
>         But I have no choice! For instance, in the production for VersionDecl,
>         it says right there:
>             "xquery" "version" etc.
>         So the (NDTs) "xquery" and "version" *have* to be adjacent. I can't just
>         grab some DT and stick it between them -- I'd get a syntax error!
>     The answer, I assume, would be:
>         Ah, but S and Comment are DTs, and you certainly *can* (and in fact,
>         must) put either or both of those between the "xquery" and "version".
>     This illustrates a couple of problems:
>     (1) What you mean here by 'adjacent' may not be what the reader thinks.
>     (2) S and Comment are the only DTs that people can actually use to separate
>         two NDTs (without changing the structure of the query), and they are
>         buried in the list, and not mentioned in the prose.
>     It might help solve both of these problems if you moved/recast the content
>     of this section into A.2.2 Whitespace Rules. (As far as I can tell, the only
>     reason to define these classes of terminals is to be able to define where
>     whitespace is allowed, and where required, so the move is appropriate.)
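
One way to read the rule discussed in this comment is as a whitespace rule:
whitespace is required exactly where two NDTs would otherwise touch. The Python
sketch below illustrates that reading only. The function name and the NDT/DT
labels attached to each sample terminal are assumptions made for the example.

```python
NDT, DT = "NDT", "DT"


def serialize(terminals):
    """Join (text, kind) pairs, adding a space only between two NDTs."""
    pieces = []
    previous_kind = None
    for text, kind in terminals:
        if previous_kind == NDT and kind == NDT:
            pieces.append(" ")  # the only place whitespace is required
        pieces.append(text)
        previous_kind = kind
    return "".join(pieces)


print(serialize([("xquery", NDT), ("version", NDT)]))  # → xquery version
print(serialize([("x", NDT), ("+", DT), ("1", NDT)]))  # → x+1
```

Under this reading the space is the separator between "xquery" and "version",
which matches the point that S and Comment are the only DTs a user can insert
there without changing the structure of the query.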

You may be right.  I'm going to come back to this after I've processed the other
Received on Saturday, 9 July 2005 10:23:18 UTC
