[Bug 1374] New: [XQuery] some editorial comments on A.1 EBNF (notation)

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1374

           Summary: [XQuery] some editorial comments on A.1 EBNF (notation)
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XQuery
        AssignedTo: chamberl@almaden.ibm.com
        ReportedBy: jmdyck@ibiblio.org
         QAContact: public-qt-comments@w3.org


A.1 EBNF (notation)

"The following term definitions ..."
    Change "term definitions" to just "definitions".

--------------------
symbol

"Each rule in the grammar defines one symbol, in the form ..."
    This isn't a good definition for 'symbol'. But I'm not convinced it needs
    one. (Note that the corresponding chunk of the XML spec doesn't bother
    trying to define it.)  Maybe just drop the surrounding "[Definition:", "]".

--------------------
terminal

"A terminal is a single unit of the grammar that can not be further subdivided,
and is specified in the EBNF by a character or characters in quotes, or a
regular expression."
    This isn't a good definition for 'terminal'...

    "that can not be further subdivided"
        In order for this to be a useful component of a definition, you would
        have to answer the question "Subdivided how?".

        Note that anything consisting of two or more characters can at the
        very least be subdivided into its component characters. Stepping up
        a level, many of the terminals enumerated in A.2.1 can be conceptually
        subdivided, e.g.:
            a DoubleLiteral into its mantissa-part and exponent-part;
            a StringLiteral into its delimiter chars and content chars;
            a CommentContents into 'top-level' content and 'nested' content; and
            a QName into its prefix and local-part.
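
        As a rough illustration of such a subdivision (a sketch only, not
        part of the spec): the Python pattern below is an assumed
        transcription of the DoubleLiteral production, and the function and
        group names are hypothetical.

```python
import re

# Assumed transcription of the DoubleLiteral production:
#   (("." Digits) | (Digits ("." [0-9]*)?)) [eE] [+-]? Digits
DOUBLE_LITERAL = re.compile(
    r"(?P<mantissa>\.[0-9]+|[0-9]+(?:\.[0-9]*)?)"  # mantissa-part
    r"[eE]"                                        # exponent marker
    r"(?P<exponent>[+-]?[0-9]+)"                   # exponent-part
)


def split_double_literal(text):
    """Return (mantissa, exponent) if text is a DoubleLiteral, else None."""
    match = DOUBLE_LITERAL.fullmatch(text)
    if match is None:
        return None
    return match.group("mantissa"), match.group("exponent")
```

        Under these assumptions, split_double_literal("1.5e-3") yields the
        two parts "1.5" and "-3", so the terminal plainly has internal
        structure.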

        It's possible you're thinking of "subdivided by insertion of
        whitespace", but that would lead to a circular definition:
            Q: Where can I insert whitespace?
            A: Between the terminals.
            Q: But what are the terminals?
            A: They're the things you can't insert whitespace into.

        So I recommend you don't try to answer the question of "Subdivided
        how?", because:
        (a) I don't think it has a good answer, and
        (b) I don't think 'non-subdividability' is an essential part of
            defining 'terminal'.

    "and is specified in the EBNF by a character or characters in quotes,
    or a regular expression."
        But that's everything. The RHS of every production is a regular
        expression over the symbols of the grammar.

        Again, you could try to refine the existing phrasing to more accurately
        convey what you have in mind, but I think the result would be messy,
        and unnecessary.

    In fact, I'm pretty sure that this section has no need to define or use the
    word "terminal".  (It has no bearing on the "meaning" of the EBNF notation.)
    And if/when you *do* need to define it (in A.2), it's much simpler to just
    enumerate the symbols that you want to be considered terminals.

"The following expressions"
    Maybe change "expressions" to "constructs" or "patterns".
    (Yes, the XML spec calls them 'expressions', but the XML spec doesn't use
    the word for anything else. The XQuery spec certainly does.)

"are used to match strings of one or more characters in a terminal:"
    Delete "in a terminal". These constructs are also used in productions for
    symbols that *aren't* listed as terminals in A.2.1.

#xN
[#xN-#xN]
[#xN#xN#xN]
[^#xN-#xN]
[^#xN#xN#xN]
    The XQuery grammar doesn't use any of these constructs. Delete them.

[^a-z]
    The XQuery grammar doesn't use this either.

"[abc] Enumerations and ranges can be mixed in one set of brackets."
"[^abc] Enumerations and ranges of forbidden values can be mixed in one set of
brackets."
    The XQuery grammar doesn't mix enumerations and ranges.
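
    For anyone unsure what "mixed" means here, a small sketch using Python's
    re module as a stand-in for the EBNF bracket notation. This assumes the
    two notations agree for these simple cases; the names are hypothetical.

```python
import re

RANGE_ONLY = re.compile(r"[a-c]")          # a range
ENUM_ONLY = re.compile(r"[xyz]")           # an enumeration
MIXED = re.compile(r"[a-cxyz]")            # range a-c mixed with x, y, z
NEGATED_MIXED = re.compile(r"[^a-cxyz]")   # anything except those six


def matches_one_char(pattern, ch):
    """True if the single character ch is matched by the bracket construct."""
    return pattern.fullmatch(ch) is not None
```

    Here MIXED accepts both "b" (from the range) and "y" (from the
    enumeration), which is the combination the quoted sentences permit.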

"matches a literal string matching that given inside the double/single quotes."
    Throughout this section, the usage of "matches" is:
        {construct in the grammar} matches {characters in a query}
    (As such, it means roughly the same as "derives" in its technical sense.)
    However, this sentence adds the usage
        {characters in a query} matching {characters in the grammar}
    which reverses the sense.

    How about this:
        "matches the sequence of characters that appear inside the double/single
        quotes"

"matches a production"
    No, it matches any string matched by that production.
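
    A minimal sketch of the distinction, using a toy grammar table in Python.
    The table and function names are hypothetical; the single entry is
    assumed to mirror the spec's Digits ::= [0-9]+ production.

```python
import re

# Toy grammar: symbol name -> pattern standing in for its right-hand side.
TOY_GRAMMAR = {"Digits": r"[0-9]+"}


def symbol_matches(symbol, text):
    """True if text is one of the strings matched by the symbol's production."""
    return re.fullmatch(TOY_GRAMMAR[symbol], text) is not None
```

    A reference to Digits thus matches strings such as "42" or "007"; what
    it matches is never the production itself.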

"For the purposes of this secificiation"
    Fix typo: "specification"

"the entire unit is defined as a terminal."
    Actually, you should probably delete this sentence.  Rather than saying the
    [http:...] construct is a terminal, it's simpler and presumably equivalent
    to just designate the production's LHS symbol as a terminal.  (According to
    A.2.1, CharRef, QName, NCName, and S have already been handled this way.
    PITarget and Char haven't.)

--------------------
production

"[Definition: A production combines symbols to form more complex patterns.]
The following productions ..."
    Ack! This is a gross misuse of the term "production". Not only does it
    conflict with standard usage, it conflicts with other (correct) usage
    within the very same spec!
        1 Introduction: "The following example production"
        3 Expressions: "the left side of the grammar production"
        A.1 EBNF: "comments on grammar productions"

    If you need a word for these constructs, call them "patterns".

"serve as examples, where A and B represent simple expressions:"
    There's no definition of "simple expressions".  In fact, the word "simple"
    is unjustified, since some of the constructs that stand in for A and B
    can be fairly complicated. (e.g., see
        [21] SchemaImport,
        [95] DirAttributeList, and
        [140] DoubleLiteral)

So, covering the last two points, you could replace the paragraph with
something like:
    "Patterns (including the above constructs) can be combined with
    grammatical operators to form more complex patterns, matching more
    complex sets of character strings. In the examples that follow,
    A and B represent (sub-)patterns."

"(expression)"
    Change to "(A)" -- you've already set up A as a placeholder, so why not
    use it?

A B
"This operator has higher precedence than alternation; thus A B | C D is
identical to (A B) | (C D)."
    As far as I can tell, constructs such as 'A B | C D' do not occur in the
    XQuery grammar, so it is unnecessary to define the relative precedence
    of juxtaposition and '|'. Delete the sentence.
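
    For reference, the rule the quoted sentence states can be checked with
    Python regular expressions, which give alternation the lowest precedence
    in the same way. This is a sketch only: mapping A, B, C, D to the single
    characters a, b, c, d is an assumption, and the names are hypothetical.

```python
import re

IMPLICIT = re.compile(r"ab|cd")              # A B | C D
EXPLICIT = re.compile(r"(?:ab)|(?:cd)")      # (A B) | (C D)
OTHER_READING = re.compile(r"a(?:b|c)d")     # A (B | C) D

CANDIDATES = ["ab", "cd", "abd", "acd"]


def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if pattern.fullmatch(s)]
```

    IMPLICIT and EXPLICIT accept the same candidates ("ab" and "cd"), while
    OTHER_READING accepts only "abd" and "acd".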

A+
"thus A+ | B+ is identical to (A+) | (B+)"
    No such constructs occur. Delete the sentence.

A*
"thus A* | B* is identical to (A*) | (B*)"
    No such constructs occur. Delete the sentence.
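
    The same kind of check works for the postfix operators. Again a sketch
    under the assumption that Python's re precedence mirrors the EBNF's, with
    A and B mapped to the characters a and b and hypothetical names.

```python
import re

CANDIDATES = ["", "aa", "bb", "ab"]


def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [s for s in candidates if compiled.fullmatch(s)]


plus_implicit = accepted(r"a+|b+", CANDIDATES)           # A+ | B+
plus_explicit = accepted(r"(?:a+)|(?:b+)", CANDIDATES)   # (A+) | (B+)
plus_other = accepted(r"(?:a+|b)+", CANDIDATES)          # (A+ | B)+
star_implicit = accepted(r"a*|b*", CANDIDATES)           # A* | B*
star_explicit = accepted(r"(?:a*)|(?:b*)", CANDIDATES)   # (A*) | (B*)
```

    The implicit and explicit forms agree in both cases; only the
    alternative reading (A+ | B)+ additionally accepts the mixed string "ab".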

(angle-bracket groups)
    Since the angle-bracket group is a notation used in the grammar, it should
    be defined here. I suggest putting it right after "(expression)", since
    it has a similar flavour. (In each case, the grouped construct matches the
    same thing as the ungrouped construct.)

Received on Wednesday, 11 May 2005 07:26:59 UTC