[Bug 1373] New: [XQuery] some editorial comments on A.1 EBNF (productions)

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1373

           Summary: [XQuery] some editorial comments on A.1 EBNF
                    (productions)
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: XQuery
        AssignedTo: chamberl@almaden.ibm.com
        ReportedBy: jmdyck@ibiblio.org
         QAContact: public-qt-comments@w3.org


A.1 EBNF

(sectioning)
    I think that the EBNF productions and the explanation of the EBNF notation
    should each be split into a separate section.

----------------------------------------
preamble

"with the following minor differences."
    You've removed the phrase "except that grammar symbols always have initial
    capital letters" even though it's still true, still different from the
    notation used in the XML spec, and still unexplained.
    [Leftover from qt-2004Feb0317-01]

"a grouping of terminals that together may help disambiguate the individual
symbols."
    This is another (but I hope the last) misuse of the word "disambiguate"
    in its technical sense. Instead, you might say "... may help a parser
    differentiate various constructs", or just "... may help a parser do its
    job".
    The same applies to the repeat of this sentence in A.2.

    You should make clear that angle-bracket-groups have no definitional
    significance. That is, their presence in the EBNF has no effect on the set
    of syntactically legal queries defined by the grammar. (Assuming that's
    true. If not, then you've got some explaining to do.)

"To help readability, this "< ... >" notation is absent in the EBNF in the main
body of this document. This appendix is the normative version of the EBNF."
    You could say the same of comments on grammar productions.

"Comments on grammar productions"
    Note that the XML spec's grammar has production comments, so it's not their
    *presence* here that's different, but rather their normative power.

"clarification for parsing rules"
    Some grammar notes are not mere clarification; they actually affect the set
    of legal queries.

Pulling some of this together, how about restructuring the preamble into
something like this:

    The following grammar .... differences:

        o All named symbols have a name that begins with an uppercase letter
          (unlike the XML spec, where some names begin with lowercase letters
          to draw a distinction ...)

        o It adds a notation for referring to productions in external specs.

        o (...angle-bracket groups...)

        o Production comments are normative.

    These features are described in more detail in [the Notation section].

    To increase readability, the EBNF in the main body of this document omits
    some of these notational features. This appendix is the normative
    version of the EBNF.

----------------------------------------
productions

"[66]  PragmaContents"
"[146] Digits"
"[155] CommentContents"
    These should be marked "ws: explicit".

[96] DirAttributeValue    ::=
         ('"' (EscapeQuot | QuotAttrValueContent)* '"')
       | ("'" (EscapeApos | AposAttrValueContent)* "'")
[97] QuotAttrValueContent ::=
         QuotAttrContentChar | CommonContent
[98] AposAttrValueContent ::=
         AposAttrContentChar | CommonContent

    I think these would be clearer if you didn't split each of the choices over
    two productions. Instead, how about:

        [96] DirAttributeValue    ::=
                 ('"' QuotAttrValueContent* '"')
               | ("'" AposAttrValueContent* "'")
        [97] QuotAttrValueContent ::=
                 EscapeQuot | QuotAttrContentChar | CommonContent
        [98] AposAttrValueContent ::=
                 EscapeApos | AposAttrContentChar | CommonContent
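
    As a sanity check that this regrouping does not change the set of
    accepted attribute values, here is a rough Python sketch that models both
    shapes as regular expressions and compares them over all short strings.
    It rests on assumptions that are not in the draft: CommonContent is
    reduced to "{{", "}}" and the predefined entity references (no enclosed
    expressions, no character references), and the names current, proposed
    and same_language are made up for illustration.

```python
import itertools
import re

# Simplified stand-ins (assumptions, not the draft's full definitions):
# CommonContent is reduced to "{{", "}}" and predefined entity references.
COMMON = r'\{\{|\}\}|&(?:lt|gt|amp|quot|apos);'
QUOT_CHAR = r'[^"{}<&]'   # QuotAttrContentChar
APOS_CHAR = r"[^'{}<&]"   # AposAttrContentChar
ESCAPE_QUOT = r'""'       # EscapeQuot
ESCAPE_APOS = r"''"       # EscapeApos

# Current shape: the escape alternative sits in [96], the rest in [97]/[98].
current = re.compile(
    rf'"(?:{ESCAPE_QUOT}|(?:{QUOT_CHAR}|{COMMON}))*"'
    rf"|'(?:{ESCAPE_APOS}|(?:{APOS_CHAR}|{COMMON}))*'"
)

# Proposed shape: all three alternatives sit together in [97]/[98].
proposed = re.compile(
    rf'"(?:{ESCAPE_QUOT}|{QUOT_CHAR}|{COMMON})*"'
    rf"|'(?:{ESCAPE_APOS}|{APOS_CHAR}|{COMMON})*'"
)


def same_language(max_len=5, alphabet='"\'a{}<&'):
    """Compare both patterns on every string over `alphabet` up to max_len."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = ''.join(chars)
            if bool(current.fullmatch(s)) != bool(proposed.fullmatch(s)):
                return False
    return True
```

    An exhaustive comparison over short strings is only evidence, not a
    proof, but the rewrite is just a regrouping of alternatives under the
    same repetition, so agreement is what one would expect.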

[142] StringLiteral
    Change ('"' '"') to EscapeQuot.
    Change ("'" "'") to EscapeApos.

(*ContentChar)
    If you factor out the overlap of ElementContentChar, QuotAttrContentChar,
    and AposAttrContentChar, and push it over to CommonContent, I think the
    result is simpler. That is, change this:

        [97] QuotAttrValueContent ::=
                EscapeQuot | QuotAttrContentChar | CommonContent
        [98] AposAttrValueContent ::=
                EscapeApos | AposAttrContentChar | CommonContent
        [99] DirElemContent ::= DirectConstructor | CDataSection
                | ElementContentChar | CommonContent

        [100] CommonContent ::= ...

        [151] ElementContentChar ::= Char - [{}<&]
        [152] QuotAttrContentChar ::= Char - ["{}<&]
        [153] AposAttrContentChar ::= Char - ['{}<&]

    to this:

        [97] QuotAttrValueContent ::= EscapeQuot | "'" | CommonContent
        [98] AposAttrValueContent ::= EscapeApos | '"' | CommonContent
        [99] DirElemContent ::= DirectConstructor | CDataSection
                | '"' | "'" | CommonContent

        [100] CommonContent ::= [^"'{}<&] | ...

        [151] ElementContentChar   delete
        [152] QuotAttrContentChar  delete
        [153] AposAttrContentChar  delete

    Just a thought.
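
    The factoring relies on three set identities over Char, which the
    following Python sketch checks on a sample alphabet. The assumptions here
    are mine: printable ASCII stands in for the full Char range, the negated
    class [^"'{}<&] is read as "Char minus those six characters", and the
    names shared and factoring_holds are made up for illustration.

```python
import string

# Sample alphabet standing in for Char (assumption: printable ASCII only).
CHAR = set(string.printable)

# The three character classes as currently defined in [151]-[153].
element_content_char = CHAR - set('{}<&')
quot_attr_content_char = CHAR - set('"{}<&')
apos_attr_content_char = CHAR - set("'{}<&")

# The common part that the proposal moves into CommonContent: [^"'{}<&]
shared = CHAR - set('"\'{}<&')


def factoring_holds():
    """Each old class equals the shared part plus the quote(s) it allowed."""
    return (
        quot_attr_content_char == shared | {"'"}
        and apos_attr_content_char == shared | {'"'}
        and element_content_char == shared | {'"', "'"}
    )
```

    Each old class is the shared part plus exactly the quote characters
    that the proposed [97], [98] and [99] add back as literal alternatives.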

(explicits)
    I wonder if it would help the reader if the "ws: explicit" productions (and
    the intervening ones that don't care whether they're "ws: explicit" or not)
    were put together in a group. Specifically:
        [65-66]
        [79]
        [93-106]
        [138-159]
    Then, instead of "productions marked with 'ws: explicit'", you might say
    "productions in the [whatever] group".

Received on Wednesday, 11 May 2005 07:24:46 UTC