XPath: many comments

Comments on "XML Path Language (XPath) Version 1.0"
(W3C Working Draft 13 August 1999)

Most of these are typographical/editorial. A few are substantive.

1 Introduction
--------------

8th para
    It's not clear whether the functions in the function library must have
    distinct names. (Two functions with the same name could still be
    distinguished by their parameter lists.)

    Shouldn't there be a paragraph briefly saying what a namespace
    declaration is?  For example:
        The namespace declarations constitute a mapping
        from prefix names to URIs.

9th para
    "change the context node/position/size"
        Well, they aren't really *changed*; it's just a different context.
        But I suppose it's okay for a casual statement of the semantics.

2 Location Paths
----------------

5th para
    "the second step"
        This phrase is odd, because there is no "first" step under
        discussion. Perhaps change to just "that step".

2.3 Node Tests
--------------

3rd para
    I find this paragraph confusing, because it doesn't sufficiently
    distinguish between name expansion in the expression and name expansion
    in the document. Here is a suggested rewording.

        A QName in the node test is expanded into an expanded-name with a
        local part and a possibly null namespace URI, as follows:
        --- The LocalPart of the QName provides the local part of the
            expanded-name.
        --- If the QName has a prefix, then the namespace URI of the
            expanded-name is the URI associated with that prefix in the
            namespace declarations of the expression context.  It is an
            error if the QName has a prefix for which there is no namespace
            declaration in the expression context.
        --- If the QName does not have a prefix, then the namespace URI of
            the expanded-name is null.
        (Thus, if the expression context includes a declaration for the
        default namespace, that declaration will always be ignored.
        Note that this is similar to the expansion of attribute names,
        as defined by [XML Names].)
        The node test will be true for any node of the principal type whose
        expanded-name (see [5 Data Model]) equals the QName's expanded-name.
        This is the case when they have the same local part, and either both
        have a null namespace URI or both have the same namespace URI.
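
    To make the suggested rewording concrete, here is a minimal Python
    sketch of the expansion and the comparison. The function names, the
    exception class, and the use of "" as the key for a default-namespace
    declaration are assumptions of the sketch, not anything in the draft.

```python
class UndeclaredPrefixError(Exception):
    """Raised when a QName uses a prefix with no namespace declaration."""


def expand_qname(qname, ns_decls):
    """Expand a QName from an expression into (namespace_uri, local_part).

    ns_decls maps prefix -> URI. The key "" stands for a default-namespace
    declaration (an assumption of this sketch).
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No prefix: the namespace URI is null, even when the expression
        # context declares a default namespace.
        return (None, qname)
    if prefix not in ns_decls:
        # Prefix with no declaration in the expression context: an error.
        raise UndeclaredPrefixError(prefix)
    return (ns_decls[prefix], local)


def name_test_matches(node_expanded_name, qname, ns_decls):
    """True when the node's expanded-name equals the QName's expanded-name.

    Both are (namespace_uri, local_part) pairs, so tuple equality covers
    "same local part, and both null or both the same namespace URI".
    """
    return node_expanded_name == expand_qname(qname, ns_decls)
```

    With a default namespace declared, an unprefixed name test such as
    "para" still expands to a null namespace URI, so it does not match an
    element that the document places in the default namespace.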

5th para
    "a QName using":
        Insert comma after "QName".

2.4 Predicates
--------------

1st para
    "is defined to be position":
        Insert "the" before "position".

2nd para
    "nodes in node-set":
        Insert "the" before "node-set".

2.5 Abbreviated Syntax
----------------------

2nd para
    "In effect child is the default axis."
        Insert comma after "effect".


NOTE
    The two occurrences of "foo" should presumably be "para".

    Replace the "descendant" axis with "descendant-or-self"; otherwise the
    two paths differ in an additional way that isn't really important to
    the note.


3.1 Basics
----------

1st para
    "It is an error if the variable is not bound":
        Change "variable" to "variable name".

3.2 Function Calls
------------------

production 16
    Insert a space before each of the meta-syntactic close-parentheses.
    (After all, there's space around the open-parentheses.)

1st para
    It's odd to refer to "the function" before it is identified in the
    subsequent clause.  Moreover, there should presumably be an "undefined
    function" error. Here's a suggested rewording:

        The evaluation of a FunctionCall expression begins by identifying a
        function in the function library of the expression evaluation
        context, whose name equals the FunctionName of the FunctionCall.
        It is an error if such a function does not exist. Then, each of the
        Arguments is evaluated, and (if necessary) the result is converted
        to the type required for that Argument by the function. It is an
        error if the number of Arguments is wrong, or their values cannot
        be converted to the required types. Then, the function is called,
        passing it the converted arguments. The result of the FunctionCall
        expression is the result returned by the function.

    (This also covers the last sentence from the subsequent paragraph.)
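
    The order of steps in that rewording can be sketched in Python as
    follows. Everything named here (XPathError, call_function, the
    library layout, and the two demo functions) is hypothetical; the
    sketch also assumes function names are distinct, which is the very
    point I queried under section 1.

```python
import math


class XPathError(Exception):
    """Any error raised while evaluating a FunctionCall (hypothetical)."""


def call_function(library, name, arg_thunks):
    """Evaluate a FunctionCall against a function library.

    library maps a function name to (converters, implementation), with one
    converter per required argument. arg_thunks are zero-argument callables
    standing in for the unevaluated Argument expressions.
    """
    # 1. Identify the function; it is an error if none exists.
    if name not in library:
        raise XPathError("undefined function: " + name)
    converters, implementation = library[name]

    # 2. Evaluate each Argument.
    values = [thunk() for thunk in arg_thunks]

    # 3. It is an error if the number of Arguments is wrong.
    if len(values) != len(converters):
        raise XPathError("wrong number of arguments for " + name)

    # 4. Convert each value to the type the function requires.
    converted = []
    for convert, value in zip(converters, values):
        try:
            converted.append(convert(value))
        except (TypeError, ValueError) as exc:
            raise XPathError("cannot convert argument for " + name) from exc

    # 5. Call the function; its result is the result of the FunctionCall.
    return implementation(*converted)


# A toy library for illustration only; these are not the XPath core functions.
DEMO_LIBRARY = {
    "concat2": ([str, str], lambda a, b: a + b),
    "floor": ([float], math.floor),
}
```

    For example, call_function(DEMO_LIBRARY, "concat2", [lambda: "a",
    lambda: 1]) converts the second argument to a string before the call,
    while an unknown name fails at step 1, before any argument is
    evaluated.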

3.3 Node-sets
-------------

3rd para
    "Square brackets":
        Change to "Predicates".

4th para
    "an arbitrary expression":
        Delete "arbitrary", because it isn't.

3.4 Booleans
------------

2nd + 3rd paras
    "converting its value to a boolean":
        Append "as if by a call to the <B>boolean</B> function".

4th para
    "A RelationalExpr or an EqualityExpr":
        This is redundant, since every RelationalExpr *is* an EqualityExpr.
        Moreover, the sentence clearly does not apply to RelationalExprs
        that are simply AdditiveExprs.  Therefore, replace the phrase with
        "An EqualityExpr that is not an AdditiveExpr".

6th para
    "they both consist of the same sequence of UCS characters":
        "both" is redundant here. Delete.

7th para
    "<=, <, >=, >":
        Insert "or" before ">".

    "by converted":
        Change to "by converting".

3.5 Numbers
-----------

production 27
    Is there any point to allowing expressions such as --E and ---E?
    If not, change the RHS "UnaryExpr" to "UnionExpr".


3.7 Lexical Structure
---------------------

3rd para
    "to disambiguate the grammar":
        Insert "ExprToken" before "grammar". The *Expr* grammar (using the
        RHS symbols of [28] as its terminal symbols) is certainly not
        ambiguous.

        When an XPath parser divides a character string into tokens, is it
        required to classify them according to the RHS symbols of [28]?
        Personally, I'd prefer to classify them more simply, thereby
        avoiding the need to use the special tokenization rules.
        For instance, if the tokenizer classifies "foo" as an NCName token,
        it doesn't have to decide whether it's a NodeType or a FunctionName
        or an AxisName; if it classifies '*' as just '*', it doesn't have
        to decide whether it's a WildcardName or a MultiplyOperator. Of
        course, by not making these distinctions in the tokenizer, I'd have
        to make them in the parser, but there they'd become run-of-the-mill,
        i.e., the normal machinery of the parser would make such
        distinctions automatically.
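
        A rough Python sketch of the simpler classification I have in
        mind follows. The token kinds (NCNAME, NUMBER, LITERAL, and each
        punctuation string as its own kind) are my own choice, and NCName
        is approximated with ASCII characters only, so this is an
        illustration rather than a conforming tokenizer.

```python
import re

# One alternative per coarse token class. Longer punctuation comes first
# so that "::" is not read as two ":" tokens, "//" as two "/", and so on.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<NUMBER>\d+(?:\.\d*)?|\.\d+)
  | (?P<LITERAL>"[^"]*"|'[^']*')
  | (?P<NCNAME>[A-Za-z_][A-Za-z0-9_.-]*)
  | (?P<PUNCT>::|\.\.|//|!=|<=|>=|[/()\[\]@,|*+\-=<>:.$])
""", re.VERBOSE)


def tokenize(expr):
    """Split an expression string into (kind, text) pairs.

    Names are all NCNAME and '*' is always '*'; deciding whether a name is
    an axis, node type, or function, or whether '*' multiplies, is left to
    the parser.
    """
    tokens = []
    pos = 0
    while pos < len(expr):
        match = TOKEN_RE.match(expr, pos)
        if match is None:
            raise ValueError("unexpected character at offset %d" % pos)
        kind = match.lastgroup
        text = match.group()
        if kind != "WS":
            # Punctuation tokens use their own text as the kind.
            tokens.append((text if kind == "PUNCT" else kind, text))
        pos = match.end()
    return tokens
```

        Given "child::para[position() * 2 = 4]/*", both stars come out as
        the same '*' token, and "child", "para", and "position" are all
        NCNAME tokens; the parser's position in the grammar settles which
        is which.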

production [32]
    Insert a space after '<'.

production [37]
    "WildcardName":
        I can understand why "*" or "NCName:*" would be considered
        "wildcards", but surely it's a misnomer for a QName. Perhaps
        "WildcardName" could be renamed as "NameTest".

5 Data Model
------------

last para
    Make "document order" bold, since this is its definition.

    Maybe define "reverse document order". Otherwise, someone might think
    it's the order in which the *last* character of each node occurs.
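
    A small Python sketch shows that the two readings really do differ.
    It assumes document order is the order of each node's first
    character; the sample document <a><b/><c/></a>, the offsets, and the
    function names are made up for the illustration.

```python
# Each node is (name, start, end): the offsets of the first and last
# characters of its markup in the hypothetical document <a><b/><c/></a>.
NODES = [("a", 0, 14), ("b", 3, 6), ("c", 7, 10)]


def document_order(nodes):
    """Order nodes by where their first character occurs."""
    return [name for name, start, end in sorted(nodes, key=lambda n: n[1])]


def reverse_document_order(nodes):
    """The exact reverse of document order."""
    return list(reversed(document_order(nodes)))


def order_by_last_character(nodes):
    """The misreading: order nodes by where their last character occurs."""
    return [name for name, start, end in sorted(nodes, key=lambda n: n[2])]
```

    Here reverse document order is c, b, a, whereas ordering by last
    character gives b, c, a, so an explicit definition would rule out the
    second reading.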

5.1 Root Node
-------------

1st para
    The root node "does not occur anywhere else in the tree":
        Well, you could say that of *any* node.  Perhaps you mean "nodes of
        this type do not occur anywhere else in the tree".

5.2.1 Unique IDs
----------------

1st para
    "the second element":
        Append "in document order"?

5.3 Attribute Nodes
-------------------

2nd para
    "the attribute did not have a prefix":
        Change "did" to "does".

Nowhere
-------

Shouldn't there be a section on Conformance?

-Michael Dyck
 jmdyck@netcom.ca

Received on Wednesday, 1 September 1999 04:08:44 UTC