Comments on XPointer CR

Comments on "XML Pointer Language (XPointer) Version 1.0"
W3C Candidate Recommendation 7 June 2000

1.2 Notation and Document Conventions

1st para:
"The circumflex (^) metacharacter used in this notation":
    After "notation", maybe insert "(to denote the complement of a set of
    characters)".

    It might be good to mention the validity constraints. For example,
    "The formal grammar is augmented with validity constraints, rules
    given in prose. Together, these specify the syntax of XPointers."
    (Otherwise, it's not clear whether "the syntactic requirements imposed
    by this specification" (3.2) and "the syntax specified in this
    document" (3.5) include the VCs.)

2 XPointer Terms and Concepts

    I think it would be good to define "location" here, since it's fairly
    fundamental, relates to a term in XPath, and could simplify a couple
    of the other definitions.

"location-set":
    Change "document nodes, points, and/or ranges" to "locations".

"range":
    I think "a contiguous selection of" is redundant.

"singleton":
"A location that consists of a single, contiguous portion of a document.":
    A location is either a point, a range, or a node. Therefore, a location
    *must* be a single, contiguous portion of a document.  Thus, I believe
    that the definition of "singleton" should be:
        "A location-set that consists of a single location."
    The rest of the paragraph is pretty much redundant, given the definitions
    of location and location-set. Also, the repeated use of the word
    "contiguous" might lure the reader into thinking that contiguity is the
    defining characteristic of "singleton", which 5.4.6 refutes in detail.

order of definitions:
    Are the definitions supposed to be in alphabetical or logical order?
    If alphabetical, "sub-resource" is out of order. If logical, I think
    this order would be more logical:
        point
        range
        location
        location-set
        singleton
        sub-resource

3.5 Classes of XPointer Errors

"syntax error":
"and applications must not attempt to evaluate it as an XPointer"
    This restriction conflicts with the last sentence of the section:
        "This specification does not constrain how
        applications deal with these errors."
    I'd remove the restriction.
    (If you think it should stay, then why isn't there a similar
    restriction for resource errors? Why is there no restriction on the
    location-set yielded by an XPointer with a sub-resource error?)

    In any event, discussion of how applications deal with syntax errors
    should not be within the *definition* of "syntax error".

4.1.1 URI Reference Encoding and Escaping

1st para:
    This para seems to belong in section 4.1, if "other contexts" refers
    to XML contexts.

disallowed characters:
    It would be nice to have a non-normative list of them, for people who
    don't have Section 2.4 of RFC 2396 handy.
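
    As a rough illustration of what such a list could contain, here is a
    Python sketch of the US-ASCII characters excluded by Section 2.4.3 of
    RFC 2396. The names are mine, not the spec's, and the sketch does not
    try to settle whether 4.1.1 exempts some of these characters (such as
    "#" and "%") from escaping.

```python
# Excluded US-ASCII characters per RFC 2396, Section 2.4.3.
# All names here are illustrative; none come from the XPointer spec.
CONTROL = {chr(c) for c in range(0x00, 0x20)} | {chr(0x7F)}
SPACE = {" "}
DELIMS = set('<>#%"')
UNWISE = set("{}|\\^[]`")
RFC2396_EXCLUDED = CONTROL | SPACE | DELIMS | UNWISE


def is_disallowed(ch):
    """True if ch is non-ASCII or in the RFC 2396 excluded set."""
    return ord(ch) > 0x7F or ch in RFC2396_EXCLUDED


# The printable excluded characters (other than space), in code-point order.
printable = "".join(
    sorted(c for c in RFC2396_EXCLUDED if c.isprintable() and c != " ")
)
```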

4.1.2 XML Escaping

example:
    It's missing the square brackets around "position() <= 5".
    (And similarly in the second box.)

"Note that if XML-based languages define elements or attributes containing
URI references (such as XLink's href attribute shown above), the relevant
element content or attribute values also require the processing defined in
4.1.1 URI Reference Encoding and Escaping."
    So shouldn't you complete the example by performing this processing?
    You could put the final result in a third box, so that the result of
    just the XML-escaping is still clear from the first two boxes.

    You might also note that if the URI-escaping were performed first, it
    would (among other things) change "<" into "%3C", which would (1) make
    XML-escaping unnecessary, and (2) result in a slightly different
    fragment identifier for the same XPointer.
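
    To make the ordering point concrete, here is a small Python sketch.
    The XPointer is hypothetical (modelled on the "position() <= 5"
    example), and the SAFE set is an assumption chosen to keep the output
    readable; it is not the exact character set that 4.1.1 prescribes.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

# Hypothetical XPointer, modelled on the spec's "position() <= 5" example.
xptr = "xpointer(//item[position() <= 5])"

# Characters that quote() leaves alone: an assumption for this sketch only.
SAFE = "()/[]="

# XML-escaping alone, as in the boxes of 4.1.2: "<" becomes "&lt;".
xml_only = escape(xptr)

# URI-escaping first: "<" becomes "%3C" and each space becomes "%20".
uri_first = quote(xptr, safe=SAFE)

# XML-escaping the URI-escaped form finds nothing left to escape.
uri_then_xml = escape(uri_first)
```

    Here xml_only is "xpointer(//item[position() &lt;= 5])", uri_first is
    "xpointer(//item[position()%20%3C=%205])", and uri_then_xml is
    identical to uri_first.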

4.2 Forms of XPointer

StringWithBalancedParens:
    The EBNF does not admit escaped (unbalanced) parentheses. Yes, the
    validity constraint allows them, but I think it's unusual for a VC to
    be more permissive than the EBNF. Normally VCs are more restrictive.
    In the EBNF, you could change "[^()]*" to
        ( [^()] | '^' [()] )*
    which you might want to split off into its own production.

    (This, like the original, allows arbitrary occurrences of circumflex.
    You could disallow it in the EBNF, but only by having two different
    uses of circumflex in the same expression. In the VC, you might want to
    say "Any other occurrence of circumflex results in a syntax error.")
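
    A recognizer for the amended production might look like the Python
    sketch below. It assumes a circumflex is special only when it comes
    directly before a parenthesis, and the function name is mine.

```python
def is_balanced(s):
    """Check that parentheses balance, treating '^(' and '^)' as escaped.

    Any other circumflex is passed through as an ordinary character,
    which mirrors the permissive EBNF suggested above.
    """
    depth = 0
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == "^" and i + 1 < len(s) and s[i + 1] in "()":
            i += 2  # escaped parenthesis: does not affect nesting
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a ")" with no matching "("
        i += 1
    return depth == 0
```

    With the stricter VC wording, the "any other circumflex" case would
    become a syntax error instead of being skipped.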

Validity constraint: Non-XPointer schemes:
    As written, the use of any other scheme constitutes a syntax error.
    Does that mean the XPointer Rec will have to be amended whenever a
    future Recommendation wants to introduce a new scheme?
    Wouldn't it be easier to allow future schemes by indirection (to some
    standard list maintained by W3C)?

Validity constraint: Parenthesis escaping:
    You should probably note that circumflex is one of the URI-disallowed
    characters, so it will have to be converted to "%5E" if the XPointer
    appears in a URI reference.
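
    For example (a Python sketch; the XPointer part and the safe set are
    made up for illustration):

```python
from urllib.parse import quote

# Hypothetical XPtrPart containing an escaped, unbalanced parenthesis.
part = "string-range(//p, 'a^)b')"

# The safe set is an assumption for this sketch; the point is only that
# "^" is not in it, so the circumflex is percent-encoded.
in_uri = quote(part, safe="()/,'")
```

    This gives "string-range(//p,%20'a%5E)b')", so the escape sequence
    "^)" shows up in the URI reference as "%5E)".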

4.2.3 Child Sequences

2nd para:
    Change "where" to "assuming".
    In "an element that is the fifth child", change "an" to "the".
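
    The reason for "the": each integer step selects at most one child
    element. A Python sketch of that, using ElementTree and handling only
    the all-integer form such as "/1/2/5" (the function name is mine):

```python
import xml.etree.ElementTree as ET


def resolve_child_sequence(root, seq):
    """Follow a child sequence such as '/1/2/5' from the document element.

    Returns the single element located, or None if any step has no match.
    Only the all-integer form is handled in this sketch.
    """
    steps = [int(n) for n in seq.strip("/").split("/")]
    # The first step counts children of the document root, which has
    # exactly one element child: the document element.
    if steps[0] != 1:
        return None
    node = root
    for n in steps[1:]:
        children = list(node)  # child elements only, in document order
        if n < 1 or n > len(children):
            return None
        node = children[n - 1]
    return node


doc = ET.fromstring("<doc><a><x/><y/></a><b/></doc>")
located = resolve_child_sequence(doc, "/1/1/2")
```

    Here located is the <y> element: the second child of the first child
    of the document element.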

4.3 Schemes

1st para:
"an initial Scheme identifies the particular notation used for each
XPtrPart":
    Offhand, this sounds like a single Scheme governs multiple XPtrParts.
    Maybe change to:
        each XPtrPart begins with a Scheme that identifies
        the particular notation used for that XPtrPart

Note:
the XPointer errata document:
    This document doesn't exist yet. When it does, will it actually contain
    a possible solution, or are you just hopeful that it eventually might?

bullets (failure conditions):
    I think an application would check for the 3rd condition before the
    2nd, so I suggest you swap those two points.

para after bullets:
"the fragment located by the XPointer as a whole":
    This is the only place in the spec where "fragment" is used in this
    sense. (Everywhere else, it's in "fragment identifier".)  You should
    probably change it to "sub-resource" or "location-set".

-Michael Dyck

Received on Thursday, 7 September 2000 04:38:27 UTC