RE: XPointer considered incomprehensible from Jonathan Marsh on 2006-09-04 (www-xml-linking-comments@w3.org from July to September 2006)

From: Jonathan Marsh <jmarsh@microsoft.com>
Date: Mon, 4 Sep 2006 15:19:10 -0700
To: "Bjoern Hoehrmann" <derhoermi@gmx.net>, <www-xml-linking-comments@w3.org>, <www-tag@w3.org>
Message-ID: <37D0366A39A9044286B2783EB4C3C4E803DC8B83@RED-MSG-10.redmond.corp.microsoft.com>
Interesting post.  I helped develop XPointer, so I've included some comments below defending the framework.

Summary:
- XPointer Framework isn't fundamentally broken.
- Despite appearances, the SVG fragment syntax does not appear to 
  be a conforming XPointer application.
- There are clearly some problems with the XPointer registry.

> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of
> Bjoern Hoehrmann
> Sent: September 1, 2006 11:52 AM
> To: www-xml-linking-comments@w3.org
> Subject: XPointer considered incomprehensible
> 
> 
> Dear XML Core Working Group,
> 
>   I read the three XPointer Recommendations and the XPointer Registry
> policy and was unable to make much sense out of them. The documents are:
> 
>   [1] http://www.w3.org/TR/2003/REC-xptr-framework-20030325/
>   [2] http://www.w3.org/TR/2003/REC-xptr-element-20030325/
>   [3] http://www.w3.org/TR/2003/REC-xptr-xmlns-20030325/
>   [4] http://www.w3.org/2005/04/xpointer-policy.html
> 
> The first problem is that the XPointer Framework relies heavily on the
> concept of "subresources", or more specifically, XML subresource con-
> tained within XML resources. None of the documents makes any attempt to
> define this term, and the term is used in contradictory ways; an example
> is the following contradiction.

The XPointer Framework is intentionally vague about what constitutes a subresource, in order to delegate that determination to each scheme.  For barenames, there is an implication that the subresource would be an element information item, as that is the only thing identified by such a pointer.  I suppose it wouldn't hurt to spell this out more clearly.

I also agree that the element() scheme doesn't explicitly define the term "subresource" either, though again the implication is that in this context a subresource consists of zero or one element information items.
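
As a rough illustration of the "zero or one element" reading, here is a minimal Python sketch of element() scheme resolution.  It is not taken from the spec: the function name resolve_element_scheme is hypothetical, the scheme data is not validated, and ID lookup is approximated by matching an "xml:id" or plain "id" attribute, since ElementTree has no DTD-based notion of ID-ness.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def resolve_element_scheme(root, scheme_data):
    """Return the one element identified by element(scheme_data), or None.

    Hypothetical helper: handles 'name', 'name/1/2' and '/1/2' forms only.
    """
    name, _, rest = scheme_data.partition("/")
    steps = [int(s) for s in rest.split("/")] if rest else []
    if name:
        # Assumption: an element "has ID name" if xml:id or id equals name.
        current = next(
            (e for e in root.iter()
             if e.get(XML_ID) == name or e.get("id") == name),
            None,
        )
    else:
        # A bare child sequence starts at the document element, step "/1".
        if not steps or steps[0] != 1:
            return None
        current, steps = root, steps[1:]
    for n in steps:
        if current is None:
            return None
        children = list(current)  # child elements in document order
        current = children[n - 1] if 1 <= n <= len(children) else None
    return current

doc = ET.fromstring('<a><b id="foo"><c/><d/></b><e/></a>')
hit = resolve_element_scheme(doc, "foo/2")      # second child of "foo"
miss = resolve_element_scheme(doc, "missing")   # no such ID
```

In this sketch a failed lookup simply yields None, i.e. no subresource for that pointer part, rather than raising an error.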

> The XPointer Framework specifies that "If no pointer part identifies
> subresources, it is an error"; in contrast, for the element() scheme it
> is specified that "... except that failure to identify an element
> results simply in no subresource being identified by this pointer part
> rather than an XPointer Framework error." There is no provision in the
> Framework specification that individual schemes can override core
> aspects of the Framework, which implies that one of the specifications
> is in error. It is unclear to me which specification.

That does seem a little funky, though clearly defining subresource wouldn't necessarily de-funkify it.  I can't recall at this point why we put that in, but it was IIRC intentional.

> I note that there are registered XPointer schemes that cannot reasonably
> be considered to identify XML subresources in XML resources. An example
> is the svgView scheme originally defined in the SVG 1.1 Recommendation.
> It is entirely unclear how such a scheme fits into the XPointer Frame-
> work, the very definition of "XPointer processor" is "A software
> component that identifies subresources of an XML resource by applying a
> pointer to it."  An implementation of only the 'svgView' scheme does not
> do that in any way, which implies it is not an XPointer processor. What,
> then, is it, and why is 'svgView' still an XPointer scheme?

I won't speak for SVG, but I do defend the right of scheme creators to define "subresource" in their own context.  One need not be limited to XML information items; one can also define other things, such as ranges (xpointer scheme) or abstract components (wsdl.* schemes).

> We can take this one step further and construct the following resource
> identifier:
> 
>   http://example.org/example.svg#svgView(scale(0.5))element(foo)
> 
> Considering a software module that is both an SVG Viewer and a XPointer
> processor, it is unclear to me how this resource identifier is to be
> processed, and which specification would be responsible to define this.

Fragment syntaxes need to be evaluated in the context of a media type.  In looking more closely, I see that the definition of SVG fragment identifiers doesn't seem to allow for extended or multiple XPointer schemes.  Your example would not AFAICT be a valid fragment identifier for image/svg+xml.  Despite the syntactical similarity, SVG fragment identifiers apparently don't conform to the XPointer Framework.  They've profiled away the extensibility mechanism, which is 90% of the framework.  So while svgView could be used legally as an XPointer scheme, it apparently can't be considered such within the context of image/svg+xml.  There doesn't seem to be any claim that the fragment syntax is XPointer compatible, so the wrong impression must simply come from the registration as an XPointer scheme, where its utility within XPointer appears to be minimal.

> According to the XPointer Framework, the svgView scheme must be ignored

The spec doesn't say that.  It says "If the XPointer processor does not support the scheme used in a pointer part, it skips that pointer part."

If your example were to be associated with the application/xml media type, the svgView scheme is most likely (but not necessarily!) not supported, and the processor will therefore skip that part.
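
The skipping behaviour can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: split_pointer_parts, evaluate, XPointerError and the handlers mapping are hypothetical names, the parser handles balanced parentheses only (no circumflex escaping), and a handler is assumed to return a list of identified subresources, empty if there are none.

```python
class XPointerError(Exception):
    """Raised when no pointer part identifies any subresource."""

def split_pointer_parts(pointer):
    """Split 'a(x)b(y(z))' into [('a', 'x'), ('b', 'y(z)')]."""
    parts, i, n = [], 0, len(pointer)
    while i < n:
        if pointer[i].isspace():        # whitespace between parts
            i += 1
            continue
        open_idx = pointer.find("(", i)
        if open_idx == -1:
            raise ValueError("not a scheme-based pointer")
        scheme = pointer[i:open_idx]
        depth, j = 1, open_idx + 1
        while j < n and depth:          # find the matching ')'
            if pointer[j] == "(":
                depth += 1
            elif pointer[j] == ")":
                depth -= 1
            j += 1
        if depth:
            raise ValueError("unbalanced parentheses")
        parts.append((scheme, pointer[open_idx + 1:j - 1]))
        i = j
    return parts

def evaluate(pointer, handlers):
    """Evaluate parts left to right; skip schemes with no handler."""
    for scheme, data in split_pointer_parts(pointer):
        handler = handlers.get(scheme)
        if handler is None:
            continue                    # unsupported scheme: skip the part
        result = handler(data)
        if result:                      # first part that identifies something
            return result
    raise XPointerError("no pointer part identified subresources")

# A processor that supports only element(); the handler is a stand-in.
handlers = {"element": lambda data: [data]}
found = evaluate("svgView(scale(0.5))element(foo)", handlers)
```

With these stand-in handlers, the svgView part is skipped and the element(foo) part supplies the result; a pointer consisting of svgView(scale(0.5)) alone would end in XPointerError.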

> and the fragment identifier identifies the element with id 'foo'. Like-
> wise, if we have
> 
>   http://example.org/example.svg#svgView(scale(0.5))
> 
> The svgView part must be ignored, and as no subresource is identified,
> the XPointer processor will report an error. It therefore appears to be
> impossible to construct a software module that conforms to both XPointer
> and SVG, making these two specifications genuinely incompatible.

As I said, I don't think SVG conforms to XPointer, but that appears to be a choice/bug in svgView rather than a fundamental problem with XPointer.  That they have muddied the waters by reusing a similar (but subtly different) syntax and registering the svgView scheme as if it were likely to be used in an XPointer Framework-compatible way is certainly unfortunate.

> The next problem is that it is, as seen above, entirely unclear what re-
> quirements new XPointer schemes must meet. The implied requirement, that
> the scheme identifies "subresources", is obviously ridiculous,

Actually, not all schemes must identify subresources.  The xmlns() scheme is an example of an administrative scheme that never identifies a subresource.  I don't think there is an implied requirement, and I don't think use of the term "subresource" as a means to delegate the kind of thing being identified to a scheme specification is at all broken.
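
A small Python sketch of what "administrative" means here: an xmlns() part only adds a prefix binding for later pointer parts to use and never yields a subresource itself.  The function name apply_xmlns_part and the bindings dictionary are hypothetical, and ElementTree's limited path language stands in for whatever scheme would consume the bindings.

```python
import xml.etree.ElementTree as ET

def apply_xmlns_part(scheme_data, bindings):
    """Record one 'prefix=URI' binding; never identifies a subresource."""
    prefix, sep, uri = scheme_data.partition("=")
    if sep and prefix.strip():
        bindings[prefix.strip()] = uri.strip()
    return []  # always empty: nothing is identified by this part

bindings = {}
identified = apply_xmlns_part("x=urn:example", bindings)

# A later part can resolve the prefix "x" through the recorded binding,
# regardless of which prefix the document itself uses.
doc = ET.fromstring('<r xmlns:p="urn:example"><p:y/><y/></r>')
matches = doc.findall(".//x:y", namespaces=bindings)
```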

> and there
> is only one explicit requirement in the Framework specification, namely
> "The documentation for every scheme must specify whether it uses the
> namespace binding context." The registration policy does not cite this
> requirement and schemes that fail to meet this requirement have been
> registered in the past, it then appears that objections to proposed new
> schemes can only cite non-technical arguments, making the whole review
> process a rather silly undertaking.

There is a difference between a legal XPointer scheme and a media type registration that builds on the XPointer framework.  The case of svgView shows that it would be nice to confirm that a scheme is actually useful as an XPointer before registering it.  The problem lies as much in inadequate review of SVG as in reviewing the svgView scheme in isolation.

> Silly in particular because this will inevitably cause a situation where
> important names are taken for schemes that lack adequate specifications,
> indeed, in addition to the problem mentioned above, a broad range of
> schemes with only unstable draft documentation has already been
> registered.

Yes, some are way underspecified.  I'm not sure the Registry is about making sure each scheme is of high quality; it's simply about avoiding name conflicts.  Still, it would be nice to have a requirement that there is a link to a spec (otherwise, reserving the name serves no purpose, does it?).

I also see all the WSDL ones (which my own WG is responsible for) have been marked as registered despite the policy stating that the corresponding spec needs to be in PR first (which it's not).

> I note that the lack of requirements include syntactic requirements, it
> is unclear what to make of the specification in the light of the 1.1
> versions of XML and Namespaces in XML, combined with the lack of error
> handling requirements for syntax errors, and the complete lack of up
> to date discussion on how to name new schemes.

Personally, I think XML 1.1's impact on other specs is its problem, not the specs', so I won't address that one.  I can't think of meaningful universal requirements for syntax errors, so you'll have to provide more detail on that one before I comment.  And I'm personally glad that there are no limits (besides those imposed by the framework) on how to choose a name; what possible purpose could that serve?  I don't tell you how to name new XML elements, schema types, etc.

> The next problem is a core claim of the framework that it can be applied
> to external parsed entities. The framework specification fails to make
> any reasonable effort in defining how this is supposed to work, it says,
> as an aside,
> 
>   Note that if the XML resource is not a document but rather an external
>   parsed entity, this property will not be reported. Rather, the
>   information set is effectively extended to report the one or more
>   top-level elements in the entity as ordered "root element" properties
>   for the entity.
> 
> It is entirely unclear what to make of this; considering an external
> parsed entity like http://example.net/example.ent
> 
>   <?xml version="1.0"?>
>   Hello
>   <foo:bar/>
>   World
>   <foo:bar/>
>   !
>   <!-- EOF -->
> 
> There is no way to know what the "extended Infoset" might look like, the
> XML Infoset Recommendation does not apply to external parsed entities
> and the informal text cited above does not say anything about handling
> the lack of namespace declarations, white space, or how to construct the
> 'children' property of the Document Information Item. I even fail to see
> how the addition of '"root element" properties' makes any sense here.
>
> If we now consider http://example.net/example.ent#xpath2(...) where the
> entire specification of the xpath2 scheme is given as
> 
>   Locates a node or node set within an XML Information Set. The
>   single argument is an XPath path as defined in the W3C XPath 2
>   Recommendation. The node or node set resulting from evaluating
>   the XPath is the reference. Note that in some contexts it is
>   an error if a node set (rather than a single node) is returned.
> 
> I do not understand the first sentence for I do not know whether it
> refers to the Infoset as defined in the Infoset specification, or the
> incomprehensibly extended Infoset as explained above. In case of the
> latter, I have no idea how to construct the XPath data model for the
> example external parsed entity given above. The second sentence does
> not make sense either, there is no W3C XPath 2 Recommendation, and I
> am not sure whether this refers to a PathExpr or a Expr, or something
> else, as defined in the XPath 2.0 CR. The last sentence is gibberish.
> Worse, there is no way of knowing how to process e.g.
> 
>   http://example.net/example.ent#xpath2(document(...)/*)
> 
> or
> 
>   http://example.net/example.ent#xpath2(fn:doc(...)/*)
> 
> or even
> 
>   http://example.net/example.ent#xmlns(x=urn:example)xpath2(//x:y)
> 
> because, contrary to the requirements for pointer documentations,
> there is no discussion of the namespace binding context, or in case
> of the first example, any discussion of the exact processing model.

Making a link to the specification optional seems like a huge mistake for a registry!

> There are several more issues here, but considering how the XML Core
> Working Group usually handles my comments it does not make much sense
> to write those up at this point. The XPointer specifications and the
> registry are too broken to derive any practical benefit from them
> anyway.
>
> regards,
> --
> Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
> Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
> 68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
>
Received on Monday, 4 September 2006 22:19:54 UTC