RE: XML Base and XPath absolutizing of URIs from Paul Grosso on 2000-06-09 (www-xml-linking-comments@w3.org from April to June 2000)

From: Paul Grosso <pgrosso@arbortext.com>
Date: Thu, 08 Jun 2000 19:08:28 -0500
To: "John Boyer" <jboyer@PureEdge.com>, "Joseph M. Reagle Jr." <reagle@w3.org>
Cc: "XML DSig" <w3c-ietf-xmldsig@w3.org>, <elm@east.sun.com>, <www-xpath-comments@w3.org>, <www-xml-linking-comments@w3.org>, <Daniel.Veillard@w3.org>, <connolly@w3.org>
Message-Id: <3.0.32.20000608190752.00ff1a08@pophost.arbortext.com>

[I removed xml-uri from the distribution.]

At 16:33 2000 06 08 -0700, John Boyer wrote:
>Yes, absolutely no problem with XBase. ...

>As for whether XPath defines a method for specifying a base URI, it does
>not.  

You are right--I misread you to say that XPath doesn't specify
an absolutization algorithm, and I suggested it does by reference
to RFC 2396.  By the same reference, XPath also assumes the RFC 2396
ways of determining the base URI, but you are correct that it does
not specify any way to do so via the document content (per section
"5.1.1. Base URI within Document Content" of RFC 2396).

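To make the "given a base URI" step concrete, here is a minimal sketch of
relative-to-absolute resolution.  It assumes Python's urllib.parse.urljoin
as the resolver (it follows the later RFC 3986 rules, so edge cases may
differ from RFC 2396), and the example.org URIs are made up for
illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URI, e.g. the URI the document was retrieved from.
BASE_URI = "http://example.org/docs/spec.xml"

def absolutize(base_uri: str, reference: str) -> str:
    """Resolve a URI reference against a known base URI."""
    return urljoin(base_uri, reference)

# A relative reference is resolved against the base.
relative_result = absolutize(BASE_URI, "ns/foo")

# Dot segments are resolved against the base path.
parent_result = absolutize(BASE_URI, "../ns")

# An already absolute reference is returned unchanged.
absolute_result = absolutize(BASE_URI, "http://other.example/ns")
```

Note that the function cannot do anything until someone hands it the
base URI; where that value comes from is a separate question.
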
>[XPath] says that a namespace declaration can be a URI reference, and
>that URI-references are defined by RFC2396.  The conversion from relative to
>absolute URIs is claimed to occur during namespace processing. The
>namespaces spec does not define this!  

You are right, it would be the Infoset that specifies this, and
the Infoset is stuck right now.

>Moreover, the problem with claiming
>that RFC 2396 defines how to do this is that RFC2396 only describes the
>rules for establishing a base URL for a document and how to convert from
>relative to absolute URI *given a base URL* (sections 5.1 and 5.2
>respectively).  There is nothing to say how an Xpath evaluation is supposed
>to receive the base URL.

I'm not sure what it means for XPath "to receive the base URL".  
XPath works on a data model that was described within XPath
only because the Infoset wasn't yet ready, but it was supposed
to match that of the Infoset.  The right thing to happen is for
the absolutized URI to be in the infoset and for XPath to work
off the infoset.  Then XPath doesn't need to concern itself
with this issue at all.

>Put another way, consider the following quote from the XPath Recommendation:
>
>"Expression evaluation occurs with respect to a context. XSLT and XPointer
>specify how the context is determined for XPath expressions used in XSLT and
>XPointer respectively. The context consists of:
>
>a node (the context node)
>a pair of non-zero positive integers (the context position and the context
>size)
>a set of variable bindings
>a function library
>the set of namespace declarations in scope for the expression"
>
>Where is the base URL in this input specification?

Nowhere.  It shouldn't be.  Rather, "the set of namespace declarations 
in scope for the expression" should all be in already absolutized form.
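
As a rough sketch of that context (the class and field names are my own
invention, not anything the XPath Recommendation defines), the five
components quoted above can be modeled like this, with no slot for a
base URI:

```python
from dataclasses import dataclass, field, fields
from typing import Any, Callable, Dict

@dataclass
class XPathContext:
    """Hypothetical model of the evaluation context listed in the quote."""
    context_node: Any                     # the context node
    context_position: int                 # positive integer
    context_size: int                     # positive integer
    variables: Dict[str, Any] = field(default_factory=dict)
    functions: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    # prefix -> namespace URI, assumed to be already absolutized upstream
    namespaces: Dict[str, str] = field(default_factory=dict)

ctx = XPathContext(
    context_node=None,
    context_position=1,
    context_size=1,
    namespaces={"a": "http://example.org/docs/ns/foo"},
)

# The names of everything the evaluator is given; no base URI among them.
component_names = [f.name for f in fields(ctx)]
```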

>The only thing I can think of is that software external to an XPath
>implementation must know the base URL using the rules established by RFC
>2396.  Further, since there is no way to communicate the base URL to XPath,
>the external software must apply the relative-to-absolute conversion rules
>defined in RFC2396 to the data structures it creates in support of setting
>up the context node.  Therefore, by the time you get to code that is
>actually part of the XPath implementation, the namespace absolutization has
>already been done by the external code, and the XPath implementation just
>treats them like strings.

Precisely.
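
A minimal sketch of that division of labor, assuming a parser layer that
knows the base URI and an evaluator that does not.  Both function names
are hypothetical, urljoin stands in for the RFC 2396 resolution rules,
and the URIs are made up:

```python
from urllib.parse import urljoin
from typing import Dict

def absolutize_namespaces(base_uri: str,
                          declarations: Dict[str, str]) -> Dict[str, str]:
    """Parser-layer step (hypothetical): resolve every namespace URI
    reference against the base URI before the evaluator sees it."""
    return {prefix: urljoin(base_uri, uri)
            for prefix, uri in declarations.items()}

def same_namespace(ns_a: str, ns_b: str) -> bool:
    """Evaluator-side check (hypothetical): namespace URIs are compared
    as plain strings, with no knowledge of any base URI."""
    return ns_a == ns_b

# One declaration is relative, the other is the equivalent absolute URI.
in_scope = absolutize_namespaces(
    "http://example.org/docs/spec.xml",
    {"a": "ns/foo", "b": "http://example.org/docs/ns/foo"},
)

match = same_namespace(in_scope["a"], in_scope["b"])
```

Without the first step the evaluator would be comparing "ns/foo" to the
absolute form and would see two different strings.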

>Conclusion: Since there is no way defined by the XPath spec to provide the
>base URL as part of the initial evaluation context, there is no way for the
>XPath evaluation to enforce absolute URIs.  They're just strings to the
>XPath evaluator.  

Correct.

>Thus, the external, application-dependent code that must
>absolutize can also choose not to do it.  

Huh?  It's not application-dependent code; it would be the
underlying parser layer that generates the infoset.  That layer
can't choose how to do it; it has to do it however we decide
it gets done.

>Since XPath is in violation of the
>namespaces spec anyway for trying to absolutize URIs, the feature should be
>removed by an erratum.  

This is the issue up for discussion on xml-uri.  What XPath is doing
wrong is assuming that it has anything to do with absolutization 
instead of just relying on what's in the Infoset.  (But since XPath
was written before the Infoset, this isn't surprising.)

>Alternately, XPath could be modified by an erratum
>to indicate either that the base URL is provided by XBase or as an
>additional component of the evaluation context.

XPath should never need the base URI.

>One way or the other, something about XPath needs to be changed.

Once this namespace question is resolved, the Infoset can be
completed, and then XPath should probably be rewritten in terms
of the infoset.

paul

Received on Thursday, 8 June 2000 20:09:07 UTC