Re: Changing base URIs

From: Norman Walsh <ndw@nwalsh.com>
Date: Thu, 10 Apr 2008 09:44:04 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <m2prsxdduj.fsf@nwalsh.com>
/ Richard Tobin <richard@inf.ed.ac.uk> was heard to say:
|> I've been thinking about the extension functions proposal though, and
|> I don't see how it can work. All of our variables are strings. What
|> arguments does the function take?
|
| I don't think this is a problem: the argument is a node, presumably
| coming down a pipe to p:variable or p:with-option.  Typically it
| will be / (or ., or omitted) to get the base uri of the document:

Yes, that was a brain cramp on my part. Never mind.

|   <p:with-option name="base" select="p:base-uri(/)">
|     <p:pipe .../>
|   </p:with-option>
|
| The first couple of functions are straightforward:
|
|   p:base-uri(node n) or p:base-uri()
|     returns (as a string) the base URI of its argument n, or of the
|     context node if no argument is given.  It is an error [what kind?]
|     if the argument is not a node.
|
| We probably need to say what the base URI is of the "empty document
| nodes" that are the context node in some circumstances.  And is it
| possible for an implementation to be processing a document for which
| it does not have a base URI?  (RFC 2396 and presumably its successors
| make the base URI application dependent in some circumstances.)
|
|   p:resolve-uri(string rel, string base)
|   p:resolve-uri(string rel)
|     returns (as a string) the result of resolving the URI reference
|     argument rel against the absolute URI reference argument base.
|     If only one argument is given, it is resolved against the
|     base URI of the context node.
|     It is an error [what kind?] if either argument is not a valid
|     URI reference, or if base is not absolute, or an error occurs
|     in the resolution process.
|
| These functions correspond to XPath 2's similarly named functions.
|
| I suggested the use of a relativize function to obtain the final
| path component of a URI, but this only works if you have the directory
| part, and I'm not sure why I was thinking that would be easy.
| (Note that the base URI of http://example.com/foo/bar.html is itself,
| not http://example.com/foo/)
|
| Does anyone know of an existing library that provides a function
| to get the last component?  In any case, it seems as if it might
| be dangerous to use it indiscriminately - what if the URL has an
| empty last component?

Java's URI class includes a 'relativize' method:

  http://java.sun.com/j2se/1.5.0/docs/api/java/net/URI.html#relativize(java.net.URI)

That might be sufficient. If you know the base URI, you can strip it
out. If you don't know it, or if you're wrong, then you get the whole
URI.
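
As a rough sketch of what I mean, using only java.net.URI: the class
and helper names (RelativizeSketch, stripBase, directoryOf) are made
up for illustration and are not part of any proposal. It assumes the
"directory part" can be found by resolving "." against the document
URI, which sidesteps the problem that the base URI of a document is
the document itself.

```java
import java.net.URI;

public class RelativizeSketch {
    // Hypothetical helper: strip an assumed base from a URI.
    // If the base does not match, relativize() hands back the
    // URI unchanged.
    static String stripBase(String base, String full) {
        return URI.create(base).relativize(URI.create(full)).toString();
    }

    // Hypothetical helper: obtain the directory part by resolving
    // "." against the document URI.
    static String directoryOf(String docUri) {
        return URI.create(docUri).resolve(".").toString();
    }

    public static void main(String[] args) {
        String doc = "http://example.com/foo/bar.html";

        // Base guessed correctly: only the last component is left.
        System.out.println(stripBase(directoryOf(doc), doc));

        // Base guessed wrongly: the whole URI comes back.
        System.out.println(stripBase("http://example.org/other/", doc));
    }
}
```

This doesn't answer Richard's worry about an empty last component: a
URI that ends in "/" would presumably leave nothing after the base is
stripped, so a caller would still have to check for that.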

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Birds are taken with pipes that imitate
http://nwalsh.com/            | their own voices, and men with those
                              | sayings that are most agreeable to
                              | their own opinions.--Samuel Butler

Received on Thursday, 10 April 2008 13:45:14 GMT