Documents without a base URI from Simon St.Laurent on 2000-05-25 (xml-uri@w3.org from May 2000)

From: Simon St.Laurent <simonstl@simonstl.com>
Date: Wed, 24 May 2000 21:10:01 -0400
To: "xml-uri@w3.org" <xml-uri@w3.org>
Message-Id: <200005250108.VAA24507@hesketh.net>

At 04:44 PM 5/24/00 -0400, John Cowan wrote:
>Most documents *do* have base URIs, unless they arrive at the parser using
>a raw TCP socket, or on the standard input, or something like that.
>Documents which don't have base URIs can't usefully contain relative
>URI references of any sort, not just relative namespace names.

This issue grows more troubling the more I think about it.

While I've argued earlier that higher levels should have the same
information about the base URI as lower levels (and perhaps more), I'm not
sure what to do in cases where that information is simply unavailable or
nonexistent.

How do you absolutize a relative URI when there is no information about a
base URI?

I'm not sure that these cases "can't usefully contain... relative namespace
names" - in fact, I'm quite sure that they could.  

RFC 2396, 5.2 notes that:
>   only the scheme component is required to be present in the base
>   URI; the other components may be empty or undefined.  A component is
>   undefined if its preceding separator does not appear in the URI
>   reference; the path component is never undefined, though it may be
>   empty.

With no base URI, there will be no scheme component, and the lack of a path
component seems to lead to unpredictable results at best.
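
As a rough illustration of the problem, here is a minimal Python sketch. It
assumes the standard library's urllib.parse.urljoin as the resolver (it
follows the later RFC 3986 rules rather than RFC 2396 exactly), and the
absolutize helper and the example.org URIs are hypothetical names made up
for this example. With an empty base, urljoin simply hands the reference
back unchanged, so no absolute URI ever appears.

```python
from urllib.parse import urljoin, urlsplit


def absolutize(reference, base=None):
    """Resolve `reference` against `base`.

    Returns None when no absolute URI can be formed, i.e. when the
    result still has no scheme component. (Hypothetical helper, not
    part of any namespace specification.)
    """
    # urljoin returns the reference unchanged when the base is empty.
    resolved = urljoin(base or "", reference)
    if not urlsplit(resolved).scheme:
        return None
    return resolved


# A document fetched from a known location: resolution works.
print(absolutize("ns/foo", "http://example.org/docs/a.xml"))
# → http://example.org/docs/ns/foo

# A document read from standard input or a raw socket: nothing to resolve against.
print(absolutize("ns/foo"))
# → None
```

Returning None here is only one possible design choice; a processor could
just as well raise an error or fall back to the literal string.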

Yes, it's a borderline case, and one that I'm not sure exists today, but
it's one that seems quite plausible.  It also seems to be a case where
treating URI references as strings for purposes of namespace comparisons
is legitimate but demanding absolutization is dangerous.
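
To make the contrast concrete, here is a small Python sketch of the two
comparison strategies. It is only an illustration under stated assumptions:
urllib.parse.urljoin stands in for whatever resolver a processor might use,
and the function names and the a.example / b.example base URIs are
hypothetical.

```python
from urllib.parse import urljoin


def same_namespace_literal(name_a, name_b):
    """Compare namespace names character for character, as plain strings."""
    return name_a == name_b


def same_namespace_absolutized(name_a, base_a, name_b, base_b):
    """Compare namespace names after resolving each against its own base URI."""
    return urljoin(base_a, name_a) == urljoin(base_b, name_b)


# The same relative namespace name used in two documents.
name = "ns/foo"

# String comparison needs no base URI at all.
print(same_namespace_literal(name, name))
# → True

# Absolutized comparison depends entirely on where each document came from.
print(same_namespace_absolutized(
    name, "http://a.example/x/doc.xml",
    name, "http://b.example/y/doc.xml",
))
# → False
```

The literal comparison gives an answer even for a document that arrived with
no base URI, while the absolutized one needs a base for each side before it
can say anything meaningful.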

Simon St.Laurent
XML Elements of Style / XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Cookies / Sharing Bandwidth
http://www.simonstl.com

Received on Wednesday, 24 May 2000 21:08:07 UTC