inclusion from Simon St.Laurent on 2000-05-25 (xml-uri@w3.org from May 2000)

From: Simon St.Laurent <simonstl@simonstl.com>
Date: Wed, 24 May 2000 22:45:48 -0400
To: xml-uri@w3.org
Message-Id: <200005250243.WAA30637@hesketh.net>

I keep coming up with ugly cases, and the complexity levels involved in
absolutizing relative URIs appear to be dazzling, though I'm not sure this
latest case is actually good for any side of the argument.

Inclusion, whether via external entities or via XInclude, raises some
painful questions about whether and how absolutization of namespace URIs
should take place.  (There are also questions about absolutization of
other URIs, which I'm not sure have been addressed, but which affect specs
still in development, not year-old recs.)

Let's take the XML 'document' below:
<?xml version="1.0" encoding="UTF-8"?>
<mydoc xmlns="zippy/"/>

We'll put it at the URL http://www.simonstl.com/pinhead/mydoc.xml.

At some point in processing, the namespace URI gets absolutized to:
http://www.simonstl.com/pinhead/zippy/

Fine, whatever, maybe it's useful to someone, somewhere.
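
As a minimal sketch of that resolution step, assuming Python's
urllib.parse.urljoin as a stand-in for whatever resolver a namespace-aware
processor might use (urljoin follows the later RFC 3986 rules, which should
agree with RFC 2396 for a simple case like this one). The constant names
are illustrative only:

```python
from urllib.parse import urljoin

# URL where mydoc.xml lives, taken from the message.
DOCUMENT_URL = "http://www.simonstl.com/pinhead/mydoc.xml"
# Relative namespace name from the xmlns attribute.
RELATIVE_NS = "zippy/"

# Resolve the relative reference against the document's own URL.
absolutized = urljoin(DOCUMENT_URL, RELATIVE_NS)
print(absolutized)  # http://www.simonstl.com/pinhead/zippy/
```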

Next we create a document:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE mydocs [
<!ENTITY mydoc SYSTEM "http://www.simonstl.com/pinhead/mydoc.xml">
]>
<mydocs>
&mydoc;
</mydocs>

We put that document at the URL http://www.simonstl.com/docs/mydocs.xml

After inclusion, we've got:
<?xml version="1.0" encoding="UTF-8"?>
<mydocs>
<mydoc xmlns="zippy/"/>
</mydocs>

So what's the 'absolutized' namespace for the mydoc element now?  Is the
base URI for the mydoc element mysteriously retained, giving us:
http://www.simonstl.com/pinhead/zippy/

Or is it lost, and the document's base URI used:
http://www.simonstl.com/docs/zippy/
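
The two candidate answers can be reproduced side by side. This is only a
sketch: candidate_namespaces is a hypothetical helper, not part of any
parser API, and it takes no position on which base a processor should
actually use.

```python
from urllib.parse import urljoin


def candidate_namespaces(relative_ns, entity_url, including_url):
    """Resolve one relative namespace name against both possible bases."""
    return {
        # Base URI of the included entity is retained.
        "retained": urljoin(entity_url, relative_ns),
        # Base URI of the including document is used instead.
        "lost": urljoin(including_url, relative_ns),
    }


candidates = candidate_namespaces(
    "zippy/",
    "http://www.simonstl.com/pinhead/mydoc.xml",
    "http://www.simonstl.com/docs/mydocs.xml",
)
for label, uri in candidates.items():
    print(label, uri)
# retained http://www.simonstl.com/pinhead/zippy/
# lost http://www.simonstl.com/docs/zippy/
```

The same element ends up in two different namespaces depending only on
which base the processor happens to keep.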

Similarly, what happens in the document below?
<?xml version="1.0" encoding="UTF-8"?>
<mydocs xmlns:xinclude="..."><!--ns not defined in spec yet-->
<xinclude:include href="http://www.simonstl.com/pinhead/mydoc.xml"/>
</mydocs>

Here the XInclude even provides the base URI explicitly inside of the
document.

mydoc.xml is the same in every case, but the usage is different, and (I
think) the results are undefined.  It's not clear how the rules for
determining a base URI in section 5.1 of RFC 2396 [1] apply to these cases,
at least on my reading.

5.1.2 - Base URI from the Encapsulating Entity - seems most likely, but I'd
still like a strong clarification that such behavior is in fact appropriate
to namespace handling.  It seems more appropriate to determining where to
retrieve an image from than describing a vocabulary or even a 'language'.
For such cases, 5.1.3 seems more appropriate, or even fallback to the
non-answer of 5.1.4.
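
The precedence in section 5.1 can be sketched as a simple first-match
lookup. Everything here is an assumption made for illustration: choose_base
and BaseChoice are hypothetical names, and which inputs a processor
supplies for an included element is exactly the open question.

```python
from typing import NamedTuple, Optional
from urllib.parse import urljoin


class BaseChoice(NamedTuple):
    rule: str  # RFC 2396 subsection that supplied the base
    base: str  # the base URI itself


def choose_base(embedded: Optional[str] = None,
                encapsulating: Optional[str] = None,
                retrieval: Optional[str] = None,
                default: Optional[str] = None) -> Optional[BaseChoice]:
    """Return the first available base URI in RFC 2396 section 5.1 order."""
    ordered = [
        ("5.1.1 base URI within document content", embedded),
        ("5.1.2 base URI from the encapsulating entity", encapsulating),
        ("5.1.3 base URI from the retrieval URI", retrieval),
        ("5.1.4 default base URI", default),
    ]
    for rule, base in ordered:
        if base is not None:
            return BaseChoice(rule, base)
    return None


INCLUDING = "http://www.simonstl.com/docs/mydocs.xml"
INCLUDED = "http://www.simonstl.com/pinhead/mydoc.xml"

# Assumption A: the processor treats mydocs.xml as the encapsulating entity.
a = choose_base(encapsulating=INCLUDING, retrieval=INCLUDED)
# Assumption B: the processor only knows where mydoc.xml was retrieved from.
b = choose_base(retrieval=INCLUDED)

for choice in (a, b):
    print(choice.rule, "->", urljoin(choice.base, "zippy/"))
```

Under assumption A the namespace resolves under /docs/, under assumption B
under /pinhead/, so the outcome depends on how the processor models
inclusion rather than on anything in mydoc.xml itself.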

[1] http://www.ietf.org/rfc/rfc2396.txt



Simon St.Laurent
XML Elements of Style / XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Cookies / Sharing Bandwidth
http://www.simonstl.com
Received on Wednesday, 24 May 2000 22:43:51 UTC