RE: inclusion

From: Jonathan Marsh <jmarsh@microsoft.com>
Date: Thu, 25 May 2000 07:44:10 -0700
Message-ID: <116DFD732FA92E4D9B647C8EEF6DAF1015E201@red-pt-02.redmond.corp.microsoft.com>
To: "'Simon St.Laurent'" <simonstl@simonstl.com>, xml-uri@w3.org
> -----Original Message-----
> From: Simon St.Laurent [mailto:simonstl@simonstl.com]

> Let's take the XML 'document' below:
> <?xml encoding="UTF-8"?>
> <mydoc xmlns="zippy/">
> 
> We'll put it at the URL http://www.simonstl.com/pinhead/mydoc.xml.

> Next we create a document:
> <?xml encoding="UTF-8"?>
> <!DOCTYPE mydocs [
> <!ENTITY mydoc SYSTEM "http://www.simonstl.com/pinhead/mydoc.xml">
> ]>
> <mydocs>
> &mydoc;
> </mydocs>
> 
> We put that document at the URL 
> http://www.simonstl.com/docs/mydocs.xml
> 
> After inclusion, we've got:
> <?xml encoding="UTF-8"?>
> <mydocs>
> <mydoc xmlns="zippy/">
> </mydocs>
> 
> So what's the 'absolutized' namespace for the mydoc element 
> now?  Is the
> base URI for the mydoc element mysteriously retained, giving us:
> http://www.simonstl.com/pinhead/zippy/

Absolutizing presumably would use the base URI property of the "mydoc"
element information item.  The infoset draft says that the URI of the
external entity is used, so yes, http://www.simonstl.com/pinhead/zippy/ is
the URI.

> Or is it lost, and the document's base URI used:
> http://www.simonstl.com/docs/zippy/

Not according to the infoset.
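
As a rough illustration of the two candidate answers, here is a minimal
sketch using Python's standard urllib.parse.urljoin, assuming that
"absolutizing" means ordinary relative-reference resolution. The variable
names are mine, not taken from any draft:

```python
from urllib.parse import urljoin

# URIs taken from the example above.
entity_base = "http://www.simonstl.com/pinhead/mydoc.xml"   # external entity
document_base = "http://www.simonstl.com/docs/mydocs.xml"   # including document

namespace_name = "zippy/"

# Base URI retained from the external entity (the infoset reading).
retained = urljoin(entity_base, namespace_name)

# Base URI lost, so the including document's URI is used instead.
lost = urljoin(document_base, namespace_name)

print(retained)
print(lost)
```

The first result corresponds to the infoset reading; the second is what you
would get if the entity's base URI were discarded.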

> Similarly, what happens in the document below?
> <?xml encoding="UTF-8"?>
> <mydocs xmlns:xinclude="..."><!--ns not defined in spec yet-->
> <xinclude:include href="http://www.simonstl.com/pinhead/mydoc.xml"/>
> </mydocs>

XInclude currently retains the base URI property of included items, so the
absolutized namespace also remains at
http://www.simonstl.com/pinhead/zippy/.

> mydoc.xml is the same in every case, but the usage is 
> different, and (I
> think) the results undefined.

I think it's pretty well defined by the infoset and XInclude drafts
respectively, and the result is consistent whether the URI is used for
retrieval or (if we decide to go that way) absolutized for identity purposes.

However, there is a similar case involving XML Base which bypasses the
special treatment of external entities and illustrates the danger:

  <?xml encoding="UTF-8"?>
  <!DOCTYPE mydocs [
    <!ENTITY mydoc '<mydoc xmlns="zippy/">'>
  ]>
  <mydocs>
    <docgroup xml:base="foo/">&mydoc;</docgroup>
    <docgroup xml:base="bar/">&mydoc;</docgroup>
  </mydocs>

The namespace of the first mydoc element is
http://www.simonstl.com/pinhead/foo/zippy/, and the namespace of the second
is http://www.simonstl.com/pinhead/bar/zippy/.  It's a completely different
element! This isn't anything new - similar tricks are possible by mixing
prefixes and DTDs - but it shows that (with xml:base) you don't need to
physically move your documents around to change their meaning.  I haven't
thought it all through yet (and I hope I'm spared this ordeal), but
absolutization of namespace URIs makes me wonder whether XML Base, and
XInclude which relies on it, are viable if we accept absolutization.
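
The xml:base case can be sketched the same way. This is only an
illustration under stated assumptions: the document is assumed to live at
http://www.simonstl.com/pinhead/mydocs.xml (the example does not say), the
xml:base values are treated as directories with a trailing slash ("foo/",
"bar/"), and the helper name absolutize is hypothetical. Under ordinary
relative resolution, a base of "foo" without the slash would have its last
segment replaced, so both elements would resolve to the same URI.

```python
from urllib.parse import urljoin

# Assumed location of the document; not stated in the example.
DOCUMENT_URI = "http://www.simonstl.com/pinhead/mydocs.xml"


def absolutize(namespace_name, xml_bases, document_uri=DOCUMENT_URI):
    """Resolve a relative namespace name against nested xml:base values.

    xml_bases lists the xml:base attributes from the outermost ancestor
    inwards; each one is resolved against the base in effect so far.
    """
    base = document_uri
    for xml_base in xml_bases:
        base = urljoin(base, xml_base)
    return urljoin(base, namespace_name)


first = absolutize("zippy/", ["foo/"])
second = absolutize("zippy/", ["bar/"])
no_slash = absolutize("zippy/", ["foo"])  # base without trailing slash

print(first)
print(second)
print(no_slash)
```

The same entity text yields two different absolutized namespace names
depending only on the enclosing xml:base, which is the danger described
above.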

- Jonathan Marsh
Received on Thursday, 25 May 2000 10:58:14 UTC