Re: John Cowan pushes back on IRIs in [baseURI]

From: Richard Tobin <richard@inf.ed.ac.uk>
Date: Mon, 6 Mar 2006 15:57:56 +0000 (GMT)
To: John Cowan <cowan@ccil.org>, public-xml-core-wg@w3.org
Message-Id: <20060306155756.B38EF5B3C2E@macintosh.inf.ed.ac.uk>

> While I am very much in favor of allowing IRIs in xml:base attributes,

They already are; see http://www.w3.org/TR/xmlbase/#escaping.

> I believe that it would be improper to expose them as IRIs in the
> Infoset's [baseURI] property.

The Infoset already allows for the value to contain non-URI
characters: it says "The value of these properties does not reflect
any URI escaping that may be required for retrieval of the resource".

> This would in effect be a redefinition
> of the term "base URI", and there is no reason for it, since any IRI
> can be readily transformed into a URI.

On the other hand, I don't see any harm in it.

> Since the Infoset is abstract, this does not of course prevent any
> concrete API or protocol from exposing a base URI in IRI form when
> possible.

Some, at least, already do.

I think amending the Infoset in this way will make it clear that this
is the recommended approach.  Various specs refer to the [base URI]
property (e.g. the XPath 2 Data Model), so what we put in the Infoset
will affect implementations.

> Also, where are XML Resource Identifiers explained,

"XML Resource Identifier" is the proposed term to replace the wording
(originally copied from XLink) that appears in many XML-related
specifications.  We plan to put it into the next editions of XML 1.x
and then refer to it from other specs.  You will find some versions
of the proposed wording in recent minutes, I think.

> and do they allow
> any or all of "<", ">", '"', space, "{", "}", "|", "\", "^", and "`"
> (which cannot appear in IRIs)?

Yes, all of them.  That's the point of the term "XML Resource
Identifier": to give a name to this commonly used thing.

Making [base URI] be an XML Resource Identifier is consistent with the
recommendation that escaping should be performed as late as possible
because it is not reversible.
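
To illustrate the point about reversibility, here is a minimal Python
sketch.  It assumes the escaping rules described at
http://www.w3.org/TR/xmlbase/#escaping (non-ASCII characters, ASCII
controls, space and the characters listed above are converted to UTF-8
and each byte written as %HH; "#" and "%" are left alone).  The name
escape_late is hypothetical, not taken from any spec or library.

```python
# Characters escaped in this sketch: space and < > " { } | \ ^ `
# (assumed from the XML Base escaping rules; "#" and "%" are NOT escaped).
EXCLUDED = set('<>"{}|\\^` ')


def escape_late(identifier: str) -> str:
    """Percent-escape an identifier just before retrieval (hypothetical helper)."""
    out = []
    for ch in identifier:
        code = ord(ch)
        # ASCII controls (including DEL), non-ASCII, and the excluded set
        # are encoded as UTF-8 and written byte by byte as %HH.
        if code < 0x20 or code > 0x7E or ch in EXCLUDED:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


# Two different unescaped identifiers collapse to the same escaped form,
# so the original cannot be recovered from the escaped value.
unescaped_space = "http://example.org/a b"
already_escaped = "http://example.org/a%20b"
same_after_escaping = escape_late(unescaped_space) == escape_late(already_escaped)
```

Because "%" is not itself escaped, an identifier containing a literal
space and one containing "%20" give the same result, which is why the
unescaped form is the one worth keeping in [base URI].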

-- Richard
Received on Monday, 6 March 2006 15:58:25 GMT
