Re: xml:base (was Re: IRI meets RDF meets HTTP redirect)

> Jeremy Carroll scripsit:
> 
> > Sandro Hawke wrote:
> > >Of course, if you *want* the base to end with "résumé" you're out of luck,
> > >since XML Base [1] says you can only use a URI.   But at least you've
> > >avoided the dilemma.
> > 
> > Yes I like using xml:base as much as possible.
> > (And I think xml:base does allow non-ASCII chars since it tells 
> > applications how to % encode them)
> 
> There are two different questions here: what characters can appear
> in a [base URI] Infoset property, and what characters can appear
> in an xml:base attribute value?
> 
> The [base URI] property of a document, element, or PI is a URI;
> as such, it can only make use of a limited repertoire, a subset
> of ASCII characters.
> 
> The value of an xml:base attribute is not so limited: it can contain
> (almost) arbitrary Unicode, which is %-escaped before being used
> to alter the base URI property of the element on which it appears
> and the element's children.

Percent-escaping has got to be among the 10 most confusing and confused
subjects in the history of computing.   :-)

My sense is that the 2001 XML Base Recommendation [1] is very confused
about how to handle percent-escaping.  Of course, it long predated IRIs,
so this isn't so surprising.

There is a Proposed Edited Recommendation [2] which, to my mind, is much
clearer about this.  It says, essentially, don't do percent-escaping.
XML is safe for Unicode, so just use Unicode.  (As I understand it, this
new draft is waiting to see what happens with HRRIs [3] before
proceeding.  HRRIs go one step beyond IRIs by also allowing the ASCII
characters that people actually use but IRIs exclude, like " " and "<".)
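
To make the escaping step from the quoted text concrete, here is a rough
Python sketch of what a processor following the 2001 Recommendation would
do to an xml:base value before using it as a base URI, as I understand it.
The function name escape_non_ascii and the example.org value are made up
for illustration.  The sketch only escapes non-ASCII characters; the
Recommendation also escapes some ASCII characters (such as " "), which is
left out here.

```python
from urllib.parse import urljoin


def escape_non_ascii(value: str) -> str:
    """Percent-escape each non-ASCII character as its UTF-8 octets."""
    out = []
    for ch in value:
        if ord(ch) < 0x80:
            out.append(ch)  # ASCII passes through untouched (a simplification)
        else:
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(out)


# Hypothetical xml:base attribute value; \u00e9 is "é".
base = escape_non_ascii("http://example.org/r\u00e9sum\u00e9/")
print(base)                        # http://example.org/r%C3%A9sum%C3%A9/
print(urljoin(base, "index.rdf"))  # http://example.org/r%C3%A9sum%C3%A9/index.rdf
```

Under the Proposed Edited Recommendation's approach, the attribute value
would be left in Unicode and no such step would be applied at this point.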

    -- Sandro

[1] http://www.w3.org/TR/xmlbase/
[2] http://www.w3.org/TR/2006/PER-xmlbase-20061220
[3] http://www.w3.org/XML/Group/2007/03/xmlresourceid/

Received on Wednesday, 18 April 2007 19:03:31 UTC