- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 18 Apr 2007 15:03:19 -0400
- To: John Cowan <cowan@ccil.org>
- Cc: Jeremy Carroll <jjc@hpl.hp.com>, semantic-web@w3.org, www-international@w3.org
> Jeremy Carroll scripsit:
>
> > Sandro Hawke wrote:
> > >Of course, if you *want* the base end with "résumé" you're out of luck,
> > >since XML Base [1] says you can only use a URI. But at least you've
> > >avoided the dilemma.
> >
> > Yes I like using xml:base as much as possible.
> > (And I think xml:base does allow non-ASCII chars since it tells
> > applications how to % encode them)
>
> There are two different questions here: what characters can appear
> in a [base URI] Infoset property, and what characters can appear
> in an xml:base attribute value?
>
> The [base URI] property of a document, element, or PI is a URI;
> as such, it can only make use of a limited repertoire, a subset
> of ASCII characters.
>
> The value of an xml:base attribute is not so limited: it can contain
> (almost) arbitrary Unicode, which is %-escaped before being used
> to alter the base URI property of the element on which it appears
> and the element's children.

Percent-escaping has got to be among the 10 most confusing and confused
subjects in the history of computing. :-)

My sense is that the 2001 XML Base Recommendation [1] is very confused
about how to handle percent-escaping. Of course, it long predated IRIs,
so this isn't so surprising.

There is a Proposed Edited Recommendation [2] which, to my mind, is much
clearer about this. It says, essentially, don't do percent-escaping.
XML is safe for Unicode, so just use Unicode.

(As I understand it, this new draft is waiting to see what happens with
HRRIs [3] before proceeding. HRRIs are one step past IRIs in also
allowing the ASCII characters people use that IRIs don't allow, like " "
and "<".)

    -- Sandro

[1] http://www.w3.org/TR/xmlbase/
[2] http://www.w3.org/TR/2006/PER-xmlbase-20061220
[3] http://www.w3.org/XML/Group/2007/03/xmlresourceid/
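
For anyone trying to picture the %-escaping of an xml:base value discussed in this thread, here is a minimal Python sketch. It assumes the rule is "encode each disallowed character as UTF-8, then write every byte as %HH". The function name escape_xml_base and the example.org URL are hypothetical, and the set of ASCII characters left untouched is an approximation, not a careful reading of either [1] or [2].

```python
from urllib.parse import quote

# Assumption: ASCII punctuation that is left as-is, including "%" so that
# escapes already present in the value are not escaped a second time.
SAFE_ASCII = "!#$%&'()*+,-./:;=?@[]_~"


def escape_xml_base(value: str) -> str:
    """Return `value` with non-ASCII and disallowed ASCII characters %-escaped.

    Each such character is encoded as UTF-8 and every resulting byte is
    written as %HH. ASCII letters, digits, and SAFE_ASCII pass through.
    """
    return quote(value, safe=SAFE_ASCII, encoding="utf-8")


# Hypothetical xml:base value ending in "résumé" (é is U+00E9).
print(escape_xml_base("http://example.org/r\u00e9sum\u00e9"))
```

Under these assumptions the two é characters each become %C3%A9, and characters such as " " and "<" (the ones HRRIs would additionally allow) become %20 and %3C. Whether that escaping should happen when the attribute is read or only when the URI is finally dereferenced is exactly the point on which [1] and [2] differ.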
Received on Wednesday, 18 April 2007 19:03:32 UTC