Re: xml:base (was Re: IRI meets RDF meets HTTP redirect)

On Wednesday, April 18, 2007, 9:03:19 PM, Sandro wrote:

>> The value of an xml:base attribute is not so limited: it can contain
>> (almost) arbitrary Unicode, which is %-escaped before being used
>> to alter the base URI property of the element on which it appears
>> and the element's children.

SH> Percent-escaping has got to be among the 10 most confusing and confused
SH> subjects in the history of computing.   :-)

This is why it's better if computers do it, and humans see the real characters.

SH> My sense is that the 2001 XML Base Recommendation [1] is very confused
SH> about how to handle percent-escaping.  Of course, it long predated IRIs,
SH> so this isn't so surprising.

I agree that the newer PER is clearer.

SH> There is a Proposed Edited Recommendation [2] which, to my mind, is much
SH> clearer about this.  It says, essentially, don't do percent-escaping.
SH> XML is safe for Unicode, so just use Unicode.

Which is pretty much what this passage from the 2001 Recommendation

  The set of characters allowed in xml:base attributes is the same as
  for XML, namely [Unicode]. However, some Unicode characters are
  disallowed from URI references, and thus processors must encode and
  escape these characters to obtain a valid URI reference from the
  attribute value.

says. The improvement in the PER is to clarify that the 'processor' is
the software which reads the XML attribute value and constructs a URI
to fetch; not, as it could be read, the software which creates the XML
document.
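
To make the division of labour concrete, here is a rough sketch of what
the consuming side might do. It is not taken from either specification:
the function name xml_base_to_uri and the example.org URLs are made up
for illustration, and the set of characters left unescaped is my
approximation of the rule (keep ASCII URI syntax characters and existing
%-escapes, escape everything else as UTF-8 bytes).

```python
from urllib.parse import quote, urljoin

# Characters left untouched: the RFC 3986 reserved set plus '%', so that
# URI syntax and any escapes the author already wrote pass through as-is.
# (Assumption: this approximates, but does not exactly reproduce, the
# list of disallowed characters in the XML Base text.)
_KEEP = ":/?#[]@!$&'()*+,;=%"


def xml_base_to_uri(attr_value: str) -> str:
    """Hypothetical helper: turn an xml:base attribute value, which may
    contain arbitrary Unicode, into a URI reference by percent-escaping
    the UTF-8 bytes of every character not allowed in a URI."""
    return quote(attr_value, safe=_KEEP, encoding="utf-8")


if __name__ == "__main__":
    # The document author writes real characters in the attribute ...
    base = xml_base_to_uri("http://example.org/résumé/")
    print(base)
    # ... and the processor escapes them before resolving references.
    print(urljoin(base, "cv.xml"))
```

The author of the XML document writes "résumé"; only the software that
is about to fetch something ever sees "r%C3%A9sum%C3%A9".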


-- 
 Chris Lilley                    mailto:chris@w3.org
 Interaction Domain Leader
 Co-Chair, W3C SVG Working Group
 W3C Graphics Activity Lead
 Co-Chair, W3C Hypertext CG

Received on Thursday, 19 April 2007 11:20:55 UTC