
Re: xml:base

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Sat, 21 Apr 2007 00:35:04 +0200
To: www-international@w3.org
Message-ID: <46294018.7929@xyzzy.claranet.de>

Chris Lilley wrote:

>> Of course, if you *want* the base end with "résumé" you're out of luck,
>> since XML Base [1] says you can only use a URI.
> No, it doesn't.
>   The attribute xml:base may be inserted in XML documents to specify a
>   base URI other than the base URI of the document or external entity.
>   The value of this attribute is interpreted as a URI Reference as
>   defined in RFC 2396 [IETF RFC 2396], after processing according to
>   Section 3.1.

> and section 3.1 says
>   The set of characters allowed in xml:base attributes is the same as
>   for XML, namely [Unicode]. However, some Unicode characters are
>   disallowed from URI references, and thus processors must encode and
>   escape these characters to obtain a valid URI reference from the
>   attribute value.

That's obviously a bug.  It's not only "some" Unicode characters that
are disallowed in URI references; it's more like 99.991% of them that
aren't allowed as per STD 66.

> The attribute value can be an IRI. The IRI represents a URI.

IMO you got that upside down.  Any IRI can be represented as a URI
using percent-encoded UTF-8 (or punycoded internationalized domain
names in URI schemes with an <authority> as defined in STD 66).
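
A minimal sketch of that mapping in Python, assuming the IRI is already
a well-formed string.  The names iri_to_uri and SAFE are hypothetical,
and Python's built-in "idna" codec (IDNA 2003) stands in for the
punycode step; this is not the full RFC 3987 algorithm.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Assumption: leave RFC 3986 reserved characters alone, plus "%" so that
# existing percent-escapes are not encoded a second time.
SAFE = "!#$&'()*+,/:;=?@[]%"

def iri_to_uri(iri: str) -> str:
    """Map an IRI (or relative IRI reference) to an ASCII-only URI."""
    scheme, netloc, path, query, fragment = urlsplit(iri)

    # Split the authority into userinfo, host and port.
    userinfo, at, hostport = netloc.rpartition("@")
    host, colon, port = hostport.partition(":")

    # Punycode the host labels; skip empty hosts and bracketed IP literals.
    if host and not host.startswith("["):
        host = host.encode("idna").decode("ascii")
    netloc = quote(userinfo, safe=SAFE) + at + host + colon + port

    # Everything else: non-ASCII becomes percent-encoded UTF-8.
    return urlunsplit((
        scheme,
        netloc,
        quote(path, safe=SAFE),
        quote(query, safe=SAFE),
        quote(fragment, safe=SAFE),
    ))

print(iri_to_uri("résumé"))
print(iri_to_uri("http://bücher.example/résumé"))
```

The conversion only goes one way: the result is a URI that any STD 66
consumer can handle, while the raw input was not.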

If some obscure xml:base draft based on another xlink11 draft tries to
redefine STD 66, claiming that raw IRIs are URIs, it's just wrong.
The xlink11 statement that HTML uses IRIs "as its locator technology"
is also wrong, but at least this draft apparently makes it clear that
it's talking about IRIs and RFC 3987, not about URIs and STD 66.

Probably xml:base has to be updated to support IRIs, for protocols
like xmpp and formats like atom.  I can't tell if they use xml:base,
but they certainly use IRIs.  But just pretending that an STD 66 URI
is a kind of strange synonym for IRI is madness; it's not.  It would
break all implementations, protocols, and formats expecting URIs.

Received on Friday, 20 April 2007 22:45:14 UTC
