qt-2004Feb0362-25, Limit URI escaping to non-ASCII characters

In [1], you submitted the following comment on the Last Call Working 
Draft of XSLT 2.0 and XQuery 1.0 Serialization on behalf of the I18N 
Working Group.

>>
[32] Section 5 and Section 6: Note starting: "This escaping is
   deliberately confined to non-ASCII characters,": There are certain
   ASCII characters that are not allowed in URIs. They should be escaped.
<<

The XSL and XQuery working groups have discussed your comment, and decline
to act on it. We endorse Mike Kay's response [2]:

The decision here is very deliberate, as the text says. Note that
appendix B.2.1 of the HTML 4.0 specification also refers to %HH escaping
only in connection with non-ASCII characters.

Although characters such as spaces are not allowed in URIs, if you
escape them in URIs that are interpreted client-side, such as
javascript: URIs, the URI stops working in most browsers. 
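
As an illustration only, the behaviour described above can be sketched in
Python. The function name escape_non_ascii is hypothetical and is not taken
from the Serialization specification; the sketch assumes the HTML 4.0 B.2.1
convention of converting each non-ASCII character to UTF-8 and writing each
byte as %HH, and it passes every ASCII character (spaces included) through
unchanged.

```python
def escape_non_ascii(uri: str) -> str:
    """Percent-encode only the non-ASCII characters of a URI string.

    Hypothetical helper: a sketch of non-ASCII-only escaping, not the
    normative algorithm of any specification.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            # ASCII characters, including spaces, are left untouched.
            out.append(ch)
        else:
            # Non-ASCII: encode as UTF-8 and write each byte as %HH.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


# A javascript: URI containing a space is passed through unchanged.
js_uri = "javascript:alert('a b')"
js_result = escape_non_ascii(js_uri)

# A URI with non-ASCII characters has only those characters escaped.
page_result = escape_non_ascii("caf\u00e9.html#r\u00e9sum\u00e9")
```

Under these assumptions the javascript: URI comes back byte-for-byte
identical, so a client-side interpreter sees the same script text as before.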

Also, you can't escape an id attribute that acts as the target of a
link, because % is not valid in an ID attribute. In practice (whatever
the spec says) if you escape the URI fragment identifier of a same-page
URI reference but don't escape the corresponding ID attribute, the
browser doesn't match them up. In fact, the evidence appears to be that
browsers don't unescape URIs at all, they leave this to be done at the
server. Escaping non-ASCII characters, as we currently specify, appears
to work for fragment identifiers referring to a different page, but not
for same-page references. It's a mess, which is one reason why we now
provide the option to switch off automatic escaping of URIs and allow
the user to do it themselves using the escape-uri() function.
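
The same-page mismatch described above can be modelled, again purely as a
sketch, by comparing a fragment identifier with an ID value. The sample
strings are invented for illustration, and the literal comparison is only an
assumed model of a client that does not unescape; it is not a description of
any particular browser's implementation.

```python
from urllib.parse import unquote

# Hypothetical values: an ID attribute and the escaped fragment that
# non-ASCII escaping would produce for a same-page reference to it.
id_attr = "r\u00e9sum\u00e9"
escaped_fragment = "r%C3%A9sum%C3%A9"

# A client that compares the strings as-is finds no match.
literal_match = escaped_fragment == id_attr

# A client that unescaped the fragment first (UTF-8 is the default
# for unquote) would find the match.
unescaped_match = unquote(escaped_fragment) == id_attr
```

In this model only the second comparison succeeds, which is consistent with
the observation that an escaped fragment and an unescaped ID are not matched
up unless something unescapes the fragment first.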

Please let us know if the resolution to this issue is acceptable.

Joanne Tong


[1] http://www.w3.org/XML/Group/xsl-query-specs/last-call-comments/xquery-serialization/issues.xml#qt-2004Feb0362-25

[2] http://lists.w3.org/Archives/Public/public-qt-comments/2004Feb/0401.html

Received on Wednesday, 8 December 2004 21:02:14 UTC