- From: Xan Gregg <xan.gregg@jmp.com>
- Date: Wed, 8 Oct 2003 10:14:27 -0400
- To: "Ashok Malhotra" <ashokma@microsoft.com>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, <public-qt-comments@w3.org>, "Kay, Michael" <Michael.Kay@softwareag.com>, "W3C XML Schema IG" <w3c-xml-schema-ig@w3.org>
From XML Schema comments, section 2.8:
>> In particular, some members of the XML Schema WG were surprised
>> to see
>> that your algorithm escapes the percent sign in some cases but not
>> others; this does not seem to be a feature of the algorithm given
>> by
>> XML Linking and by the Character Model.
From Ashok:
> ...A little later RFC 2396 says
>
> " Because the percent "%" character always has the reserved purpose of
> being the escape indicator, it must be escaped as "%25" in order to
> be used as data within a URI."
>
> Our reading of this rule is that the % must be escaped unless it is
> the start of an escape sequence %HH.
>
> This reading of 2396 was the basis of the rule in the F&O which says
>
> ".... The PERCENT SIGN "%" character itself is escaped only if it is
> not followed by two hexadecimal digits (that is, 0-9, a-f and A-F)."
I think the group's concern about percent was that the algorithm treats
every occurrence of %HH as an already-escaped character, which means that
some strings containing a literal percent cannot be escaped correctly by
fn:escape-uri().
Consider the two resource names:
10%GOOD.HTML
10%BAD.HTML
fn:escape-uri() will change the former to "10%25GOOD.HTML", but the
latter will remain unchanged, so an unescaping processor will read its
"%BA" as an escape sequence rather than literal text. This is a pretty
unlikely case, and maybe the F&O
intentionally does not handle it, preferring to assume that the
incoming string to the escape-uri function is already escaped to some
degree. (Maybe the F&O function should be called
"fn:escape-uri-further".)
As I understand it, both the XML Linking specification and RFC 2396
would have the percent converted to "%25" in both names of my example.
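To make the difference concrete, here is a rough Python sketch of the two percent rules applied to the example names. It models only the handling of "%", not the rest of fn:escape-uri(), and the function names escape_percent_fo and escape_percent_always are made up for illustration. The second function reflects my reading of RFC 2396 and XML Linking rather than any normative algorithm.

```python
import re
from urllib.parse import unquote

def escape_percent_fo(s: str) -> str:
    """Hypothetical helper: the F&O rule quoted above.

    Escapes "%" only when it is NOT followed by two hexadecimal digits.
    """
    return re.sub(r"%(?![0-9A-Fa-f]{2})", "%25", s)

def escape_percent_always(s: str) -> str:
    """Hypothetical helper: treat every "%" in the input as data."""
    return s.replace("%", "%25")

names = ["10%GOOD.HTML", "10%BAD.HTML"]

for name in names:
    fo = escape_percent_fo(name)
    always = escape_percent_always(name)
    # A name survives the round trip only if unescaping gives it back.
    print(name, "F&O:", fo, unquote(fo) == name,
          "always:", always, unquote(always) == name)
```

Under the F&O rule only the first name survives an escape/unescape round trip; with unconditional escaping both do, at the cost of double-escaping input that was already escaped.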
xan
Received on Wednesday, 8 October 2003 10:15:26 UTC