- From: Ashok Malhotra <ashokma@microsoft.com>
- Date: Fri, 17 Oct 2003 05:20:42 -0700
- To: "Xan Gregg" <xan.gregg@jmp.com>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, <public-qt-comments@w3.org>, "Kay, Michael" <Michael.Kay@softwareag.com>, "W3C XML Schema IG" <w3c-xml-schema-ig@w3.org>
Hi Xan: I agree that the %good, %bad is a problem but, rereading the Linking spec I found that it does not escape the % sign at all. From http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators "Some characters are disallowed in URI references, even if they are allowed in XML; the disallowed characters include all non-ASCII characters, plus the excluded characters listed in Section 2.4 of [IETF RFC 2396], except for the number sign (#) and percent sign (%) and the square bracket characters re-allowed in [IETF RFC 2732]. Disallowed characters must be escaped as follows:" So, the Linking spec removes % from the list of disallowed characters and so does not escape it. All the best, Ashok > -----Original Message----- > From: Xan Gregg [mailto:xan.gregg@jmp.com] > Sent: Wednesday, October 08, 2003 7:14 AM > To: Ashok Malhotra > Cc: C. M. Sperberg-McQueen; public-qt-comments@w3.org; Kay, Michael; W3C > XML Schema IG > Subject: Re: XML Schema WG comments on Functions and Operators > > From XML Schema comments, section 2.8: > >> In particular, some members of the XML Schema WG were surprised > >> to see > >> that your algorithm escapes the percent sign in some cases but not > >> others; this does not seem to be a feature of the algorithm given > >> by > >> XML Linking and by the Character Model. > > From Ashok: > > ...A little later RFC 2396 says > > > > " Because the percent "%" character always has the reserved purpose of > > being the escape indicator, it must be escaped as "%25" in order to > > be used as data within a URI." > > > > Our reading of this rule is that the % must be escaped unless it is > > the start of an escape sequence %HH. > > > > This reading of 2396 was the basis of the rule in the F&O which says > > > > ".... The PERCENT SIGN "%" character itself is escaped only if it is > > not followed by two hexadecimal digits (that is, 0-9, a-f and A-F)." > > I think the group's concern about percent was that the algorithm treats > all occurrences of %HH as pre-escaped characters which means that some > strings containing percent cannot be escaped by fn:escape-uri(). > Consider the two resource names: > > 10%GOOD.HTML > 10%BAD.HTML > > fn:escape-uri() will change the former to "10%25GOOD.HTML", but the > latter will remain unchanged and won't work when fed to some unescaping > processor. This is a pretty unlikely case, and maybe the F&O > intentionally does not handle it, preferring to assume that the > incoming string to the escape-uri function is already escaped to some > degree. (Maybe the F&O function should be called > "fn:escape-uri-further".) > > As I understand it, both the XML Linking specification and RFC 2396 > would have the percent converted to "%25" in both names of my example. > > xan >
Received on Friday, 17 October 2003 08:20:42 UTC