RE: XML Schema WG comments on Functions and Operators

Hi Xan:
I agree that the 10%GOOD / 10%BAD case is a problem but, on rereading
the XML Linking spec, I found that it does not escape the % sign at all.
From
http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators

"Some characters are disallowed in URI references, even if they are
allowed in XML; the disallowed characters include all non-ASCII
characters, plus the excluded characters listed in Section 2.4 of [IETF
RFC 2396], except for the number sign (#) and percent sign (%) and the
square bracket characters re-allowed in [IETF RFC 2732]. Disallowed
characters must be escaped as follows:"

So, the Linking spec removes % from the list of disallowed characters
and so does not escape it.
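
To make the competing readings concrete, here is a minimal Python sketch
of how each one would treat the percent sign. The function names are
hypothetical, and the sketch models only percent handling, not the rest
of any escaping algorithm:

```python
import re

# Hypothetical helpers for illustration only; none of these names come
# from F&O, XML Linking, or RFC 2396.

# A "%" that is NOT followed by two hexadecimal digits.
_BARE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def escape_percent_fo(s: str) -> str:
    """F&O rule as quoted below: escape "%" only if it is not
    followed by two hexadecimal digits."""
    return _BARE_PERCENT.sub("%25", s)


def escape_percent_always(s: str) -> str:
    """Reading in which every "%" used as data becomes "%25"."""
    return s.replace("%", "%25")


def escape_percent_never(s: str) -> str:
    """Reading of XML Linking given above: "%" is not in the list of
    disallowed characters, so it is left untouched."""
    return s


for name in ("10%GOOD.HTML", "10%BAD.HTML"):
    print(name, escape_percent_fo(name),
          escape_percent_always(name), escape_percent_never(name))
```

On Xan's two names, the F&O rule yields "10%25GOOD.HTML" but leaves
"10%BAD.HTML" alone, because "BA" happens to be two hexadecimal digits.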

All the best, Ashok

> -----Original Message-----
> From: Xan Gregg [mailto:xan.gregg@jmp.com]
> Sent: Wednesday, October 08, 2003 7:14 AM
> To: Ashok Malhotra
> Cc: C. M. Sperberg-McQueen; public-qt-comments@w3.org; Kay, Michael;
>     W3C XML Schema IG
> Subject: Re: XML Schema WG comments on Functions and Operators
> 
>  From XML Schema comments, section 2.8:
> >>     In particular, some members of the XML Schema WG were surprised
> >>     to see that your algorithm escapes the percent sign in some
> >>     cases but not others; this does not seem to be a feature of the
> >>     algorithm given by XML Linking and by the Character Model.
> 
>  From Ashok:
> > ...A little later RFC 2396 says
> >
> > " Because the percent "%" character always has the reserved purpose
> >    of being the escape indicator, it must be escaped as "%25" in
> >    order to be used as data within a URI."
> >
> > Our reading of this rule is that the % must be escaped unless it is
> > the start of an escape sequence %HH.
> >
> > This reading of 2396 was the basis of the rule in the F&O which says
> >
> > ".... The PERCENT SIGN "%" character itself is escaped only if it is
> > not followed by two hexadecimal digits (that is, 0-9, a-f and A-F)."
> 
> I think the group's concern about percent was that the algorithm treats
> all occurrences of %HH as pre-escaped characters which means that some
> strings containing percent cannot be escaped by fn:escape-uri().
> Consider the two resource names:
> 
>     10%GOOD.HTML
>     10%BAD.HTML
> 
> fn:escape-uri() will change the former to "10%25GOOD.HTML", but the
> latter will remain unchanged and won't work when fed to some unescaping
> processor.  This is a pretty unlikely case, and maybe the F&O
> intentionally does not handle it, preferring to assume that the
> incoming string to the escape-uri function is already escaped to some
> degree. (Maybe the F&O function should be called
> "fn:escape-uri-further".)
> 
> As I understand it, both the XML Linking specification and RFC 2396
> would have the percent converted to "%25" in both names of my example.
> 
> xan
> 

Received on Friday, 17 October 2003 08:20:42 UTC