
Re: When is percent-encoding required.

From: Charles Lindsey <chl@clerew.man.ac.uk>
Date: Fri, 15 Jan 2010 17:16:30 -0000
To: URI <uri@w3.org>
Message-ID: <op.u6k3lshs6hl8nm@clerew.man.ac.uk>
On Wed, 13 Jan 2010 18:09:50 -0000, Julien ÉLIE <julien@trigofacile.com>
wrote:

> Hi Charles,
>
>> Here is the wording I now propose:
>>
>> According to [RFC 3986], characters that are in <gen-delims> (a subset
>> of <reserved>) MUST be percent-encoded (though it is not wrong to
>> encode others). Specifically, the characters allowed in <msg-id-core>
>> that must be encoded are
>>     "/"  "?"  "#"  "[" and "]"
>> Note that an agent which seeks to interpret a 'news' URI needs to
>> decode all these percent-encoded characters before passing it on to an
>> NNTP server to be acted upon.
>>
>> Comments anyone?
>
> MUSTn't "%" also be encoded?

Ah yes! That pesky '%' which, for some strange reason, is not included in
<gen-delims>.
>
> I see in to-be RFC 5538:
>
>      mid-left        = 1*( mid-atext / "." ) /      ; <dot-atom-text>
>                        ( "%22" mid-quote "%22" )    ; <no-fold-quote>
>      mid-right       = 1*( mid-atext / "." ) /      ; <dot-atom-text>
>                        ( "%5B" mid-literal "%5D" )  ; <no-fold-literal>
>      mid-atext       = ALPHA / DIGIT /              ; RFC 2822 <atext>
>                        "!" / "$" / "&" / "'" /      ; allowed sub-delims
>                        "*" / "+" / "=" /            ; allowed sub-delims
>                        "-" / "_" / "~" /            ; allowed unreserved
>                        "%23" / "%25" / "%2F" /      ; "#" / "%" / "/"
>                        "%3F" / "%5E" / "%60" /      ; "?" / "^" / "`"
>                        "%7B" / "%7C" / "%7D"        ; "{" / "|" / "}"
>
Well, the final form of RFC 5538 is reverting to the <msg-id-core> syntax
of RFC 5537. So the cases we are actually interested in are the
intersection of (<gen-delims> plus '%') with <atext>. But that does indeed
include '%'.
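
That intersection can be checked mechanically. A small sketch, assuming the
<atext> specials as listed in RFC 2822 and <gen-delims> as listed in the
generic URI syntax; the variable names are only for illustration:

```python
import string

# <gen-delims> from the generic URI syntax, plus the "%" discussed above.
GEN_DELIMS_PLUS_PERCENT = set(":/?#[]@") | {"%"}

# <atext> from RFC 2822: ALPHA / DIGIT plus these specials.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")

# Characters of <atext> that would need percent-encoding in a URI.
needs_encoding = sorted(GEN_DELIMS_PLUS_PERCENT & ATEXT)
print(needs_encoding)  # ['#', '%', '/', '?']
```

Note that "[" and "]" are not in <atext>; they can only turn up in a
message-ID through the literal form of the right-hand side.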

> but if I have a message-ID that contains "%23", isn't it mandatory to
> convert it into "%2523" (URI)?

But of course "%23" is not in <atext>, whatever nonsense we might have had  
in <mid-atext>.

So here is another attempt at my wording:

According to [RFC 3986], characters that are in <gen-delims> (a subset
of <reserved>), together with the character "%", MUST be percent-encoded
(though it is not wrong to encode others). Specifically, the characters
allowed in <msg-id-core> that must be encoded are
     "/"  "?"  "#"  "["  "]" and "%"
Note that an agent which seeks to interpret a 'news' URI needs to
decode all these percent-encoded characters before passing it on to an
NNTP server to be acted upon.
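
To make the wording concrete, here is a minimal sketch of both directions in
Python. It assumes that only the six characters listed above are encoded and
that the "@" separating the two halves of the message-ID stays literal. The
function names and the example message-IDs are hypothetical; only
urllib.parse.unquote is a real library call.

```python
from urllib.parse import unquote

# The six characters the proposed wording says MUST be percent-encoded.
MUST_ENCODE = '/?#[]%'


def encode_msg_id_core(msg_id_core: str) -> str:
    """Percent-encode the listed characters; leave everything else alone."""
    return ''.join(
        '%{:02X}'.format(ord(ch)) if ch in MUST_ENCODE else ch
        for ch in msg_id_core
    )


def to_news_uri(msg_id_core: str) -> str:
    """Build a 'news' URI from a message-ID without its angle brackets."""
    return 'news:' + encode_msg_id_core(msg_id_core)


def from_news_uri(uri: str) -> str:
    """Recover the message-ID core, decoding every percent-escape."""
    if not uri.startswith('news:'):
        raise ValueError('not a news URI')
    return unquote(uri[len('news:'):])


# A message-ID that literally contains "%23" must come out as "%2523".
uri = to_news_uri('abc%23def/1@example.com')
print(uri)                 # news:abc%2523def%2F1@example.com
print(from_news_uri(uri))  # abc%23def/1@example.com
```

Encoding the "%" itself is what makes the round trip safe: without it, a
literal "%23" in the message-ID would decode to "#" and the NNTP server would
be asked for a different article.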

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131                          Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5
Received on Friday, 15 January 2010 17:17:05 UTC
