Re: "Web addresses in HTML 5" for review (ISSUE-56 urls-webarch) from Julian Reschke on 2009-03-23 (public-html@w3.org from March 2009)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 23 Mar 2009 16:16:39 +0100
To: Dan Connolly <connolly@w3.org>
CC: public-html@w3.org
Message-ID: <49C7A7D7.8080106@gmx.de>

Hi Dan,

I'm a bit confused by:
> The parsing process described here should be more closely aligned with 
> the rules given in RFC 3987.
> 
>    1.
> 
>       Strip leading and trailing space characters <#space-character> from w.
> 
>    2.
> 
>       Percent-encode all non-URI characters in w.
> 
>       This probably needs to be laid out in more detail.
> 
>       Note: this step will replace all of the following characters with
>       a percent-encoded equivalent:
> 
>           * all characters with codepoints less than or equal to U+0020
>             (i.e. the C0 control characters)
>           * all characters with codepoints greater than or equal to
>             U+007F (i.e. U+007F and all non-ASCII characters in w)
>           * U+0022 double quotation mark
>           * U+0025 percent sign
>           * U+003C less-than sign
>           * U+003E greater-than sign
>           * U+005C reverse solidus (backslash)
>           * U+005E circumflex accent
>           * U+0060 grave accent
>           * U+007B left curly bracket
>           * U+007C vertical line
>           * U+007D right curly bracket
> 
>       As a result of percent-encoding the percent sign, any occurrences
>       of percent-encoding in the Web address will be double-encoded at
>       this step.

Why would you want that?

It seems to mean that if w includes "%20" (a properly escaped space 
character), it will be encoded into "%2520".
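
To make the concern concrete, here is a rough Python sketch of steps 1 and 2
as I read them. The function name, the use of UTF-8 for non-ASCII characters,
and the exact set of "space characters" are my assumptions; the quoted text
does not spell them out.

```python
# Code points the quoted note lists explicitly.
LISTED = {0x22, 0x25, 0x3C, 0x3E, 0x5C, 0x5E, 0x60, 0x7B, 0x7C, 0x7D}

# Assumption: "space characters" means these five ASCII whitespace characters.
SPACE_CHARACTERS = " \t\n\f\r"


def needs_encoding(cp: int) -> bool:
    """True for code points the note says step 2 will percent-encode."""
    return cp <= 0x20 or cp >= 0x7F or cp in LISTED


def encode_as_drafted(w: str) -> str:
    """Hypothetical helper: apply steps 1 and 2 literally."""
    w = w.strip(SPACE_CHARACTERS)  # step 1: strip leading/trailing spaces
    out = []
    for ch in w:
        if needs_encoding(ord(ch)):
            # Assumption: encode the character's UTF-8 bytes.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


# The "%" of an existing escape is itself escaped:
print(encode_as_drafted("http://example.org/a%20b"))
# http://example.org/a%2520b
```

Because U+0025 is in the list, the function cannot tell an already escaped
"%20" from a literal percent sign followed by "20".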

BR, Julian

Received on Monday, 23 March 2009 15:17:26 UTC