Re: Percent encoding

On Thu, Mar 4, 2010 at 1:32 PM, Philip Jägenstedt <philipj@opera.com> wrote:
> On Wed, 03 Mar 2010 21:31:06 +0800, Yves Lafon <ylafon@w3.org> wrote:
>
>> On Tue, 2 Mar 2010, Raphaël Troncy wrote:
>>
>>> Dear Philip,
>>>
>>>> Perhaps YouTube decodes first and splits last, or perhaps they just use
>>>> a regexp to find v=XXXXX anywhere. Whatever is the case with YouTube, I
>>>> assume we want to match as closely as possible how query strings works
>>>> in e.g. ASP, PHP, JSP and Perl CGI, or there is no benefit in using
>>>> something that resembles query strings.
>>>>  We can never be 100% compatible, for reasons listed in a note after
>>>>
>>>> http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-spec/#decode-a-percent-encoded-string
>>>
>>> Thanks, the note is indeed really useful. For all the following
>>> statements, do you think it is possible to indicate a suitable reference?
>>>   *  "&" is the only primary separator for name-value pairs, but some
>>> server-side languages also treat ";" as a separator.
>>>   * name-value pairs with invalid percent-encoding should be ignored, but
>>> some server-side languages silently mask such errors.
>>>   * The "+" character should not be treated specially, but some
>>> server-side languages replace it with a space (" ") character.
>>
>> + is in sub-delims, along with & ; and others
>> (cf rfc3986)
>
> I tried looking at http://www.ietf.org/rfc/rfc3986.txt but can't figure out
> if sub-delims is relevant or not. It's indirectly part of the definition of
> fragment and query, but I can't see the spec saying anything special about
> it otherwise. It isn't related to what we (can) treat as separators, right?

I think it just means that these characters are available for use as
sub-delimiters in a fragment or query. What they delimit is up to our
spec to define, i.e. I think we can do what we want with them, as long
as we treat them as sub-delimiters.
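
To make the separator question concrete, here is a rough sketch in
Python (my own illustration, not text from the spec). The helper name
split_pairs and the sample string are made up. It splits first and
percent-decodes afterwards, so an encoded "%26" never acts as a
separator, and it shows how the result changes if ";" is also treated
as a separator.

```python
import re
from urllib.parse import unquote


def split_pairs(component, separators="&"):
    """Split on the given separator characters, then percent-decode.

    Hypothetical helper for illustration only; it does no error handling
    for invalid percent-encoding.
    """
    pattern = "[" + re.escape(separators) + "]"
    pairs = []
    for chunk in re.split(pattern, component):
        if not chunk:
            continue  # skip empty segments such as "a=1&&b=2"
        name, _, value = chunk.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs


component = "t=10;track=a%26b&id=1"  # made-up sample input

# "&" as the only separator: ";" stays part of the value of "t".
only_amp = split_pairs(component)

# ";" treated as a separator too, as some server-side languages do.
amp_and_semi = split_pairs(component, "&;")
```

With "&" only, the sketch yields two pairs; with "&" and ";" it yields
three, which is the incompatibility the note is about.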

Where a '+' encodes a blank, it delimits words, so it can be used the
way some Web servers use it. But maybe we want to discourage that use
and instead encourage people to percent-encode blanks.
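
As a small illustration of the two behaviours (a sketch using Python's
standard urllib.parse, not anything the spec mandates; the sample
string is made up): unquote leaves "+" alone, unquote_plus turns it
into a blank, and quote produces the percent-encoded form we would
encourage.

```python
from urllib.parse import quote, unquote, unquote_plus

raw = "t=a+b%20c"  # made-up sample name-value pair
name, _, value = raw.partition("=")

# "+" is not treated specially: only %20 becomes a blank.
strict = unquote(value)

# Form-style decoding, as some server-side languages do: "+" becomes a blank.
form_style = unquote_plus(value)

# Percent-encoding a blank avoids the ambiguity altogether.
encoded_blank = quote("a b", safe="")

print(repr(strict), repr(form_style), encoded_blank)
```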

I think that's as far as we can take it.

Cheers,
Silvia.

Received on Thursday, 4 March 2010 03:05:05 UTC