
Re: Media Fragments URI parsing: pseudo algorithm code

From: Yves Lafon <ylafon@w3.org>
Date: Thu, 1 Jul 2010 10:04:35 -0400 (EDT)
To: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
cc: Bjoern Hoehrmann <derhoermi@gmx.net>, Philip Jägenstedt <philipj@opera.com>, public-media-fragment@w3.org
Message-ID: <alpine.DEB.1.10.1007011003560.13235@wnl.j3.bet>
On Thu, 1 Jul 2010, Silvia Pfeiffer wrote:

> On Thu, Jul 1, 2010 at 10:47 PM, Yves Lafon <ylafon@w3.org> wrote:
>> On Thu, 1 Jul 2010, Bjoern Hoehrmann wrote:
>>
>>> * Yves Lafon wrote:
>>>>
>>>> On Wed, 30 Jun 2010, Bjoern Hoehrmann wrote:
>>>>
>>>>>> The disagreement here is only for which components to decode
>>>>>> percent-encoding, RFC3986 will not help us.
>>>>>
>>>>> RFC 3986 requires implementations, when processing fragment identifiers,
>>>>> to treat %74 and "t" the same regardless of where either occurs, as "t"
>>>>> is not a reserved character and URIs that differ only in the escaping of
>>>>> unreserved characters are defined to be equivalent. So the answer here
>>>>> is "all components". You can only have special requirements for reserved
>>>>> characters when they occur unescaped.
>>>>
>>>> URI equivalence is an endless source of fun :)
>>>> are http://www.example.com/ (1) and http://www.example.com:80/ (2) and
>>>> h%74ttp:www/example.com/ (3) equivalent ?
>>>> From what you say, at least (1) and (3) should be.
>>>
>>> Well, http://www.websitedev.de/temp/rfc3986-check.html.gz tells me (3)
>>> is neither a URI nor a URI-reference so the question does not arise. For
>>> (1) and (2) the answer is scheme-specific. Neither has a bearing on the
>>> case of fragment identifiers as they are scheme-independent and allow
>>> percent-encoding everywhere.
>>
>> (3) is not a URI because the ABNF doesn't allow percent encoding in the
>> scheme.
>> But rfc3986 2.4. When to Encode or Decode says:
>> <<
>> When a URI is dereferenced, the components and subcomponents
>>  significant to the scheme-specific dereferencing process (if any)
>>  must be parsed and separated before the percent-encoded octets within
>>  those components can be safely decoded, as otherwise the data may be
>>  mistaken for component delimiters.
>> >>
>> So far so good.
>> <<
>> The only exception is for
>>  percent-encoded octets corresponding to characters in the unreserved
>>  set, which can be decoded at any time.
>> >>
>> This last part, which is what you are referring to, contradicts the fact that
>> h%74tp:www/example.com/ is not a valid URI
>>
>>
>
> I assume you are working on the basis that the name-value pairs that
> we define fall under the general understanding of sub-components in
> rfc3986? It can't be components, since they are defined in section 3
> as Scheme, Path, Query, and Fragment. I further assume that because we
> use "=" as a subdelimiter, which is a reserved character, you regard
> the name and value as a sub-component, as described in 2.2?
>
> I think under these circumstances, it may indeed already be defined
> what needs to be percent-encoded and what not...
>
> However, I fail to see how h%74tp:www/example.com/ could ever be a
> valid URI, even given these circumstances.

I have big fingers today, I wanted to type h%74tp://www.example.com/ :)
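
To make the two points from this thread concrete, here is a minimal Python
sketch, not taken from the Media Fragments draft. It shows (a) decoding only
percent-encoded octets that map to unreserved characters, (b) why that does
not rescue h%74tp://www.example.com/, since the scheme ABNF has no
percent-encoding, and (c) splitting name=value pairs before decoding
everything else. All function names are hypothetical, and the use of "&" as
the pair separator is an assumption; only "=" is mentioned above.

```python
import re
from urllib.parse import unquote

# RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")
PCT_RE = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(text):
    """Decode %XX only when the octet is an unreserved character.

    Other escapes are kept, with their hex digits uppercased.
    """
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else match.group(0).upper()
    return PCT_RE.sub(repl, text)


def has_valid_scheme(uri):
    """Check the text before the first ':' against the scheme ABNF."""
    scheme, sep, _ = uri.partition(":")
    return bool(sep) and SCHEME_RE.fullmatch(scheme) is not None


def split_fragment_pairs(fragment):
    """Split on the (assumed) '&' and '=' delimiters first, then decode."""
    pairs = []
    for item in fragment.split("&"):
        name, _, value = item.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs


if __name__ == "__main__":
    print(decode_unreserved("%74=10%2c20"))
    print(has_valid_scheme("h%74tp://www.example.com/"))
    print(split_fragment_pairs("%74=10,20&tr%61ck=a%26b"))
```

With this sketch, "%74=10,20" and "t=10,20" yield the same name, while an
escaped "&" inside a value stays part of that value because the split
happens before the decode.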

-- 
Baroula que barouleras, au tiu toujou t'entourneras.

         ~~Yves
Received on Thursday, 1 July 2010 14:04:38 GMT
