Re: Media Fragments URI parsing: pseudo algorithm code

On Thu, 1 Jul 2010, Silvia Pfeiffer wrote:

> On Thu, Jul 1, 2010 at 10:47 PM, Yves Lafon <ylafon@w3.org> wrote:
>> On Thu, 1 Jul 2010, Bjoern Hoehrmann wrote:
>>
>>> * Yves Lafon wrote:
>>>>
>>>> On Wed, 30 Jun 2010, Bjoern Hoehrmann wrote:
>>>>
>>>>>> The disagreement here is only for which components to decode
>>>>>> percent-encoding, RFC3986 will not help us.
>>>>>
>>>>> RFC 3986 requires implementations, when processing fragment identifiers,
>>>>> to treat %74 and "t" the same regardless of where either occurs, as "t"
>>>>> is not a reserved character and URIs that differ only in the escaping of
>>>>> unreserved characters are defined to be equivalent. So the answer here
>>>>> is "all components". You can only have special requirements for reserved
>>>>> characters when they occur unescaped.
>>>>
>>>> URI equivalence is an endless source of fun :)
>>>> are http://www.example.com/ (1) and http://www.example.com:80/ (2) and
>>>> h%74ttp:www/example.com/ (3) equivalent ?
>>>> From what you say, at least (1) and (3) should be.
>>>
>>> Well, http://www.websitedev.de/temp/rfc3986-check.html.gz tells me (3)
>>> is neither a URI nor a URI-reference so the question does not arise. For
>>> (1) and (2) the answer is scheme-specific. Neither has a bearing on the
>>> case of fragment identifiers as they are scheme-independent and allow
>>> percent-encoding everywhere.
>>
>> (3) is not a URI because the ABNF doesn't allow percent encoding in the
>> scheme.
>> But rfc3986 2.4.  When to Encode or Decode says:
>> <<
>> When a URI is dereferenced, the components and subcomponents
>>   significant to the scheme-specific dereferencing process (if any)
>>   must be parsed and separated before the percent-encoded octets within
>>   those components can be safely decoded, as otherwise the data may be
>>   mistaken for component delimiters.
>> >>
>> So far so good.
>> <<
>> The only exception is for
>>   percent-encoded octets corresponding to characters in the unreserved
>>   set, which can be decoded at any time.
>> >>
>> which is what you are referring to, contradicts the fact that
>> h%74tp:www/example.com/ is not a valid URI
>>
>>
>
> I assume you are working on the basis that the name-value pairs that
> we define fall under the general understanding of sub-components in
> rfc3986? It can't be components, since they are defined in section 3
> as Scheme, Path, Query, and Fragment. I further assume that because we
> use "=" as a subdelimiter, which is a reserved character, you regard
> the name and value as a sub-component, as described in 2.2?
>
> I think under these circumstances, it may indeed already be defined
> what needs to be percent-encoded and what not...
>
> However, I fail to see how h%74tp:www/example.com/ could ever be a
> valid URI, even given these circumstances.

I have big fingers today, I wanted to type h%74tp://www.example.com/ :)
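
To make the two rules quoted above concrete, here is a rough Python sketch. It is an illustration under stated assumptions, not text from any spec or an existing implementation: the function names are hypothetical, and splitting the fragment on "&" and "=" is assumed here as the name-value syntax. It shows that %74 can be decoded to "t" at any time inside a fragment, that the scheme grammar has no room for percent-encoding at all, and that reserved delimiters must be split on before the remaining octets are decoded.

```python
import re
from urllib.parse import unquote

# RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = re.compile(r"[A-Za-z0-9\-._~]")
# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+\-.]*")
PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_unreserved(text):
    """Decode only percent-encoded octets that map to unreserved characters.

    Other escapes are kept, with their hex digits uppercased.
    (Hypothetical helper name.)
    """
    def repl(match):
        ch = chr(int(match.group(1), 16))
        return ch if UNRESERVED.fullmatch(ch) else match.group(0).upper()
    return PCT.sub(repl, text)


def has_valid_scheme(uri):
    """Check the text before the first ':' against the scheme grammar."""
    scheme, sep, _ = uri.partition(":")
    return bool(sep) and SCHEME.fullmatch(scheme) is not None


def parse_fragment_pairs(fragment):
    """Split on the reserved delimiters first, then decode each part.

    Assumes '&'-separated name=value pairs (an assumption of this sketch).
    """
    pairs = []
    for item in fragment.split("&"):
        name, _, value = item.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs
```

With this sketch, decode_unreserved("%74=10") gives "t=10", while an escaped "=" (%3D) is left alone and never acts as a delimiter. has_valid_scheme("h%74tp://www.example.com/") is false because "%" is not allowed by the scheme production, which is why the equivalence question does not arise for that string.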

-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves

Received on Thursday, 1 July 2010 14:04:38 UTC