
Re: Media Fragments URI parsing: pseudo algorithm code

From: Yves Lafon <ylafon@w3.org>
Date: Thu, 1 Jul 2010 08:47:11 -0400 (EDT)
To: Bjoern Hoehrmann <derhoermi@gmx.net>
cc: Philip Jägenstedt <philipj@opera.com>, public-media-fragment@w3.org
Message-ID: <alpine.DEB.1.10.1007010840070.31115@wnl.j3.bet>

On Thu, 1 Jul 2010, Bjoern Hoehrmann wrote:

> * Yves Lafon wrote:
>> On Wed, 30 Jun 2010, Bjoern Hoehrmann wrote:
>>
>>>> The disagreement here is only for which components to decode
>>>> percent-encoding, RFC3986 will not help us.
>>>
>>> RFC 3986 requires implementations when processing fragment identifiers
>>> to treat %74 and "t" the same regardless of where either occurs, as "t"
>>> is not a reserved character and URIs that differ only in the escaping of
>>> unreserved characters are defined to be equivalent. So the answer here
>>> is "all components". You can only have special requirements for reserved
>>> characters when they occur unescaped.
>>
>> URI equivalence is an endless source of fun :)
>> Are http://www.example.com/ (1) and http://www.example.com:80/ (2) and
>> h%74tp://www.example.com/ (3) equivalent?
>> From what you say, at least (1) and (3) should be.
>
> Well, http://www.websitedev.de/temp/rfc3986-check.html.gz tells me (3)
> is neither a URI nor a URI-reference so the question does not arise. For
> (1) and (2) the answer is scheme-specific. Neither has a bearing on the
> case of fragment identifiers as they are scheme-independent and allow
> percent-encoding everywhere.

(3) is not a URI because the ABNF doesn't allow percent encoding in the 
scheme.
But RFC 3986 section 2.4, "When to Encode or Decode", says:
<<
When a URI is dereferenced, the components and subcomponents
    significant to the scheme-specific dereferencing process (if any)
    must be parsed and separated before the percent-encoded octets within
    those components can be safely decoded, as otherwise the data may be
    mistaken for component delimiters.
>>
So far so good.
<<
The only exception is for
    percent-encoded octets corresponding to characters in the unreserved
    set, which can be decoded at any time.
>>
That exception, which is what you are referring to, contradicts the fact
that h%74tp://www.example.com/ is not a valid URI: decoding %74 "at any
time" would turn it into a perfectly valid one.
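
To make the tension concrete, here is a minimal Python sketch. The helper
names has_valid_scheme and decode_unreserved are mine, purely for
illustration, and are not taken from RFC 3986 or from any library. It
assumes the scheme grammar of section 3.1 and the unreserved set of
section 2.3, and it does no other URI validation.

```python
import re

# RFC 3986 section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")

# RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# A single percent-encoded octet, e.g. %74
PCT_RE = re.compile(r"%([0-9A-Fa-f]{2})")


def has_valid_scheme(uri: str) -> bool:
    """Return True if the text before the first ':' matches the scheme ABNF."""
    scheme, sep, _ = uri.partition(":")
    return bool(sep) and SCHEME_RE.fullmatch(scheme) is not None


def decode_unreserved(text: str) -> str:
    """Decode only the percent-encoded octets that map to unreserved characters."""
    def repl(match: re.Match) -> str:
        ch = chr(int(match.group(1), 16))
        # Reserved and non-ASCII octets are left escaped.
        return ch if ch in UNRESERVED else match.group(0)

    return PCT_RE.sub(repl, text)
```

With these, has_valid_scheme("h%74tp://www.example.com/") is false, while
decode_unreserved applied to the same string gives http://www.example.com/,
which does have a valid scheme. Applied to a fragment such as
"#t=%31%30%2C20" it gives "#t=10%2C20": the digits are decoded and the
escaped comma, being reserved, is left alone.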


-- 
Baroula que barouleras, au tiéu toujou t'entourneras.

         ~~Yves