Re: Media Fragments URI parsing: pseudo algorithm code

* Philip Jägenstedt wrote:
>> RFC 3986 requires implementations, when processing fragment identifiers,
>> to treat %74 and "t" the same regardless of where either occurs, as "t"
>> is not a reserved character and URIs that differ only in the escaping of
>> unreserved characters are defined to be equivalent. So the answer here
>> is "all components". You can only have special requirements for reserved
>> characters when they occur unescaped.
>
>If I understand this correctly, this means that percent-decoding must be  
>performed on all names and values, which I welcome.
>
>However, given this situation, how is it possible to express parsing in a  
>single layer of ABNF? When the ABNF says "t", it really means "t" or  
>"%74", if these are indeed supposed to be equivalent. How do other specs  
>layered on top of URI handle this?

The only way would be to actually say `%x74 / "%74"` in each of these
cases, which would make the grammar rather unreadable. A workaround
would be to require a pre-processing step that removes escaping for
octets that are not reserved and then work with the result.
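
As a rough illustration of such a pre-processing step, here is a minimal
Python sketch. The function name `unescape_unreserved` is made up for this
example; the unreserved set is the one from RFC 3986, section 2.3. Escapes
for any other octet (including "%25" itself) are left untouched, so the
reserved characters keep their escaped form for the later parsing step.

```python
import re

# Unreserved characters per RFC 3986, section 2.3:
# ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

# One percent-escape: "%" followed by exactly two hex digits.
_PCT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def unescape_unreserved(fragment: str) -> str:
    """Replace escapes of unreserved characters by the characters themselves.

    Hypothetical helper, not taken from any specification text.
    """
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Only unreserved characters are decoded; everything else stays escaped.
        return char if char in UNRESERVED else match.group(0)

    # A single pass, so "%2574" stays "%2574" and is not decoded twice.
    return _PCT_ESCAPE.sub(replace, fragment)
```

With this, "%74=10,20" and "t=10,20" both come out as "t=10,20", and a
grammar written in terms of plain "t" can then be applied to the result.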

>(I think it would be cleaner to split the syntax into two levels -- one  
>that identifies arbitrary name-value pairs, and one that is defined in  
>terms of the Unicode strings that those names/values represent.)

I agree.
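
A two-level split along those lines might be sketched as follows, again in
Python. The function names `split_pairs` and `decode_pairs` are invented for
this example, and both the "&" and "=" delimiters and the use of UTF-8 for
decoding are assumptions here, not something the draft is known to fix. The
first level only looks at unescaped delimiters; the second level turns the
raw names and values into Unicode strings.

```python
from urllib.parse import unquote


def split_pairs(fragment: str) -> list:
    """Level 1: split on unescaped "&" and "=" into raw name-value pairs."""
    pairs = []
    for item in fragment.split("&"):
        if not item:
            continue  # assumption: empty items are silently skipped
        # Split at the first "=" only; a missing "=" yields an empty value.
        name, _, value = item.partition("=")
        pairs.append((name, value))
    return pairs


def decode_pairs(pairs: list) -> list:
    """Level 2: percent-decode names and values (UTF-8 assumed)."""
    return [(unquote(name), unquote(value)) for name, value in pairs]
```

For "%74=10,20&id=a%26b" the first level gives ("%74", "10,20") and
("id", "a%26b"), and the second level gives ("t", "10,20") and
("id", "a&b"). The escaped "&" never acts as a delimiter because it is
only decoded after the split.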
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Tuesday, 6 July 2010 15:33:49 UTC