Re: parsing URI (references) according to RFC 3986

Regardless of whether you consider it a bug, browsers need to parse
different schemes differently.  If your proposal doesn't do that, then it's
not going to work.  (It's not just a matter of post-processing goofy
characters.)

Adam
 On Jun 19, 2011 3:45 AM, "Julian Reschke" <julian.reschke@gmx.de> wrote:
> On 2011-06-19 06:54, Adam Barth wrote:
>> ...
>> The test suite above should be easy to parse and deal with.
>> ...
>
> Well, last time I checked it wasn't easy for me.
>
>> By the way, how does your proposal deal with the fact that different
>> schemes are parsed differently?
>
> The proposal to use RFC 3986?
>
> Schemes are not supposed to parse differently. When it happens, it's a
> *bug*.
>
> That being said, to handle a specific scheme differently requires
> extracting the scheme component first, right?
>
> Once you have done this, you can apply any kind of post-processing to
> the individual components to get the scheme specific handling you want.
> I *did* mention this in my mail:
>
>> - optional postprocessing (fix non-ASCII characters in query parameter
>> when not originating from UTF-8 encoded document; maybe scheme-specific
>> cleanup).
>
> Best regards, Julian
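
For reference, the two-step approach described in the quoted message (split the reference generically first, then apply scheme-specific cleanup to the components) might be sketched roughly as follows. This is only an illustration and not code from either proposal. The regular expression is the one given in RFC 3986, Appendix B. The names split_uri, cleanup_http, POSTPROCESS and parse are made up for this sketch, and the single cleanup rule shown (an empty http path becomes "/") is just one example of scheme-based normalization. Whether post-processing of this kind is enough for what browsers actually do is exactly the point under dispute in this thread.

```python
import re

# Generic splitting regex taken from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(ref):
    """Split a URI reference into the five generic components.

    Components that are absent come back as None; the path is
    always a string (possibly empty).
    """
    m = URI_RE.match(ref)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def cleanup_http(parts):
    """Hypothetical scheme-specific step: with an authority present,
    an empty path is treated as "/"."""
    if parts["authority"] is not None and parts["path"] == "":
        parts["path"] = "/"
    return parts


# Hypothetical registry mapping a scheme to its post-processing hook.
POSTPROCESS = {"http": cleanup_http, "https": cleanup_http}


def parse(ref):
    """Step 1: generic split. Step 2: optional per-scheme cleanup."""
    parts = split_uri(ref)
    hook = POSTPROCESS.get((parts["scheme"] or "").lower())
    return hook(parts) if hook else parts
```

Here parse("http://example.com?q=1") goes through the http hook, while parse("mailto:a@b.c") is returned exactly as the generic split produced it, since no hook is registered for that scheme.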

Received on Sunday, 19 June 2011 13:49:39 UTC