Re: URL parsing

On Wed, Apr 28, 2010 at 8:54 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
>> In the case you mention, my recollection is that 3 out of 4 browsers
>> agree that you should lowercase the scheme.  Based on that evidence,
>> I'd probably recommend that the wayward browser also lowercase the
>> scheme.  However, I haven't looked into these issues in enough
>> detail to know if there are other considerations that might cause us
>> to prefer that browsers not lowercase the scheme.
>
> As far as I understand, HTML5 used to require that no normalization take
> place (essentially, it required slicing the ... web address ... into
> components and returning them unmodified). I'm not convinced that there's
> any code out there relying on this...

For what it's worth, whenever we end up defining this, I'm much more
interested in seeing tests of what the browsers with a substantial
user base actually do than in what the HTML5 spec said at some point
in time.

IIRC Ian has acknowledged that the behavior that was defined by the
HTML5 spec needed significant work and advised that it was possibly
better to start from scratch than to base work on the HTML5 spec.

So I wouldn't take the HTML5 spec not matching what browsers do as a
sign that a behavior might not matter. A better indicator is whether
browsers differ from each other in that behavior.
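
To make the scheme example from the quoted text concrete, here is a
minimal sketch of the two candidate behaviors: returning the scheme
exactly as written versus lowercasing it. The function names
(split_scheme, scheme_unmodified, scheme_lowercased, behaviors_differ)
are hypothetical and not taken from any spec or browser; the splitting
rule is a simplification that assumes only the generic URI scheme
grammar (a letter followed by letters, digits, "+", "-" or ".").

```python
import string

# Characters allowed in a scheme under the generic URI grammar (an assumption
# for this sketch; real browser parsers handle many more edge cases).
_SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")


def split_scheme(url):
    """Return (scheme, rest) as written, or (None, url) if there is no scheme."""
    head, sep, rest = url.partition(":")
    if not sep or not head or head[0] not in string.ascii_letters:
        return None, url
    if any(c not in _SCHEME_CHARS for c in head):
        return None, url
    return head, rest


def scheme_unmodified(url):
    """Behavior A: slice the address and return the scheme untouched."""
    return split_scheme(url)[0]


def scheme_lowercased(url):
    """Behavior B: normalize the scheme to lowercase."""
    scheme = split_scheme(url)[0]
    return scheme.lower() if scheme is not None else None


def behaviors_differ(url):
    """True when the two behaviors would be observably different for url."""
    return scheme_unmodified(url) != scheme_lowercased(url)
```

Inputs for which behaviors_differ returns True, such as
"HTTP://example.com/", are the ones worth running against each browser,
since only those can reveal which behavior an implementation follows.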

Also, as usual, we're fine with changing our implementation in
Firefox, as long as there is data backing up that it's unlikely to
break the web. Such data could be the behavior of other browsers, or
data based on significant numbers of web pages. And, as usual, there
are no hard numbers for what constitutes "significant"; it'll have to
be a judgment call on a case-by-case basis.

/ Jonas

Received on Wednesday, 28 April 2010 17:39:39 UTC