Re: Non-hierarchical base URLs (was Re: draft-abarth-url-01 uploaded)

On May 2, 2011, at 7:24 PM, Roy T. Fielding wrote:

> On May 2, 2011, at 6:26 PM, Adam Barth wrote:
> 
>> On Mon, May 2, 2011 at 6:24 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
>>> On May 2, 2011, at 5:42 PM, Adam Barth wrote:
>>>> You're missing the constraint that browser vendors aren't going to
>>>> change their implementations to align with this dream.
>>> 
>>> There is no such constraint.  Real browser developers like to fix
>>> bugs when they are found, particularly when it makes their behavior
>>> more interoperable with existing content.
>> 
>> Perhaps you missed this message:
>> 
>> On Mon, Apr 25, 2011 at 1:38 AM, Maciej Stachowiak <mjs@apple.com> wrote:
>>> On Apr 25, 2011, at 1:27 AM, Julian Reschke wrote:
>>>> Actually, Safari *does* the right thing here.
>>> 
>>> Safari has serious bugs as a result of doing the RFC-compliant thing here. We plan to change to be more like other browsers.
>>> 
>>> Regards,
>>> Maciej
>> 
>> AFAIK, Maciej is about as "real" a browser developer as they come.
> 
> AFAICT, Maciej based that statement on memory instead of an actual
> use case or test, since Safari does parse URIs correctly and so does
> Firefox.  When we come up with an example that is "more like other
> browsers" and is still broken, then we can talk about how to fix it.
> 
> And when we do, all implementations will be taken into consideration.

The specific context of my statement is bugs I looked at fairly recently, which unfortunately I cannot explain in detail because some of them have serious security consequences and are as yet unpatched.

In the course of working on some of these bugs, I came to the conclusion that Safari should abandon scheme-independent URL parsing, as most other browsers hardcode knowledge of certain schemes as hierarchical and this seems to result in better real-world compatibility.

I am skeptical of the example where a data: URI is the base URI for a relative reference; while the behavior for this must be defined one way or another, I would not expect there to be Web content that depends on a specific choice of behavior here, because (a) data: URLs are rare on the Web; and (b) there's almost nothing sensible you can do in this case. Note that Adam just used <base> for convenience; the example could just as well have been written as an actual data: URL, which would then act as the anchor for URLs inside the body. But there are more realistic cases where a relative URL may be resolved against a base of a non-hierarchical URI scheme, e.g.:

<iframe id=foo src="about://blank" onload="test()"></iframe>
<script>
var doc = document.getElementById("foo").contentDocument;
var anchor = doc.createElement("a");
anchor.setAttribute("href", "foo.html");
doc.body.appendChild(anchor);
alert(anchor.href);
</script>
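
To make the distinction concrete, here is a minimal sketch of scheme-dependent resolution, in which the resolver consults a hardcoded list of hierarchical schemes before resolving a relative reference. It is not taken from any browser's code: the HIERARCHICAL_SCHEMES set, the resolveAgainst name, and the choice to return null for other schemes are all assumptions made for illustration. The only real API used is the standard URL constructor.

```javascript
// Hypothetical list for illustration; each browser hardcodes its own set.
const HIERARCHICAL_SCHEMES = new Set(["http", "https", "ftp", "file"]);

// Hypothetical helper: resolve `relative` against `base` only when the
// base's scheme is on the hardcoded hierarchical list. Returning null for
// every other scheme is one possible choice, not a specified behavior.
function resolveAgainst(relative, base) {
  // Text before the first ":" is treated as the scheme. A base with no
  // ":" at all will not match any listed scheme and also yields null.
  const scheme = base.slice(0, base.indexOf(":")).toLowerCase();
  if (!HIERARCHICAL_SCHEMES.has(scheme)) {
    return null;
  }
  // Standard URL constructor: resolves `relative` against `base`.
  return new URL(relative, base).href;
}

console.log(resolveAgainst("foo.html", "http://example.com/a/b.html"));
console.log(resolveAgainst("foo.html", "about://blank"));
console.log(resolveAgainst("foo.html", "data:text/html,hello"));
```

Under these assumptions only the http: base produces a resolved URL; the about: and data: bases are rejected before any generic resolution is attempted. A scheme-independent parser would instead apply the same resolution algorithm to every base.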


More generally, for Safari we will pick real-world Web compatibility over RFC compliance almost any time the two are in conflict. Furthermore, we will generally assume the behavior of other browsers (especially ones that have been popular for a long time, such as IE and Firefox) is the compatible one unless there is good reason to believe otherwise. We basically don't care what the URL/URI/IRI RFCs say or what non-browser clients do, except as a tiebreaker.

If we can converge on behavior that is sensible and better matches older standards, that's great, but from an implementation POV this is not a high priority, compared to deployed content on the browsable Web.

Regards,
Maciej

Received on Tuesday, 3 May 2011 02:56:26 UTC