- From: Adam Barth <ietf@adambarth.com>
- Date: Tue, 19 Apr 2011 23:58:12 -0700
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: Maciej Stachowiak <mjs@apple.com>, public-iri@w3.org
On Tue, Apr 19, 2011 at 11:51 PM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 20.04.2011 08:42, Adam Barth wrote:
>>
>> On Tue, Apr 19, 2011 at 11:31 PM, Julian Reschke<julian.reschke@gmx.de>
>> wrote:
>>>
>>> On 20.04.2011 08:22, Adam Barth wrote:
>>>>
>>>> ...
>>>> It's a moderate problem in practice. For example, every browser I'm
>>>> aware of has had (historically) security bugs arising from subtly
>>>> different URL processing by various components. We also have examples
>>>> of compatibility problems with web sites arising from different URL
>>>> processing by browsers.
>>>> ...
>>>
>>> Yes. Sure.
>>>
>>> My question was: do the differences in the behavior of the decomposition
>>> attributes cause problems in practice? What type of code is using them?
>>> (I really want to know :-).
>>
>> I'm not sure I fully understand what question you're asking, but
>> segmenting URLs into components is super important. For example, at
>> least one of the security bugs I referred to above revolved around two
>> different URL parsers segmenting the host differently, leading to
>> disagreement about which security context the URL belonged to.
>> ...
>
> I'm referring to the attributes exposed in the DOM (visible to JS), as
> opposed to what implementations do internally (which we can only observe
> indirectly).

Ah, those are a mess and in serious need of a bath. Fortunately for
us, that's outside the scope of this working group.

To the larger thrust of your question, interoperability problems
between browsers lead to much sadness, even in seemingly obscure APIs.
Personally, I don't know of any specific historical compatibility
problems with these APIs, but others might. I can tell you that
canonicalizing the query component of URLs sent via form requests is an
extremely sensitive area for compatibility. My understanding is that
this is because there are a lot of hand-written CGI parsers written by
folks who might not fully understand the difference between different
character encodings.

Adam
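
To make the character-encoding point concrete, here is a minimal sketch in Python. It assumes a hypothetical form field named "q" with the value "café" (neither comes from this thread) and shows how the same value serializes to different query strings under two encodings, and how a parser that guesses the wrong encoding misreads it. It illustrates the general issue only; it does not describe any particular browser or CGI script.

```python
from urllib.parse import urlencode, unquote

# Hypothetical form data; the field name and value are illustrative only.
form = {"q": "café"}

# The same field serialized as if the page were UTF-8 or Latin-1.
utf8_query = urlencode(form, encoding="utf-8")
latin1_query = urlencode(form, encoding="latin-1")

print(utf8_query)    # q=caf%C3%A9
print(latin1_query)  # q=caf%E9

# A hand-written parser that assumes Latin-1 decodes the UTF-8 bytes
# one byte per character, so the accented letter turns into two.
value = utf8_query.split("=", 1)[1]
misread = unquote(value, encoding="latin-1")
print(misread)       # cafÃ©
```

A server-side parser written against one of these byte sequences can break if the other one arrives, which is one way a change in query canonicalization could surface as a compatibility problem.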
Received on Wednesday, 20 April 2011 06:59:10 UTC