Re: [whatwg] New URL Standard from Anne van Kesteren on 2012-09-24 (public-whatwg-archive@w3.org from September 2012)

On Tue, 23 Oct 2012, Mark Nottingham wrote:
> On 23/10/2012, at 9:35 AM, Ian Hickson <ian@hixie.ch> wrote:
> > 
> > Consensus isn't a value I hold highly, but review of Anne's work is 
> > welcome.
> > 
> > If the IETF community didn't want Anne to do this work, then the IETF 
> > community should have done it. Having not done it, having not even 
> > understood that the problem exists, means the IETF has lost the 
> > credibility it needs to claim that this is in the IETF's domain.
> > 
> > You don't get to claim authority over an area while at the same time 
> > telling someone else "please fix that" for the hard work that comes 
> > with that area. The reality is, he who does the hard work, gets the 
> > authority.
> 
> All very interesting, but please address the point that's now been made 
> repeatedly -- why is it necessary for you to redefine URIs, rather than 
> doing as we suggest?

What exactly do you suggest? 

Doing the work but at the IETF? See my reply to James.

Waiting for the IETF to do the work? We did, and timed out.

Not doing the work? That doesn't lead to interop.

Doing the work as a diff spec? That's what we did for a while, but it 
doesn't work. Having to reference three specs (pre-parse, IRI, URI) just 
to parse and resolve a URL makes implementors' lives harder, and so does 
not lead to interop.

What else do you suggest?


On Mon, 22 Oct 2012, Tim Bray wrote:
> >
> >    $ wget 'http://example.com/a b'
> >    --2012-10-23 00:27:43--  http://example.com/a%20b
> >
> >    # test.cgi returns a 301 with "Location: a b"
> >    $ curl -L http://damowmow.com/playground/demos/url/in-http-headers/test.cgi
> >    This file is: http://damowmow.com/playground/demos/url/in-http-headers/a%20b
> 
> Hmm.  I went to tbray.org and made a file at '$ROOT_DIR/tmp/a b' - note 
> the space.
> 
> Then I did
> 
> curl -I 'http://www.tbray.org/tmp/a%20b'
> curl -I 'http://www.tbray.org/tmp/a b'
> 
> Curl, quite properly, doesn't fuck with what I ask it

Instead it makes an invalid HTTP request. Your offensive language 
notwithstanding, that means wget and curl don't interoperate. This is bad. 
This is what we want to fix.
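
To make the divergence concrete, here is a rough sketch in Python 
(standard library only) of the escaping step that wget appears to apply 
and curl does not. The helper name escape_path is hypothetical, and this 
is an illustration of the behaviour shown above, not the code either 
tool actually runs:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_path(url: str) -> str:
    """Percent-encode path characters that may not appear raw in a request line.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # Leave '/' separators and existing '%' escapes alone; encode spaces
    # and other unsafe characters in the path.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(escape_path("http://example.com/a b"))  # http://example.com/a%20b
```

Because '%' is kept as a safe character, an already-escaped URL such as 
http://example.com/a%20b passes through unchanged.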


> and revealed a very interesting fact: That my Apache httpd returns 200 
> for both of these, but, with, uh, interesting variations, amounting to 
> what I think is quite possibly a bug.

How could it be a bug, since there's no spec that says how to handle a URL 
with spaces in it?


> I also pasted the version with the space into the nearest Web browser, 
> and it quite properly auto-corrected to a%20b.

Quite properly according to whom? There's no spec that defines this.


> I think it’s a bug that curl is claiming the 301 pointed at "a%20b" not 
> "a b".

You're wrong, but only because the de facto standard of "most software 
does it that way" says so. No IETF spec does. That's the problem.
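
For illustration, the de facto behaviour can be sketched in Python 
(standard library only). The function name resolve_location is 
hypothetical, and this is an assumption about what most clients do with 
such a header, not a quote from any spec: resolve the Location value 
against the request URL, then percent-encode the space.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit


def resolve_location(request_url: str, location: str) -> str:
    """Resolve a redirect target and escape unsafe path characters.

    Hypothetical helper; mirrors the observed curl output, not a spec.
    """
    # Step 1: resolve the (possibly relative) Location against the request URL.
    absolute = urljoin(request_url, location)
    # Step 2: percent-encode the path, keeping '/' and existing '%' escapes.
    parts = urlsplit(absolute)
    return urlunsplit(parts._replace(path=quote(parts.path, safe="/%")))


base = "http://damowmow.com/playground/demos/url/in-http-headers/test.cgi"
print(resolve_location(base, "a b"))
```

With the test.cgi URL above as the base, this yields the same a%20b URL 
that curl reported.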


> Because suppose it had pointed at "a%20b" - I don’t want middleware 
> lying to me.

What you want isn't really the issue. Compatibility with deployed code is 
the issue.


> It seems like a good idea to document the steps by which "a b" pasted in 
> becomes "a%20b" in the address bar. But I don’t see the relevance 
> outside human-authored strings.

All the strings in question are human-authored.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Monday, 22 October 2012 23:26:08 UTC