- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Tue, 16 Oct 2012 13:29:13 +0200
- To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Cc: Robin Berjon <robin@w3.org>, Ted Hardie <ted.ietf@gmail.com>, Larry Masinter <masinter@adobe.com>, "plh@w3.org" <plh@w3.org>, "Peter Saint-Andre (stpeter@stpeter.im)" <stpeter@stpeter.im>, "Pete Resnick (presnick@qualcomm.com)" <presnick@qualcomm.com>, "www-archive@w3.org" <www-archive@w3.org>, "Michael(tm) Smith" <mike@w3.org>
On Tue, Oct 16, 2012 at 7:36 AM, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
> Do we want to make sure that all other places that accept URIs or IRIs also
> accept a space and treat it the same? Maybe we would like to do so, but is
> it possible? Quite clearly no (just think HTTP request header).
>
> This essentially means that the fork is already here. In some sense, that's
> really bad news.

This makes no sense. For an incoming request you first look for CRLF, then split on SP, and only then can you start thinking about parsing the URL. (See http://tools.ietf.org/html/rfc2616#section-5.1 for referenced tokens.) And of course that URL cannot contain SP, but that does not mean you cannot parse it with the same parser that can deal with URLs containing SP.

I'm not arguing URLs should be allowed to contain SP, just that they can (and do) in certain contexts and that we need to deal with that (either by terminating processing or converting it to %20 or ignoring it in case of domain names, if I remember correctly).

The rest of your email did make sense to me :-)

(Though I should probably add I do plan on defining what a valid URL is too without reference. It seems cumbersome to have to look at a different document for that and from testing browsers/servers seem to exchange a wider set of characters than STD 66 allows, none of which are harmful.)

--
http://annevankesteren.nl/
Received on Tuesday, 16 October 2012 11:29:42 UTC
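
The request-line handling described in the message (find CRLF, split on SP, and only then hand the middle token to a URL parser) can be sketched roughly as follows. This is an illustrative sketch, not code from the thread or from any specification: the function names `parse_request_line` and `escape_spaces` are hypothetical, and the choice to reject a request line with extra SP, rather than recover from it, is an assumption made for the example.

```python
def parse_request_line(raw: bytes) -> tuple[str, str, str]:
    """Split an HTTP/1.1 Request-Line into (method, target, version).

    Hypothetical helper: follows the order described in the message,
    CRLF first, then SP, and leaves URL parsing to a later step.
    """
    # Step 1: look for CRLF; everything before it is the Request-Line.
    line, sep, _rest = raw.partition(b"\r\n")
    if not sep:
        raise ValueError("no CRLF found; request line is incomplete")

    # Step 2: split on SP. RFC 2616 section 5.1 defines three tokens
    # (Method SP Request-URI SP HTTP-Version), so a target containing
    # a raw SP produces more than three parts and is rejected here.
    parts = line.split(b" ")
    if len(parts) != 3:
        raise ValueError("expected exactly three SP-separated tokens")

    method, target, version = (p.decode("ascii") for p in parts)
    # Step 3 (not shown): only now is `target` passed to a URL parser.
    return method, target, version


def escape_spaces(url: str) -> str:
    """One possible way a lenient URL parser might handle SP in other
    contexts: convert each SP to %20 (assumed behaviour for this sketch)."""
    return url.replace(" ", "%20")
```

Because the splitting happens before URL parsing, the URL parser itself never sees an SP in this context, which is why the same lenient parser could in principle be reused for request targets and for inputs where SP does occur.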