Re: lower casing host names

On Fri, Dec 30, 2011 at 7:18 AM, Daniel Stenberg <daniel@haxx.se> wrote:
> I've walked into an issue I just wanted to bring here for attention and
> possibly some feedback.
>
> A curl user noticed that when he requested a URI with a mixed case host
> name, the site would redirect to the same host name with all letters in
> lower case. (And since curl would re-use the same connection and the same
> Host: header it resent the same mixed case request that again gets
> redirected... and it turns into a nice loop.)
>
> I consider treating the host name differently only based on different casing
> a protocol violation of the target server/software in question, but curious
> about this I tried out chrome and firefox to see how they handle this case.
>
> It turns out both browsers always unconditionally lower case the host name
> in URIs so they never send HTTP requests with mixed case.
>
> Why do they do this?

IE has this behavior as well.  Safari does not, but it's definitely
the odd browser out in this regard.  I either wrote or reviewed a
patch that canonicalized host names to lower case for Safari, but it
ran into a problem with safari-extension URLs (don't ask).

To answer your specific question, Chrome has this behavior because IE
does.  Chrome's URL processing behavior is modeled as closely as possible
after IE (except in cases where IE's behavior is certifiably insane,
in which case it's modeled after Firefox).  In this case, because both
IE and Firefox agree, choosing this behavior was an easy call.  This
behavior is collectively stable and therefore unlikely to change in
the future.

> Is this behavior of treating names differently based on
> case common? If so, should httpbis mention it?
>
> Unfortunately this will now also push curl towards this behavior. We haven't
> yet decided how to act, but forcibly lowercasing made it work fine against
> this particular host where this issue arose...

It's likely to work fine for the vast majority of web sites on the
Internet (and in most intranets).  I suspect it's not really something
for HTTPbis to concern itself with, however.  The correct place for
this requirement is in the as-yet-non-existent URL specification
because host name canonicalization is visible in many places beyond
just the HTTP Host header.
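
As a rough illustration of what "lower casing the host name" means at
the URL level, here is a minimal Python sketch.  It is not what any
browser or curl actually implements; the function name lowercase_host
is made up for this example, and it assumes plain ASCII host names (no
IDNA handling).  Only the host part changes; userinfo, path and query
keep their case.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_host(url: str) -> str:
    """Return url with only its host name lower cased (hypothetical helper)."""
    parts = urlsplit(url)
    if not parts.netloc:
        # No authority component (e.g. a relative reference): nothing to do.
        return url
    # netloc has the form [userinfo@]host[:port]; userinfo is case
    # sensitive, so split it off and lower case only the host[:port] part.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(lowercase_host("http://WWW.Example.COM/Some/Path?Q=1"))
```

A client that applies this before building the request would send the
lower cased name in the Host header as well, which is the case that
started this thread.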

Adam

Received on Saturday, 31 December 2011 00:15:32 UTC