Re: non ascii character in headers?

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 03 Mar 2010 11:51:46 +0100
Message-ID: <4B8E3F42.4050604@gmx.de>
To: www-talk@w3.org

On 03.03.2010 10:41, Reinier Post wrote:
> On Tue, Mar 02, 2010 at 02:17:13PM +0100, Julian Reschke wrote:
>> On 02.03.2010 00:49, Brendan Miller wrote:
>>> I'm looking at a possible bug in my company's HTTP handling library.
>>> The code seems to assume that there are no bytes with the higher order
>>> bit set in the http Location header. I'm thinking this will break if
>>> the Location header's URI contains non-ascii characters.
>>
>> In which case it wouldn't be a valid URI.
>>
>>> Is my thinking correct, or is there some rule that prohibits non-ascii
>>> chars in an http header?
>>
>> Valid URIs never contain non-ASCII characters.
>
> This is not true, see section 2.1 of the spec:
>
>    http://www.ietf.org/rfc/rfc2396.txt

"In local or regional contexts and with improving technology, users 
might benefit from being able to use a wider range of characters; such 
use is not defined by this specification. Percent-encoded octets 
(Section 2.1) may be used within a URI to represent characters outside 
the range of the US-ASCII coded character set if this representation is 
allowed by the scheme or by the protocol element in which the URI is 
referenced. Such a definition should specify the character encoding used 
to map those characters to octets prior to being percent-encoded for the 
URI." -- <http://greenbytes.de/tech/webdav/rfc3986.html#rfc.section.1.2.1>
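
To illustrate the percent-encoding step described in that paragraph, here is a minimal Python sketch. It assumes UTF-8 is the character encoding chosen by the scheme or protocol element (the RFC text leaves that choice to the referencing specification). The function name and the sample path are hypothetical.

```python
from urllib.parse import quote, unquote


def percent_encode_path(path: str) -> str:
    """Map characters to UTF-8 octets, then percent-encode the octets
    that are not unreserved URI characters.

    Assumption: UTF-8 is the agreed encoding. "/" is kept as-is so the
    path segments stay separated.
    """
    return quote(path, safe="/", encoding="utf-8")


# Hypothetical path containing two non-ASCII characters.
original = "/caf\u00e9/r\u00e9sum\u00e9.html"
encoded = percent_encode_path(original)

print(encoded)                                        # ASCII-only form
print(unquote(encoded, encoding="utf-8") == original)  # round trip
```

The encoded form contains only ASCII characters, so it can be placed in a Location header without any byte having its high-order bit set.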

>> That doesn't mean it doesn't happen in the wild, though.
>
> Well, a lot of code that handles URIs will assume they're ASCII.

And it's correct in doing so (just check the ABNF).
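
As a rough illustration of what the ABNF permits, the following Python sketch checks a string against the character repertoire of RFC 3986 (unreserved, gen-delims, sub-delims, and "%"). It is a character-level check only, not a full parse of the URI grammar, and the names used are hypothetical.

```python
import string

# Character classes from the RFC 3986 ABNF.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="

# "%" is included because it introduces a percent-encoded octet.
URI_CHARS = frozenset(UNRESERVED + GEN_DELIMS + SUB_DELIMS + "%")


def uses_only_uri_characters(value: str) -> bool:
    """Return True if every character is one the URI ABNF can produce.

    Assumption: this checks the repertoire only; it does not verify
    that the string is a well-formed URI (e.g. that "%" is followed
    by two hex digits).
    """
    return all(ch in URI_CHARS for ch in value)


print(uses_only_uri_characters("http://example.com/caf%C3%A9"))
print(uses_only_uri_characters("http://example.com/caf\u00e9"))
```

Every character in that set is US-ASCII, which is why code that treats a valid URI as ASCII is not making an error.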

Best regards, Julian
Received on Wednesday, 3 March 2010 10:52:25 GMT
