
Re: PROPOSAL: i74: Encoding for non-ASCII headers

From: Jamie Lokier <jamie@shareable.org>
Date: Thu, 3 Apr 2008 21:16:45 +0100
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Mark Nottingham <mnot@mnot.net>, "Roy T. Fielding" <fielding@gbiv.com>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20080403201645.GA31170@shareable.org>

I agree with everything Julian said, except:

Julian Reschke wrote:
> TEXT already allows C1 controls (and always did) 
> (<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-latest.html#rule.TEXT>):
> 
>   TEXT           = %x20-7E | %x80-FF | LWS
>                  ; any OCTET except CTLs, but including LWS
> 
> That being said, I'd like it to exclude C1 controls.

If C1 controls _in the form of octets %x80-9F_ are excluded, and HTTP
agents begin to reject TEXT containing those octets, it will be harder
to transition to UTF-8 later.  (In case you'd forgotten, UTF-8 uses
those octet values inside the multi-byte sequences for normal characters.)

In other words, rejecting %x80-9F pretty much commits the high octets
of TEXT to representing iso-8859-1.  Even if nobody sends those octets
(as seems to be the case now), there may be agents which don't send
them but would start rejecting %x80-9F if an HTTP RFC recommended it.
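
To make that concrete, here is a minimal Python sketch. The helper name
has_c1_range_octets is hypothetical, not taken from any HTTP spec or
library. It shows that the same name contains no octets in %x80-9F when
sent as iso-8859-1, but does once it is encoded as UTF-8:

```python
def has_c1_range_octets(data: bytes) -> bool:
    """Return True if any octet falls in the range 0x80-0x9F.

    Hypothetical check, roughly what an agent would do if it
    rejected those octets in TEXT at the byte level.
    """
    return any(0x80 <= octet <= 0x9F for octet in data)


# "\u00c9mile" is "Emile" with a capital E-acute (U+00C9).
name = "\u00c9mile"

as_latin1 = name.encode("iso-8859-1")  # b'\xc9mile'
as_utf8 = name.encode("utf-8")         # b'\xc3\x89mile'

print(has_c1_range_octets(as_latin1))  # False: 0xC9 is outside 0x80-0x9F
print(has_c1_range_octets(as_utf8))    # True: 0x89 is a UTF-8 continuation byte
```

So an agent applying the octet-level rule would accept the iso-8859-1
form and reject the UTF-8 form of a perfectly ordinary character.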

Excluding C1 controls _encoded as UTF-8_ is quite reasonable.  But
then, there are lots of other controls one might wish to exclude too -
preferably by saying senders SHOULD NOT send them, but perhaps
receivers shouldn't reject them.
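
As a rough illustration of the difference, the following Python sketch
looks for C1 controls only after decoding the value as UTF-8, i.e. it
checks for the characters U+0080 to U+009F rather than the raw octets.
The function name is made up for this example, and it assumes the value
is valid UTF-8 (decoding raises UnicodeDecodeError otherwise):

```python
def contains_c1_controls(value: bytes) -> bool:
    """Return True if the UTF-8 decoded value contains a C1 control.

    Hypothetical helper: C1 controls are the code points
    U+0080 through U+009F, checked per character, not per octet.
    """
    text = value.decode("utf-8")
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


# An ordinary accented name: contains octet 0x89, but no C1 control.
print(contains_c1_controls("\u00c9mile".encode("utf-8")))  # False

# U+0085 (NEL, a C1 control) encoded as UTF-8 is the octets C2 85.
print(contains_c1_controls(b"a\xc2\x85b"))                 # True
```

A sender could use a check like this to avoid emitting such characters,
without receivers having to reject anything at the octet level.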

-- Jamie
Received on Thursday, 3 April 2008 20:17:30 GMT
