- From: Jamie Lokier <jamie@shareable.org>
- Date: Thu, 3 Apr 2008 21:16:45 +0100
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: Mark Nottingham <mnot@mnot.net>, "Roy T. Fielding" <fielding@gbiv.com>, HTTP Working Group <ietf-http-wg@w3.org>
I agree with everything Julian said, except:

Julian Reschke wrote:
> TEXT already allows C1 controls (and always did)
> (<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-latest.html#rule.TEXT>):
>
>    TEXT = %x20-7E | %x80-FF | LWS
>         ; any OCTET except CTLs, but including LWS
>
> That being said, I'd like it to exclude C1 controls.

If C1 controls _in the form of octets %x80-9F_ are excluded, and HTTP
agents begin to reject TEXT containing those octets, it will be harder
to transition to UTF-8 later. (In case you'd forgotten, UTF-8 uses
those octet values as parts of the encodings of normal characters.)

In other words, rejecting %x80-9F pretty much commits the high octets
of TEXT to representing iso-8859-1. Even if nobody uses those octets
(as seems to be the case now), there may be agents which don't send
them, but which would reject %x80-9F if an HTTP RFC recommended it.

Excluding C1 controls _encoded as UTF-8_ is quite reasonable. But
then, there are lots of other controls one might wish to exclude too -
preferably by saying senders SHOULD NOT send them, but perhaps
receivers shouldn't reject them.

-- Jamie
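To illustrate the difference between the two kinds of exclusion, here is a minimal Python sketch. The function names are made up for this example and are not taken from any HTTP draft or library. It compares rejecting raw octets %x80-9F with rejecting C1 controls (U+0080..U+009F) after decoding as UTF-8, assuming the input is valid UTF-8.

```python
def has_c1_octets(data: bytes) -> bool:
    """True if any raw octet falls in %x80-9F (hypothetical helper)."""
    return any(0x80 <= b <= 0x9F for b in data)


def has_utf8_c1_controls(data: bytes) -> bool:
    """True if data, decoded as UTF-8, contains a character in
    U+0080..U+009F (hypothetical helper; assumes valid UTF-8 input)."""
    text = data.decode("utf-8")
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


# An ordinary character: the euro sign encodes to E2 82 AC in UTF-8,
# and the middle octet 0x82 lies in %x80-9F.
euro = "\u20ac".encode("utf-8")

# A genuine C1 control: NEL (U+0085) encodes to C2 85 in UTF-8.
nel = "\u0085".encode("utf-8")

for label, data in (("euro sign", euro), ("NEL control", nel)):
    print(label, data.hex(),
          "octet rule rejects:", has_c1_octets(data),
          "UTF-8 rule rejects:", has_utf8_c1_controls(data))
```

Under these assumptions, the octet-level rule rejects the euro sign even though it is not a control character, while the UTF-8-level rule rejects only the NEL control.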
Received on Thursday, 3 April 2008 20:17:30 UTC