I agree with everything Julian said, except:

Julian Reschke wrote:
> TEXT already allows C1 controls (and always did)
> (<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-latest.html#rule.TEXT>):
>
>    TEXT = %x20-7E | %x80-FF | LWS
>         ; any OCTET except CTLs, but including LWS
>
> That being said, I'd like it to exclude C1 controls.

If C1 controls _in the form of octets %x80-9F_ are excluded, and HTTP agents begin to reject TEXT containing those octets, it will be harder to transition to UTF-8 later. (In case you'd forgotten, UTF-8 uses those octet values in the encodings of normal characters.)

In other words, rejecting %x80-9F pretty much commits the high octets of TEXT to representing ISO-8859-1. Even if nobody uses those octets (as seems to be the case now), there may be agents which don't send them, but which would reject %x80-9F if an HTTP RFC recommended it.

Excluding C1 controls _encoded as UTF-8_ is quite reasonable. But then, there are lots of other controls one might wish to exclude too - preferably by saying senders SHOULD NOT send them, but perhaps receivers shouldn't reject them.

-- Jamie

Received on Thursday, 3 April 2008 20:17:30 GMT
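
As a rough illustration of the distinction drawn in the message between the raw octets %x80-9F and C1 controls encoded as UTF-8, here is a minimal Python sketch. The helper names `c1_range_octets` and `has_c1_controls` are hypothetical and exist only for this example; they are not part of any HTTP specification or library.

```python
def c1_range_octets(data: bytes) -> list[int]:
    """Return the octets of `data` that fall in the range %x80-9F."""
    return [b for b in data if 0x80 <= b <= 0x9F]


def has_c1_controls(text: str) -> bool:
    """Return True if `text` contains a C1 control character (U+0080-U+009F)."""
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


# The euro sign is an ordinary character, not a control,
# yet its UTF-8 encoding (E2 82 AC) contains an octet in %x80-9F.
euro = "\u20ac"
euro_utf8 = euro.encode("utf-8")
print(euro_utf8.hex(" "), [hex(b) for b in c1_range_octets(euro_utf8)])
print(has_c1_controls(euro))

# U+0085 (NEL) is a genuine C1 control; in UTF-8 it is encoded as C2 85.
nel = "\u0085"
print(nel.encode("utf-8").hex(" "), has_c1_controls(nel))
```

A receiver that rejected on the octet test would refuse the euro sign, while one that decoded first and then applied the character test would refuse only the real control.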