Re: Factoring out Content-Disposition (i123)

William A. Rowe, Jr. wrote:

>> As long as that doesn't change Latin-1 is the only 
>> permitted form of any non-ASCII octets in HTTP/1.1
>> headers.
 
> I'm becoming very confused.

Sorry, I should have written that I'm talking about
raw / unencoded octets (bytes, decimal 128..255).

As soon as you encode text in a charset that uses
these octets, e.g., with RFC 2047 / 2231 techniques,
you can of course use other charsets.

In the case of the "slugtext" encoding the input can
only be (raw) UTF-8, and it arrives as percent-encoded
(ASCII) output in the HTTP header field.

The RFC 2047 / 2231 encodings also arrive at ASCII
output for input in any charset.  Output limited to
bytes decimal 32..126 is no problem wrt HTTP: that
is a proper subset of ASCII, and ASCII is a proper
subset of Latin-1 => "good".
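
To make the "good" case concrete, here is a minimal
Python sketch using only the standard library.  The
file name is a made-up example, and quote() with
safe="" is only an approximation of slugtext (it
escapes more than slugtext strictly requires).

```python
from email.header import Header
from email.utils import encode_rfc2231
from urllib.parse import quote

# Hypothetical file name containing one non-ASCII character.
name = "naïve.txt"

def is_printable_ascii(s):
    """True if every character is in the range decimal 32..126."""
    return all(32 <= ord(c) <= 126 for c in s)

# RFC 2047 encoded-word (the library picks B or Q encoding).
rfc2047 = Header(name, charset="utf-8").encode()

# RFC 2231 extended parameter value: charset'language'percent-encoded.
rfc2231 = encode_rfc2231(name, charset="utf-8")

# Percent-encoded UTF-8, roughly in the style of slugtext.
slug = quote(name, safe="")

print(rfc2231)   # utf-8''na%C3%AFve.txt
print(slug)      # na%C3%AFve.txt
print(is_printable_ascii(name))                                  # False
print(all(map(is_printable_ascii, (rfc2047, rfc2231, slug))))    # True
```

All three outputs stay within bytes 32..126, whatever
the charset of the input was.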

However, raw unencoded UTF-8 is not a subset of
Latin-1 (its non-ASCII characters are multi-octet
sequences that mean something else when read as
Latin-1) => "bad".
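
A small Python sketch of the "bad" case, assuming a
recipient that interprets the header octets as
Latin-1 (the sample word is my own):

```python
# Hypothetical sample text with one non-ASCII character.
text = "café"

# Raw, unencoded UTF-8 octets as a sender might put them on the wire.
raw = text.encode("utf-8")
print(raw)       # b'caf\xc3\xa9'

# A recipient assuming Latin-1 sees two characters instead of one.
misread = raw.decode("latin-1")
print(misread)   # cafÃ©
```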

Last but not least, raw unencoded Latin-1 is "ugly",
especially when it is really something else
(windows-1252 or worse), as Brian said.
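
The "ugly" case can be sketched the same way.  This
assumes a sender that labels or implies Latin-1 but
actually emits windows-1252; the euro sign is just
one convenient example of the 128..159 range where
the two differ.

```python
import unicodedata

# windows-1252 assigns printable characters to octets 128..159.
octets = "€".encode("cp1252")
print(octets)    # b'\x80'

# Read as real Latin-1, the same octet is a C1 control character.
as_latin1 = octets.decode("latin-1")
print(unicodedata.category(as_latin1))   # Cc
```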

 Frank

Received on Saturday, 16 August 2008 08:20:12 UTC