Re: Factoring out Content-Disposition (i123), was: Content-Disposition (new issue?)

Brian Smith wrote:
> I don't think UTF-16 support is worthwhile. I think it isn't an issue if the
> intention is to support only very short, non-prose text like filenames (see
> below). It seems there is some agreement that HTTP headers should not
> contain human-oriented text. My concern is that having a separate standard
> ...

I agree that it's a good thing to avoid human-readable text in headers.
But I really don't see how that rule can always be followed.

> for RFC2231 in HTTP will promote the idea of human-oriented text in headers
> instead of discouraging it.

RFC2231 encoding already *is* used in HTTP, and it's used in a situation 
where the data *can't* be in the entity body.

The main purpose of clarifying RFC2231 for use in HTTP is to get the two
UA vendors who don't "get" it to finally implement it correctly.
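
For illustration, here is a minimal Python sketch of how a sender might
produce an RFC2231-style extended parameter for a filename. The function
names are hypothetical, and the set of characters left unescaped is an
assumption: a conservative subset of RFC2231's attribute-char, so anything
outside it is simply percent-encoded.

```python
from urllib.parse import quote

# Assumption: characters left unescaped, in addition to the letters,
# digits and "_.-~" that quote() never escapes. This is a conservative
# subset of RFC2231's attribute-char.
ATTR_CHARS = "!#$&+-.^_`|~"


def encode_ext_value(value, charset="UTF-8", language=""):
    """Build charset'language'percent-encoded-octets (hypothetical helper)."""
    # Encode to octets in the chosen charset first, then percent-encode
    # every octet that is not in the safe set.
    encoded = quote(value.encode(charset), safe=ATTR_CHARS)
    return f"{charset}'{language}'{encoded}"


def content_disposition(filename):
    """Return a Content-Disposition value using the extended 'filename*' form."""
    return "attachment; filename*=" + encode_ext_value(filename)


print(content_disposition("€ rates.txt"))
```

The point is that the non-ASCII filename travels in the header itself, as
plain ASCII, which is exactly the case where the data can't be moved into
the entity body.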

> ...
> Language tagging, BIDI, and accessibility features are not really necessary
> for the specific case of filenames. Those issues come into play when you try
> to define a general-purpose mechanism for supporting human-oriented text.
> For example, RFC 2231 only allows a language tag for the entire parameter
> value, but doesn't provide a means of handling mixed-language text.
> ...

Yes, that's a restriction.
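
To make the restriction concrete, here is a minimal Python sketch of
decoding such a value. The helper name is hypothetical and error handling is
left out. Note that the syntax has exactly one language slot, which covers
the whole decoded string.

```python
from urllib.parse import unquote


def decode_ext_value(ext_value):
    """Split charset'language'value and decode it (hypothetical helper).

    Returns (text, language); language is None when the tag is empty.
    """
    # A literal "'" cannot appear unescaped in the encoded value, so
    # splitting on the first two apostrophes is sufficient.
    charset, language, encoded = ext_value.split("'", 2)
    text = unquote(encoded, encoding=charset)
    return text, (language or None)


print(decode_ext_value("UTF-8'en'%E2%82%AC%20rates.txt"))
```

There is no way to tag only part of the value with a different language;
for a filename that seems acceptable.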

>>> Nitpicks:
>>> The draft references Unicode 4.0 indirectly through 
>>> RFC3629. It would be better to allow implementations to use
>>> any later versions, or at least the current version, 5.1.
>> Yes, that's a nit, isn't it :-).
> Yes, but this issue seems to always come up when specifications reference
> Unicode documents.

If you feel strongly about it, you may want to start working on getting 
RFC3629 revised.

>>> I don't see the point of requiring ISO-8859-1. ISO-8859-1 can only 
>>> encode a very small number of languages that are used by a small 
>>> minority of people (who just happen to be over-represented in 
>>> standards committees). Advocating
>>> ISO-8859-1 also seems to be the opposite of what was 
>>> discussed at the IETF meeting (AFAICT from the logs).
>> I originally wanted to mandate UTF-8 only, but people pointed
>> out (rightfully) that any HTTP software already needs to
>> understand ISO-8859-1, so it really doesn't make a difference.
> Judging from Roy's response, it looks like software won't have to understand
> more than ASCII, though they will have to tolerate non-ASCII bytes
> (presumably, regardless of whether those bytes can be decoded into valid
> characters in any encoding). Historically, ISO-8859-1 seems to be very
> difficult for implementers to get right since Windows-1252 and other similar
> encodings are often sent as ISO-8859-1.

That's true. Not sure what point you're trying to make, though? Do not 
mention ISO-8859-1 because some implementors confuse it with different 

BR, Julian

Received on Saturday, 16 August 2008 07:58:08 UTC