- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sat, 16 Aug 2008 09:57:21 +0200
- To: Brian Smith <brian@briansmith.org>
- CC: ietf-http-wg@w3.org
Brian Smith wrote:
> I don't think UTF-16 support is worthwhile. I think it isn't an issue if the
> intention is to support only very short, non-prose text like filenames (see
> below). It seems there is some agreement that HTTP headers should not
> contain human-oriented text. My concern is that having a separate standard
> ...

I agree that it's a good thing to avoid human-readable text in headers. But I really don't see how to always follow that rule.

> for RFC2231 in HTTP will promote the idea of human-oriented text in headers
> instead of discouraging it.

RFC2231 encoding already *is* used in HTTP, and it's used in a situation where the data *can't* be in the entity body. The main purpose of clarifying RFC2231 for use in HTTP is to get the two UA vendors who don't "get" it to finally implement it correctly.

> ...
> Language tagging, BIDI, and accessibility features are not really necessary
> for the specific case of filenames. Those issues come into play when you try
> to define a general-purpose mechanism for supporting human-oriented text.
> For example, RFC 2231 only allows a language tag for the entire parameter
> value, but doesn't provide a means of handling mixed-language text.
> ...

Yes, that's a restriction.

>>> Nitpicks:
>>>
>>> The draft references Unicode 4.0 indirectly through
>>> RFC3629. It would be better to allow implementations to use
>>> any later versions, or at least the current version, 5.1.
>> Yes, that's a nit, isn't it :-).
>
> Yes, but this issue seems to always come up when specifications reference
> Unicode documents.

If you feel strongly about it, you may want to start working on getting RFC3629 revised.

>>> I don't see the point of requiring ISO-8859-1. ISO-8859-1 can only
>>> encode a very small number of languages that are used by a small
>>> minority of people (who just happen to be over-represented in
>>> standards committees). Advocating
>>> ISO-8859-1 also seems to be the opposite of what was
>>> discussed at the IETF meeting (AFAICT from the logs).
>> I originally wanted to mandate UTF-8 only, but people pointed
>> out (rightfully), that any HTTP software already needs to
>> understand ISO-8859-1, so it really doesn't make a difference.
>
> Judging from Roy's response, it looks like software won't have to understand
> more than ASCII, though they will have to tolerate non-ASCII bytes
> (presumably, regardless of whether those bytes can be decoded into valid
> characters in any encoding). Historically, ISO-8859-1 seems to be very
> difficult for implementers to get right since Windows-1252 and other similar
> encodings are often sent as ISO-8859-1.

That's true. Not sure what point you're trying to make, though? Should we not mention ISO-8859-1 because some implementors confuse it with different encodings?

BR, Julian
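
To make the two technical points above concrete, here is a minimal Python sketch of an RFC 2231 extended parameter value (charset'language'percent-encoded octets), followed by the ISO-8859-1 versus Windows-1252 mismatch. The helper names encode_ext_value and decode_ext_value, the filename, and the "en" tag are illustrative assumptions, not anything taken from the draft, and the sketch ignores RFC 2231 parameter continuations.

```python
from urllib.parse import quote, unquote_to_bytes


def encode_ext_value(text, charset="UTF-8", language=""):
    """Hypothetical helper: build charset'language'percent-encoded-octets."""
    # safe="" leaves only letters, digits and "_.-~" unescaped.
    return f"{charset}'{language}'{quote(text, safe='', encoding=charset)}"


def decode_ext_value(value):
    """Hypothetical helper: return (text, language) for an extended value.

    A single language tag applies to the whole value, which is the
    restriction discussed in the thread.
    """
    charset, language, encoded = value.split("'", 2)
    return unquote_to_bytes(encoded).decode(charset), language


# Illustrative filename parameter with a non-ASCII character.
param = "filename*=" + encode_ext_value("€ rates.txt", language="en")
print(param)

text, lang = decode_ext_value(param.split("=", 1)[1])
print(text, lang)

# Mislabeling: octet 0x80 is a C1 control character in ISO-8859-1,
# but the euro sign in Windows-1252 (Python codec name "cp1252").
as_latin1 = decode_ext_value("ISO-8859-1''%80")[0]
as_cp1252 = unquote_to_bytes("%80").decode("cp1252")
print(repr(as_latin1), repr(as_cp1252))
```

A receiver that trusts the ISO-8859-1 label on Windows-1252 data ends up with the control character rather than the intended euro sign, which is the interoperability problem Brian describes.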
Received on Saturday, 16 August 2008 07:58:08 UTC