- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Mon, 06 Aug 2012 16:21:03 +0900
- To: Roberto Peon <grmocg@gmail.com>
- CC: James M Snell <jasnell@gmail.com>, Mike Belshe <mike@belshe.com>, Jonathan Ballard <dzonatas@gmail.com>, Poul-Henning Kamp <phk@phk.freebsd.dk>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
On 2012/08/04 2:33, Roberto Peon wrote:

> I'm biased against utf-8, because it is trivial at the application layer
> to write a function which encodes and/or decodes it.

It may be trivial to write such functions once, but it is a total waste
of time to write them over and over.

> I see that handling utf-8 adds complexity

What complexity?

> to the protocol but buys the protocol nothing.

It doesn't buy the protocol itself much. But it buys the users of the
protocol a lot.

> It adds minimal advantage for the entities using the protocol, and makes
> intermediaries' lives more difficult since they'll have to do more
> verification.
>
> Saying that the protocol handles sending a length-delimited string or a
> string guaranteed not to include '\n' would be fine, however, as at that
> point whatever encoding is used in any particular header value is a
> matter of the client-app and server, as it should be for things that the
> protocol doesn't need to know about.

No, it is not fine. First, for most headers, interoperability should be
between all clients and all servers, not negotiated pairwise. Second, it
is absolutely no fun for client-app developers to solve the same character
encoding problem again and again. It is just useless work, prone to errors.

If you were told today that the Host header may be ASCII or EBCDIC, and
that the choice is just between your client and your server, what would
you say?

Regards,
Martin.
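P.S.: To make the "trivial at the application layer" point concrete, here
is a rough, illustrative sketch of the UTF-8 validity check that every
client, server, and intermediary would otherwise have to rewrite for
itself. (Plain C; the name utf8_valid and the exact shape are assumptions
for illustration, not taken from any particular implementation.)

#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch only. Returns 1 if buf[0..len) is well-formed
 * UTF-8, 0 otherwise. Rejects truncated sequences, overlong encodings,
 * surrogate code points (U+D800..U+DFFF), and anything above U+10FFFF. */
int utf8_valid(const uint8_t *buf, size_t len)
{
    size_t i = 0;
    while (i < len) {
        uint8_t b = buf[i];
        if (b < 0x80) {                        /* 1 byte: U+0000..U+007F */
            i += 1;
        } else if ((b & 0xE0) == 0xC0) {       /* 2 bytes: U+0080..U+07FF */
            if (i + 1 >= len || (buf[i+1] & 0xC0) != 0x80)
                return 0;
            if (b < 0xC2)                      /* 0xC0/0xC1: overlong */
                return 0;
            i += 2;
        } else if ((b & 0xF0) == 0xE0) {       /* 3 bytes: U+0800..U+FFFF */
            if (i + 2 >= len ||
                (buf[i+1] & 0xC0) != 0x80 || (buf[i+2] & 0xC0) != 0x80)
                return 0;
            if (b == 0xE0 && buf[i+1] < 0xA0)  /* overlong */
                return 0;
            if (b == 0xED && buf[i+1] >= 0xA0) /* surrogate range */
                return 0;
            i += 3;
        } else if ((b & 0xF8) == 0xF0) {       /* 4 bytes: U+10000..U+10FFFF */
            if (i + 3 >= len ||
                (buf[i+1] & 0xC0) != 0x80 || (buf[i+2] & 0xC0) != 0x80 ||
                (buf[i+3] & 0xC0) != 0x80)
                return 0;
            if (b == 0xF0 && buf[i+1] < 0x90)  /* overlong */
                return 0;
            if (b == 0xF4 && buf[i+1] >= 0x90) /* above U+10FFFF */
                return 0;
            if (b > 0xF4)                      /* 0xF5..0xF7 never valid */
                return 0;
            i += 4;
        } else {                               /* lone continuation byte,
                                                  or 0xF8..0xFF */
            return 0;
        }
    }
    return 1;
}

Multiply the subtle cases above (truncation, overlong forms, the surrogate
range, the U+10FFFF ceiling) by every team that writes its own copy, and
"trivial" quickly turns into "error-prone".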
Received on Monday, 6 August 2012 07:21:35 UTC