- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Tue, 17 Jul 2012 16:44:39 +0900
- To: James M Snell <jasnell@gmail.com>
- CC: ietf-http-wg@w3.org
Hello James,

On 2012/07/14 7:16, James M Snell wrote:

> +------------------------------------+
> |0| Flags (7) |     Length (24)      |
> +------------------------------------+
> | ID | Value Length (int32) |Value...|
> +------------------------------------+
>
> The ID is a 32-bit number uniquely identifying the registered field. Each
> is assigned by the registrar. For instance, the "Host" field could have a
> registered value of "1", the "Accept-Lang" field could have a registered
> value of "6", and so forth.
>
> The Value Length is a 32-bit value indicating the length of the value.

I agree with Poul-Henning that this is way too long.

> If Flag 0x1 is set, the value is assumed to contain character data. When
> set, the value MUST be preceded by a single unsigned 8-bit integer
> identifying the character encoding utilized. The values are assigned by the
> registrar. For instance, US-ASCII could have a registered value of "1",
> while "UTF-8" could have a registered value of "2".

The IANA charset registry already has MIBenum numbers. It would reduce
registry effort a lot if these could be reused. More than one number for
one and the same thing will create confusion. Unfortunately, these
numbers are in the range of 0-3000 or so, which doesn't fit into 8 bits.

But a much, much better solution in this day and age is to only allow
one encoding, UTF-8. That by definition includes US-ASCII, covers all
the world's characters, and is what HTML is moving towards (with quite
surprising speed these days). And while in HTML (and other content
formats), non-ASCII is extremely widespread, in HTTP, it is not, and
having more than one encoding is needlessly complicated.

Regards,   Martin.
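
As a rough illustration of the two points above, here is a minimal Python sketch. The MIBenum values are taken from the IANA charset registry; the helper names (`fits_in_uint8`, `encode_field`) and the simplified field layout (32-bit ID, 32-bit length, UTF-8 value, no charset byte) are assumptions made for this example only and are not part of any proposal.

```python
import struct

# A few MIBenum values from the IANA charset registry.
MIBENUM = {
    "US-ASCII": 3,
    "ISO-8859-1": 4,
    "UTF-8": 106,
    "windows-1252": 2252,
}


def fits_in_uint8(number: int) -> bool:
    """Return True if the number can be stored in one unsigned byte."""
    return 0 <= number <= 0xFF


def encode_field(field_id: int, value: str) -> bytes:
    """Hypothetical field encoding with UTF-8 as the only charset.

    Layout assumed here: 32-bit ID, 32-bit value length, then the value
    bytes, all in network byte order. No charset identifier is needed
    because the encoding is fixed.
    """
    data = value.encode("utf-8")
    return struct.pack("!II", field_id, len(data)) + data


if __name__ == "__main__":
    # Some registered MIBenum values do not fit into a single byte.
    for name, number in MIBENUM.items():
        print(name, number, fits_in_uint8(number))

    # Pure ASCII text yields identical bytes under US-ASCII and UTF-8.
    print("example.org".encode("ascii") == "example.org".encode("utf-8"))

    # Field ID 1 for "Host" follows the hypothetical registration quoted above.
    print(encode_field(1, "example.org").hex())
```

Because ASCII-only values encode to the same bytes either way, fixing the encoding to UTF-8 costs nothing for today's header values while removing the per-value charset byte.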
Received on Tuesday, 17 July 2012 07:45:18 UTC