Re: UTF-8 or ASCII Header Names?

On 2013/08/17 1:49, James M Snell wrote:
> On Fri, Aug 16, 2013 at 9:29 AM, Roberto Peon<grmocg@gmail.com>  wrote:
>> I view it as liberating-- as the compressor is now freed from worrying about
>> normalization, etc. which, if done, should be done at a higher layer.
>>
>
> FWIW, I don't believe anyone had said anything about normalization...
> valid UTF-8 octets, yes, but not normalization. The compression
> mechanism is really not affected by whether or not we say UTF-8
> here...

Yes indeed, this should be clearly layered. Compression can operate on
the raw bytes (as long as we make sure we have a compression method that
works well for UTF-8, and even better for ASCII).

Checking for valid UTF-8 can be done at the end. Please note that this
check can be really fast; I'd guess something like two to three machine
instructions per byte. Still, we don't need to do it when a proxy just
moves the headers from left to right.
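
To make concrete what "checking for valid UTF-8" involves, here is a
rough sketch in Python (chosen for readability, not speed). The function
name is_valid_utf8 is only illustrative, and this is one possible way to
write the check, following the well-formedness table in RFC 3629. The
instruction count guessed above would apply to a tuned native
implementation, not to this code.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 as defined by RFC 3629."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # ASCII fast path: a single comparison per byte.
            i += 1
            continue
        # Pick the number of continuation bytes and the allowed range of
        # the first one. The narrowed ranges reject overlong encodings,
        # surrogates (U+D800..U+DFFF) and code points above U+10FFFF.
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif b == 0xE0:
            need, lo, hi = 2, 0xA0, 0xBF
        elif 0xE1 <= b <= 0xEC or 0xEE <= b <= 0xEF:
            need, lo, hi = 2, 0x80, 0xBF
        elif b == 0xED:
            need, lo, hi = 2, 0x80, 0x9F
        elif b == 0xF0:
            need, lo, hi = 3, 0x90, 0xBF
        elif 0xF1 <= b <= 0xF3:
            need, lo, hi = 3, 0x80, 0xBF
        elif b == 0xF4:
            need, lo, hi = 3, 0x80, 0x8F
        else:
            # 0x80..0xC1 and 0xF5..0xFF can never start a sequence.
            return False
        if i + need >= n:
            return False  # sequence is truncated
        if not lo <= data[i + 1] <= hi:
            return False
        # Any remaining continuation bytes must be in 0x80..0xBF.
        for j in range(i + 2, i + need + 1):
            if not 0x80 <= data[j] <= 0xBF:
                return False
        i += need + 1
    return True
```

A receiver would call this once on each decompressed header name or
value; a proxy that only forwards the bytes would not call it at all.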

Unicode normalization is then one level higher again, and depends on the
header in question, as I already wrote.
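
As a small illustration of "depends on the header in question", the
sketch below keeps a per-header policy table and leaves everything else
untouched. The header names and the choice of normalization forms are
purely hypothetical, made up for this example; only
unicodedata.normalize is a real (Python standard library) function.

```python
import unicodedata

# Hypothetical policy table: lowercase header name -> normalization form.
# Headers not listed here are passed through unchanged.
NORMALIZATION_POLICY = {
    "x-display-name": "NFC",
    "x-search-key": "NFKC",
}

def normalize_header_value(name: str, value: str) -> str:
    """Apply the normalization form configured for this header, if any."""
    form = NORMALIZATION_POLICY.get(name.lower())
    if form is None:
        return value  # no policy for this header: leave the value alone
    return unicodedata.normalize(form, value)
```

The point is only that this decision lives above both compression and
UTF-8 validation, in whatever code understands the individual header.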

Regards,    Martin.

Received on Saturday, 17 August 2013 13:13:17 UTC