- From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- Date: Sat, 17 Aug 2013 22:17:50 +0900
- To: Roberto Peon <grmocg@gmail.com>
- CC: James M Snell <jasnell@gmail.com>, Martin Thomson <martin.thomson@gmail.com>, Fred Akalin <akalin@google.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
On 2013/08/17 1:57, Roberto Peon wrote:

> In addition to compressing the bytestrings, the compressor will have to
> validate utf-8. Nearly the same complexity as normalization (which was
> proposed earlier) to me-- I now get to scan things yet another time,

Sorry, but that's not true. You can get close to it being true for ASCII-only data, but that's about it. Checking for UTF-8 validity is a very small state machine (around 10 states) looking at one byte at a time, and it can only succeed or fail. Normalization needs lots of data (a few tens of kilobytes) for lookup, may need a buffer of indefinite length, may lengthen or shorten the data, and so on.

> increasing CPU utilization.. for what? Basically nothing in return if the
> upper-level doesn't care about it.
>
> If the upper-level cares about it, then it should be a prereq of feeding
> something into the compressor. If not, then it shouldn't be. Either way,
> these concerns belong outside the compressor.

I agree that this should be outside the compressor.

Regards, Martin.
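
For illustration, the kind of small state machine meant above for UTF-8 validity checking might look like the sketch below. It is not code from this thread: the state names and the functions `step` and `is_valid_utf8` are made up for the example, and the byte ranges are assumed to follow the well-formed sequence table in RFC 3629. It reads one byte at a time, keeps no buffer, and can only succeed or fail.

```python
# Hypothetical state names: nine states in total, including the reject state.
ACCEPT, REJECT, TAIL1, TAIL2, TAIL3, E0, ED, F0, F4 = range(9)


def _cont(byte, lo, hi, next_state):
    """Move to next_state if byte lies in [lo, hi], otherwise reject."""
    return next_state if lo <= byte <= hi else REJECT


def step(state, byte):
    """Advance the validator by one byte and return the new state."""
    if state == ACCEPT:
        if byte <= 0x7F:
            return ACCEPT              # ASCII
        if 0xC2 <= byte <= 0xDF:
            return TAIL1               # 2-byte lead (C0, C1 would be overlong)
        if byte == 0xE0:
            return E0                  # 3-byte lead, must exclude overlongs
        if byte == 0xED:
            return ED                  # 3-byte lead, must exclude surrogates
        if 0xE1 <= byte <= 0xEF:
            return TAIL2               # remaining 3-byte leads
        if byte == 0xF0:
            return F0                  # 4-byte lead, must exclude overlongs
        if 0xF1 <= byte <= 0xF3:
            return TAIL3               # plain 4-byte leads
        if byte == 0xF4:
            return F4                  # 4-byte lead, must stay <= U+10FFFF
        return REJECT                  # stray continuation byte or invalid lead
    if state == TAIL1:
        return _cont(byte, 0x80, 0xBF, ACCEPT)
    if state == TAIL2:
        return _cont(byte, 0x80, 0xBF, TAIL1)
    if state == TAIL3:
        return _cont(byte, 0x80, 0xBF, TAIL2)
    if state == E0:
        return _cont(byte, 0xA0, 0xBF, TAIL1)
    if state == ED:
        return _cont(byte, 0x80, 0x9F, TAIL1)
    if state == F0:
        return _cont(byte, 0x90, 0xBF, TAIL2)
    if state == F4:
        return _cont(byte, 0x80, 0x8F, TAIL2)
    return REJECT


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8; single pass, constant memory."""
    state = ACCEPT
    for byte in data:
        state = step(state, byte)
        if state == REJECT:
            return False
    # Ending in the middle of a multi-byte sequence is also invalid.
    return state == ACCEPT
```

For example, `is_valid_utf8("Dürst".encode("utf-8"))` succeeds, while the overlong sequence `b"\xc0\xaf"` and the encoded surrogate `b"\xed\xa0\x80"` both fail. Normalization, by contrast, cannot be written this way, since it needs lookup tables and may change the length of the data.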
Received on Saturday, 17 August 2013 13:18:40 UTC