- From: Greg Wilkins <gregw@intalio.com>
- Date: Sat, 2 Aug 2014 12:22:52 +1000
- To: Roberto Peon <grmocg@gmail.com>
- Cc: Jeff Pinner <jpinner@twitter.com>, Jason Greene <jason.greene@redhat.com>, HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAH_y2NFF4nmTWAsg+sxgC5=QkrVDcDkpGQJ=SHG6Dkh0RbJ3bA@mail.gmail.com>
On 2 August 2014 10:53, Roberto Peon <grmocg@gmail.com> wrote:

> I don't recall that particular change being something I saw consensus
> about, and certainly something that the discussion (at least AFAICT) didn't
> have a resolution for.

I think that this was a detail left to the editor once consensus was declared on the removal of the reference set. Like dropping the copy etc. There was discussion about which index was best to be low and I provided some data on the public address set, but in the end it was the editor's call, I think.

> I think that the change is likely to decrease compressor efficiency, and it
> *certainly* will for any algorithm which searches the state context first
> to see if there are any exact matches.

Can we move beyond "I think"? This is something for which it is possible to obtain real numbers. This was done on a publicly available data set that had been used as the basis for removing the RefSet in the first place, and it indicated very little change either way with regard to the index.

> I'll revert it as soon as possible.

ummmm do you have the authority to do that?

More detailed replies below.

> In order to know which index one must use, one must scan the entire table
> (didn't used to be the case with the reference set, but, that is gone) and
> look for matches for each header. If one scans the static table first, then
> one very likely wastes CPU. One can optimize a bit by only doing so if the
> length of the value is < max_static_table_value_length

Hash, don't scan! For my own implementation there is no lookup difference between having the static indexes low or high. But there is a benefit to having the static headers at fixed indexes, as I can pre-generate the bytes.

> The new approach affords increased efficiency to the first request, at the
> cost of decreased efficiency to any subsequent request.

Can you back up that assertion with any real numbers?
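To illustrate the "hash, don't scan" point, here is a minimal sketch of a hash-based index lookup. It is not the Jetty code: the class name `HeaderIndex`, the three-entry table and its index values are hypothetical and chosen only for illustration. It assumes the indexed header field representation is a single byte with the high bit set and the index in the low 7 bits, which holds for indexes below 127.

```python
# Hypothetical sketch: map fields and names to indexes with dicts,
# so lookup cost does not depend on table size or on index order.

# Tiny illustrative subset; the index values here are NOT the real static table's.
STATIC_ENTRIES = [(":method", "GET"), (":method", "POST"), (":path", "/")]


class HeaderIndex:
    def __init__(self, static_entries):
        self.field_to_index = {}  # (name, value) -> index, for exact matches
        self.name_to_index = {}   # name -> first index carrying that name
        self.encoded = {}         # (name, value) -> pre-generated bytes
        for i, (name, value) in enumerate(static_entries, start=1):
            self.field_to_index.setdefault((name, value), i)
            self.name_to_index.setdefault(name, i)
            # Fixed indexes allow the encoded form to be built once, up front.
            self.encoded.setdefault((name, value), bytes([0x80 | i]))

    def lookup(self, name, value):
        """Return (index, exact_match); index is None when nothing matches."""
        idx = self.field_to_index.get((name, value))
        if idx is not None:
            return idx, True
        # Fall back to a name-only match, as in "fields and then names".
        return self.name_to_index.get(name), False
```

Both lookups are dictionary accesses, so in this sketch neither the table size nor the position of the static entries affects the lookup cost.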
It is certainly not the case for my own implementation, as it does hash lookups for fields and then names, so neither the size of the table nor the length of the index is a factor in either. You need to create at least 65 unique fields before there is any additional data cost, which I would then suggest is a tiny fraction of the cost of sending 65 unique fields in the first place.

But show me a data set that is a good general case and that indicates having the indexes the other way around is better, and if we are to make any breaking changes after -14 then I'll support swapping back.

> Actually, it doesn't afford any efficiency to the first request.
> It affords more efficiency IFF more replacement happens than referencing
> and replacement isn't done from the header table.

Again, can you show any actual numbers? For my own implementation, having the static indexes low is a marginal improvement (same lookups but less branching). In terms of data efficiency you need 65 custom indexed fields before it makes any difference.

Numbers, please.

-- 
Greg Wilkins <gregw@intalio.com>
http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
http://www.webtide.com advice and support for jetty and cometd.
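The figure of 65 fields quoted above can be checked with a short calculation. This is a sketch under two assumptions that are not stated in the message itself: the static table has 61 entries (its size in the published HPACK specification; the draft under discussion may differ), and an indexed header field carries its index as a 7-bit prefix integer. The function names are hypothetical.

```python
STATIC_TABLE_SIZE = 61      # assumption: size of the published HPACK static table
PREFIX_MAX = (1 << 7) - 1   # 127: a 7-bit prefix encodes 0..126 in one byte


def indexed_field_length(index):
    """Bytes needed to encode an indexed header field (7-bit prefix integer)."""
    if index < PREFIX_MAX:
        return 1
    length = 1                  # the prefix byte, all ones
    rest = index - PREFIX_MAX
    while rest >= 128:          # each continuation byte carries 7 bits
        rest //= 128
        length += 1
    return length + 1           # final byte without the continuation flag


def dynamic_index(position):
    """Index of the position-th dynamic entry when static entries come first."""
    return STATIC_TABLE_SIZE + position


# Count dynamic entries that still fit in a single byte.
one_byte_entries = sum(
    1 for p in range(1, 200) if indexed_field_length(dynamic_index(p)) == 1
)
print(one_byte_entries)  # 65
```

Under these assumptions, dynamic entries 1 to 65 occupy indexes 62 to 126 and still encode in one byte; only the 66th and later entries need a second byte.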
Received on Saturday, 2 August 2014 02:23:21 UTC