Re: Draft finding - "Transitioning the Web to HTTPS"

Eric J. Bowman wrote, back on 19 December [sorry for slow reply!]:

> Henry S. Thompson wrote:
>> 
>> Some non-anecdotal evidence, albeit still subject to varying
>> interpretations, is available in a talk summarising my analysis of 
>> two sets of cache-logs, from June 2013 and June 2014:
>> 
>>   http://www.ltg.ed.ac.uk/~ht/HST_noREST.pdf
>> 
>> Start at slide 13 and stop after slide 15 if you're not interested in
>> my critique of REST, but just want to see the numbers.
>> 
> ...
> Except the conneg stuff. Are you really saying nobody compresses HTTP
> payloads on the wire? Because that's a real-world instance of conneg I
> highly doubt nobody uses. Personally, I cache compressed content and
> unzip it on the fly, to save CPU on the Celerons driving the budget
> webhosting world, which finally got around to Vx and threading but still
> aren't up to the task of ubiquitous HTTPS any more than the SPARC T1.
>
> What forms of conneg were you looking for, but apparently didn't find?

The Squid logs I was working with don't contain any
request or response headers, just the response status code.  The only
evidence available of what 2616 [1] calls "server-driven negotiation"
and 7231 [2] calls "proactive negotiation" is a 406 (Not Acceptable)
response, indicating that the server has no representation satisfying
the Accept... headers in a request.  I found only a handful of 406
responses, none of which appeared to be actually responding to an
attempt at conneg.  Note that this kind of conneg is what I think most
people, including you, understand by "content negotiation" -- it's
certainly what the TAG's _WebArch_ [3] and _Alternative
Representations_ [4] are discussing.  Somewhat surprisingly (to me at
least), it's also clearly recommended _against_ by 2616 and 7231.

What they _recommend_ is what 2616 calls "agent-driven" [5] and 7231
"reactive" [6] conneg.  This involves a server responding to a GET
with a 300 Multiple Choices response, from which a user agent then
selects, either automatically or by reference to a human.  Presence of
300 responses in the log would then constitute unequivocal evidence of
"reactive" conneg.  But in fact what there is turns out to actually be
evidence _against_ (conformant) conneg.  _All_ the examples in the log
were generated by Apache's mod_speling [sic] module, offering "common
basename", "character missing" or "mistyped character" hypotheses
about failures to find a requested URI.
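
For anyone who wants to run the same kind of check on their own logs,
the 406 and 300 counts can be pulled out along these lines.  This is a
minimal sketch, not the script used for the analysis above: it assumes
Squid's native access.log layout, where the fourth whitespace-separated
field is the cache result code and the HTTP status joined by a slash
(e.g. "TCP_MISS/200").  The function name and the sample lines are
invented for illustration.

```python
from collections import Counter


def tally_statuses(lines):
    """Count HTTP status codes in Squid native-format access log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines rather than guessing at them.
        if len(fields) < 4 or "/" not in fields[3]:
            continue
        # Field 4 looks like "TCP_MISS/200": cache result, then HTTP status.
        status = fields[3].rsplit("/", 1)[1]
        counts[status] += 1
    return counts


# Hypothetical sample lines, made up for this example.
sample = [
    "1402000000.123     45 10.0.0.1 TCP_MISS/200 1234 GET http://example.org/a - DIRECT/192.0.2.1 text/html",
    "1402000001.456     12 10.0.0.2 TCP_MISS/406 310 GET http://example.org/b - DIRECT/192.0.2.1 text/html",
    "1402000002.789     30 10.0.0.3 TCP_MISS/300 512 GET http://example.org/indx.html - DIRECT/192.0.2.1 text/html",
    "1402000003.012     28 10.0.0.4 TCP_MISS/300 498 GET http://example.org/abuot - DIRECT/192.0.2.1 text/html",
    "1402000004.345      5 10.0.0.5 TCP_HIT/200 2048 GET http://example.org/a - NONE/- text/html",
]

counts = tally_statuses(sample)
print("406 responses:", counts["406"])
print("300 responses:", counts["300"])
```

A count like this only says how many 406 and 300 responses there were;
deciding whether any of them reflects real conneg still means looking
at the individual requests.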

In conclusion, there is concrete evidence that servers do _not_
implement what the RFCs recommend, and indirect evidence that there's
very little of what I at least _thought_ was the kind of conneg
clients and servers _did_ use.

To be clear and careful, if I understand you correctly you might be
right -- there might be a huge amount of traffic consisting of
"Accept-Encoding: gzip" requests from clients and gzip-encoded
responses from servers -- such successes will not be detectable as
such in the logs.  All I can conclude from the lack of conformant 406
responses in the logs is that clients were rarely if ever sending
Accept-Encoding headers which _require_ compression to servers that
couldn't comply.
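
To make "_require_ compression" concrete: as I read 7231 section 5.3.4,
an unencoded ("identity") response stays acceptable unless the header
excludes it, either with "identity;q=0" or with "*;q=0" and no separate
entry for identity.  The following is a rough sketch of that test under
that reading; the function name is invented, and malformed q-values are
simply treated as the default of 1.

```python
def identity_forbidden(accept_encoding):
    """Return True if the Accept-Encoding value rules out an unencoded response."""
    qvalues = {}
    for item in accept_encoding.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0  # default weight when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    pass  # assumption: ignore a malformed q-value
        qvalues[coding.strip().lower()] = q

    # An explicit entry for identity takes precedence over the wildcard.
    if "identity" in qvalues:
        return qvalues["identity"] == 0
    # Otherwise identity is excluded only by a wildcard with q=0.
    return qvalues.get("*", 1.0) == 0


for header in ["gzip", "gzip, identity;q=0", "gzip, *;q=0", "*;q=0, identity;q=0.5"]:
    print(repr(header), "->", identity_forbidden(header))
```

Only requests for which this comes out true could have provoked a 406
from a server unable to compress, which is why ordinary
"Accept-Encoding: gzip" traffic leaves no trace in status codes.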

ht

[1] http://tools.ietf.org/html/rfc2616#section-12.1
[2] http://tools.ietf.org/html/rfc7231#section-3.4.1
[3] http://www.w3.org/TR/webarch/#def-coneg
[4] http://www.w3.org/2001/tag/doc/alternatives-discovery.html#id2261787
[5] http://tools.ietf.org/html/rfc2616#section-12.2
[6] http://tools.ietf.org/html/rfc7231#section-3.4.2
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]

Received on Tuesday, 17 February 2015 12:12:18 UTC