The Refresh header is still with us

Hi friends!

The other day someone filed a bug[1] on curl saying that we don't support 
redirects with the Refresh header. This took me down a rabbit hole and I 
figured I would share with you what I learned down there.

As you all know, redirects in HTTP are specified to use 3xx response codes 
and a Location: header to point out the URL (I'll use the term URL here but 
you know what I mean). This has been the case since RFC 1945 (HTTP/1.0). 
According to an old mail[2] from Roy, Refresh "didn't make it" into that spec.

The little detail that it never made it into the 1.0 spec (nor any later one) 
doesn't seem to have affected the browsers. Still today, browsers keep 
supporting[3] the Refresh header as a sort of Location: replacement even 
though it seems to never have been present in an HTTP spec.

How frequent is the use of the Refresh header? I decided to try to figure 
that out, and for this venture I used the Rapid7 data trove[4]. The method 
used to collect that data may not be the best, but it still covers 52+ 
million HTTP responses from different current HTTP servers (52254873 exactly 
in my data dump).

My counts show:

  - Location is used in 18.49% of the responses
  - Refresh is used in 0.01738% of the responses
  - Location is thus used 1064 times more often than Refresh
  - In 35% of the cases when Refresh is used, Location is *also* used
  - curl thus handles roughly 99.94% of the redirects in this test
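
For anyone who wants to redo the arithmetic, here is a small Python sketch 
that works from the rounded percentages in the list above. The function names 
are made up for illustration, and it assumes that "redirects" means responses 
carrying Location plus responses carrying only Refresh. That definition is an 
assumption, not something the raw counts spell out.

```python
# Rounded figures quoted in the list above.
TOTAL_RESPONSES = 52_254_873
LOCATION_PCT = 18.49      # % of responses with a Location header
REFRESH_PCT = 0.01738     # % of responses with a Refresh header
BOTH_SHARE = 0.35         # share of Refresh responses that also have Location


def location_to_refresh_ratio(location_pct, refresh_pct):
    """How many times more often Location shows up than Refresh."""
    return location_pct / refresh_pct


def refresh_only_pct(refresh_pct, both_share):
    """% of all responses that carry Refresh but no Location."""
    return refresh_pct * (1.0 - both_share)


def location_share_of_redirects(location_pct, refresh_pct, both_share):
    """% of redirects that a Location-only client can follow.

    Assumption: redirects = Location responses + Refresh-only responses.
    """
    only = refresh_only_pct(refresh_pct, both_share)
    return 100.0 * location_pct / (location_pct + only)


if __name__ == "__main__":
    ratio = location_to_refresh_ratio(LOCATION_PCT, REFRESH_PCT)
    share = location_share_of_redirects(LOCATION_PCT, REFRESH_PCT, BOTH_SHARE)
    refresh_count = TOTAL_RESPONSES * REFRESH_PCT / 100.0
    print(f"Location vs Refresh ratio: {ratio:.0f}")
    print(f"Approximate Refresh responses: {refresh_count:.0f}")
    print(f"Share handled via Location: {share:.4f}%")
```

Since the inputs are rounded, the last digits of the output should not be 
taken too seriously.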

Other random notes:

  - When Refresh is the only redirect header, the response code is usually 200
    (with 404 being the second most common)
  - When both headers are used, the response code is almost always 30x
  - When both are used, it is common to redirect to the same target and it is
    also common for the Refresh header value to only contain a number (for
    the number of seconds until "refresh").
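
To illustrate the two shapes of Refresh values mentioned above, here is a 
minimal Python sketch of a parser. It assumes the commonly seen 
"seconds; url=target" form, plus the bare-number form. Since the header is 
not defined in an HTTP spec, that syntax is an assumption based on what 
servers tend to send, and parse_refresh is a hypothetical helper, not 
anything curl ships.

```python
import re

# Assumed syntax: DELAY [ (";" or ",") [ "url=" ] TARGET ]
_REFRESH_RE = re.compile(
    r"""^\s*(\d+)            # delay in whole seconds
        (?:\s*[;,]\s*        # separator between delay and target
           (?:url\s*=\s*)?   # optional url= prefix, any letter case
           (.*?)             # the target itself, possibly quoted
        )?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_refresh(value):
    """Return (delay_seconds, target_or_None), or None if unparsable."""
    m = _REFRESH_RE.match(value)
    if m is None:
        return None
    delay = int(m.group(1))
    target = m.group(2)
    if target:
        # Drop surrounding quotes that some servers add around the target.
        target = target.strip("\"'")
    return delay, (target or None)


if __name__ == "__main__":
    for sample in ("5", "0; url=https://example.com/", "3;URL='/next'", "soon"):
        print(repr(sample), "->", parse_refresh(sample))
```

A value that is only a number comes back with no target, which matches the 
"reload this same page after N seconds" case rather than a redirect.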

Redirects in contents:

Redirects can also be done by meta tags, sending the refresh that way, but I 
have not investigated how common that is. It isn't strictly speaking HTTP, so 
it is outside of my research (and interest) here.

Conclusion:

Nah, sorry, I don't have any. Yet another undocumented quirky corner of the 
web I suppose.

[1] = https://github.com/curl/curl/issues/3657
[2] = https://lists.w3.org/Archives/Public/ietf-http-wg-old/1996MayAug/0594.html
[3] = http://www.otsukare.info/2015/03/26/refresh-http-header
[4] = https://opendata.rapid7.com/sonar.http/

-- 

  / daniel.haxx.se

Received on Monday, 11 March 2019 09:28:37 UTC