Mobile headers from Martin Nilsson on 2013-10-11 (ietf-http-wg@w3.org from October to December 2013)

From: Martin Nilsson <nilsson@opera.com>
Date: Fri, 11 Oct 2013 18:42:30 +0200
To: ietf-http-wg@w3.org
Message-ID: <op.w4st04ciiw9drz@uranium>

So I've finally scrubbed the mobile HTTP header captures that I mentioned
earlier. The capture was made on one of our Opera Mini download servers in
March. The request headers were then parsed and the most broken requests
removed. After that, all requests coming from internal systems, monitoring
and Opera Mini servers were removed. I'm aware that there are some cases
where the headers are not parsed correctly because the line-break style
changes between different header lines. Finally the headers were scrubbed:
everything resembling an IP number or a longer sequence of digits is
replaced with random, similar-looking data. The header values of
y-msisdn, referer, authorization and proxy-authorization are also replaced
with random data, and all request payload data is stripped. The resulting
data is written into a JSON structure with one object per request, where
lower-case header names map to a string, or to an array of strings in the
case of multiple headers. Order is preserved within the array, but not
amongst the different headers (they are serialized sorted). The file
contains 203586 requests and is 154MB uncompressed, 20MB compressed.
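
To make the scrubbing step concrete, here is a rough Python sketch of that kind
of replacement. It is not the script that produced the file: the regular
expressions, the six-digit threshold for a "longer" sequence, and the names
scrub_value and REPLACED_HEADERS are all assumptions made for illustration.

```python
import random
import re
import string

# Assumed patterns; the exact rules used for the published file are not stated.
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
LONG_DIGITS_RE = re.compile(r"\d{6,}")  # "longer" is guessed to mean 6+ digits

# Headers whose whole value is replaced, as listed in the message.
REPLACED_HEADERS = {"y-msisdn", "referer", "authorization", "proxy-authorization"}


def _random_digits(match):
    """Replace each digit with a random digit, keeping dots and length."""
    return "".join(
        random.choice(string.digits) if ch.isdigit() else ch
        for ch in match.group(0)
    )


def scrub_value(name, value):
    """Return a scrubbed copy of one header value (hypothetical helper)."""
    if name in REPLACED_HEADERS:
        # Replace the whole value with random letters of the same length.
        return "".join(random.choice(string.ascii_lowercase) for _ in value)
    value = IP_RE.sub(_random_digits, value)
    return LONG_DIGITS_RE.sub(_random_digits, value)
```

Values that contain neither an IP-like token nor a long digit run pass through
unchanged, which matches the description that only resembling data is replaced.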

http://people.opera.com/nilsson/headers.json.gz
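
Because a header maps to either a string or an array of strings, consumers
probably want to normalise values before counting anything. The Python sketch
below shows one way to do that. The helper names are made up for illustration,
and load_requests assumes the file decodes as a single JSON document holding a
sequence of request objects, which the description above does not spell out.

```python
import gzip
import json
from collections import Counter


def load_requests(path):
    """Load the capture, assuming one JSON document with a list of objects."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return json.load(fh)


def header_values(request, name):
    """Return all values of a header as a list, in their preserved order."""
    value = request.get(name.lower())
    if value is None:
        return []
    if isinstance(value, str):
        return [value]  # single header: stored as a plain string
    return list(value)  # repeated header: stored as an array of strings


def count_header_names(requests):
    """Count how many requests carry each header name at least once."""
    counts = Counter()
    for request in requests:
        counts.update(request.keys())
    return counts
```

For example, count_header_names(load_requests("headers.json.gz")).most_common(20)
would list the twenty most common header names, if the layout assumption holds.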

Please report any issues or concerns.

/Martin Nilsson

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Received on Friday, 11 October 2013 16:42:57 UTC