- From: Martin Nilsson <nilsson@opera.com>
- Date: Fri, 11 Oct 2013 18:42:30 +0200
- To: ietf-http-wg@w3.org
So I've finally scrubbed the mobile HTTP header captures that I mentioned earlier. The capture was made on one of our Opera Mini download servers in March.

The request headers were then parsed and the most broken requests removed. After that, all requests coming from internal systems, monitoring and Opera Mini servers were removed. I'm aware of some cases where the headers are not parsed correctly because the line-break style keeps changing between different header lines.

Finally, the headers were scrubbed: everything resembling an IP address or a longer sequence of digits was replaced with random, similar-looking data. The header values of y-msisdn, referer, authorization and proxy-authorization were also replaced with random data. All request payload data was stripped as well.

The resulting data is written into a JSON structure with one object per request, where each lower-case header name maps to a string, or to an array of strings when the header occurs multiple times. Order is preserved within the array, but not amongst the different headers (they are serialized sorted).

The file contains 203586 requests and is 154MB uncompressed, 20MB compressed.

http://people.opera.com/nilsson/headers.json.gz

Please report any issues or concerns.

/Martin Nilsson

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
Received on Friday, 11 October 2013 16:42:57 UTC
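
The per-request shape described in the message (a lower-case header name mapping to either a string or an array of strings) can be flattened into name/value pairs with something like the following. This is only a minimal sketch: the function name `iter_header_values` and the sample object are made up for illustration, and how the objects are arranged at the top level of the file is not stated in the message, so the sketch deliberately handles a single request object only.

```python
import json


def iter_header_values(request):
    """Yield (name, value) pairs from one request object.

    Assumes the shape described in the message: each lower-case header
    name maps to a string, or to a list of strings for repeated headers.
    """
    for name, value in request.items():
        # Treat a single string as a one-element list so both cases
        # go through the same loop.
        values = value if isinstance(value, list) else [value]
        for v in values:
            yield name, v


# Hypothetical sample object, not taken from the actual capture.
sample = json.loads(
    '{"accept": "*/*", "cookie": ["a=1", "b=2"], "host": "example.com"}'
)
pairs = list(iter_header_values(sample))
```

Because headers are serialized sorted, the pairs come out grouped by header name; only the relative order of repeated values for the same header reflects the original request.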