URLs with missing charset for certain TLDs

Re: http://krijnhoetmer.nl/irc-logs/whatwg/20131122#l-401

Data set: 2013-09-01, http://webdevdata.org

$ find ./ -name "*.html.txt.hdr.txt" -print0 | xargs -0 -n1 -P8 grep -ELi \
    'Content-Type\s*:\s*text/html\s*;\s*charset\s*=' >> \
    ../no-charset-in-header.txt
$ cd ..
$ python find-missing-charsets.py > hsivonen-missing-charset.txt
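
For readers who want to reproduce the check without grep, here is a rough Python sketch of the same test. It is not the find-missing-charsets.py script run above, whose contents are not shown here. The function names, the (url, header_text) input shape, and the "last label of the host" notion of a TLD are all assumptions made for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Same pattern as the grep -E call above; re.IGNORECASE mirrors grep's -i.
CHARSET_RE = re.compile(
    r"Content-Type\s*:\s*text/html\s*;\s*charset\s*=", re.IGNORECASE
)


def has_charset(header_text):
    """True if any header line declares a charset on a text/html Content-Type."""
    # grep matches line by line, so test each line separately.
    return any(CHARSET_RE.search(line) for line in header_text.splitlines())


def tld_of(url):
    """Hypothetical helper: last dot-separated label of the URL's host."""
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def count_missing_by_tld(records):
    """Count (url, header_text) pairs lacking a charset, grouped by TLD."""
    return Counter(
        tld_of(url) for url, headers in records if not has_charset(headers)
    )
```

As with grep -L, a response with no Content-Type header at all, or with a type other than text/html, also counts as "missing" in this sketch.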

-- 
Simon Pieters
Opera Software

Received on Friday, 22 November 2013 14:45:14 UTC