On Fri, Mar 7, 2008 at 8:08 PM, Brian Wilson <bloo@blooberry.com> wrote:
>
> MAMA (the name of my tool) found ~420 URLs out of about 3.5 million
> tried with xhtml 1.0 and application/xhtml+xml. Not nearly as many as
> your above URL space found.

Interesting. I'd attribute the difference to the fact that MAMA was choosing arbitrary URLs while Nikita is directed to sites created by people that are more standards-aware than average.

> I'd like to check for UA and other types of HTTP request header
> discrimination in a future crawl.

Do you mean request the same URL multiple times with different UA headers? I like that idea.

-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

Received on Sunday, 9 March 2008 15:23:27 UTC
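The idea of requesting the same URL several times with different UA headers could be sketched roughly as below. This is only an illustration under stated assumptions, not code from MAMA or Nikita: the User-Agent strings and the names `fetch` and `detect_ua_discrimination` are hypothetical, and the comparison looks only at the status code and Content-Type, which is the part relevant to the application/xhtml+xml question.

```python
import urllib.request

# Hypothetical User-Agent strings, chosen for illustration only.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (compatible; ExampleBrowser/1.0)",
    "crawler": "ExampleCrawler/1.0",
}


def fetch(url, user_agent, timeout=10):
    """Request url with the given User-Agent; return (status, content_type).

    Note: urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    which a real crawl would need to catch and record.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return (resp.status, resp.headers.get_content_type())


def detect_ua_discrimination(url, user_agents=USER_AGENTS, fetcher=fetch):
    """Fetch url once per User-Agent and report whether the answers differ.

    Returns (differs, results), where results maps each label in
    user_agents to the (status, content_type) pair seen for it.
    """
    results = {name: fetcher(url, ua) for name, ua in user_agents.items()}
    differs = len(set(results.values())) > 1
    return differs, results
```

Calling `detect_ua_discrimination("http://example.com/")` would issue one request per entry in `USER_AGENTS`. The `fetcher` parameter exists so the comparison can be tried without network access, and the same approach could be extended to other request headers such as Accept.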