Re: Bug 85/4494 (keeping track of validation statistics for various purposes)

On Fri, Mar 7, 2008 at 8:08 PM, Brian Wilson <bloo@blooberry.com> wrote:
>
>  MAMA (the name of my tool) found ~420 URLs out of about 3.5 million
>  tried with xhtml 1.0 and application/xhtml+xml. Not nearly as many as
>  your above URL space found.

Interesting. I'd attribute the difference to the fact that MAMA was
choosing arbitrary URLs, while Nikita is directed to sites created by
people who are more standards-aware than average.

>  I'd like to check for UA and other types of HTTP request header
>  discrimination in a future crawl.

Do you mean requesting the same URL multiple times with different
User-Agent headers? I like that idea.
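
If so, here is a rough sketch of how such a probe might look. It is only
an illustration, not how MAMA or Nikita actually works: the function
names and the User-Agent strings are made up for the example, and it
compares nothing but the Content-Type that comes back for each UA.

```python
import urllib.request
from typing import Callable, Dict

# Hypothetical User-Agent strings, invented for this example only.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "crawler": "ExampleCrawler/0.1",
}


def fetch_content_type(url: str, user_agent: str) -> str:
    """Request url with the given User-Agent and return the media type."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        # get_content_type() returns e.g. "text/html" without parameters.
        return response.headers.get_content_type()


def probe(
    url: str,
    fetch: Callable[[str, str], str] = fetch_content_type,
    user_agents: Dict[str, str] = USER_AGENTS,
) -> Dict[str, str]:
    """Fetch the same URL once per User-Agent and record each media type."""
    return {label: fetch(url, ua) for label, ua in user_agents.items()}


def discriminates(results: Dict[str, str]) -> bool:
    """True if the server returned different media types to different UAs."""
    return len(set(results.values())) > 1
```

Calling probe() on a URL and passing the result to discriminates() would
flag servers that send, say, application/xhtml+xml to one UA and
text/html to another. The same approach could presumably be extended to
other request headers such as Accept.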


-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more

Received on Sunday, 9 March 2008 15:23:27 UTC