Re: Barewords in on* attributes, redux (also, find() and company)

From: Simon Pieters <simonp@opera.com>
Date: Wed, 14 Dec 2011 09:01:57 +0100
To: public-webapps@w3.org, "Boris Zbarsky" <bzbarsky@mit.edu>
Message-ID: <op.v6gy9jqlidj3kv@simon-pieterss-macbook.local>

On Wed, 14 Dec 2011 08:36:44 +0100, Boris Zbarsky <bzbarsky@mit.edu> wrote:

> John Jensen here at Mozilla has been doing some web crawling trying to  
> find what barewords are used in on* attributes.


> What I have so far as a result is a list of about 1.7 million barewords  
> used across several tens of thousands of pages.

Do you have a more accurate figure for the number of pages?

> If people are interested in the exact methodology, I can probably get a  
> description.

I'm interested. It's hard to draw conclusions from data without knowing
what the data is, how it is biased, and what false positives it might have.

> I'm working on making sure that it's ok for me to post the data in its
> entirety so you can all look as well.  Assuming it is (very likely),
> where's a good place to stick a 7MB compressed file?
>
> In any case, for this particular data set there are no hits on "findAll"
> or "matches" (good!), but there are two hits on "find" as a bareword in
> an on* attribute.  Specifically:
>
> 1)  http://otc-pif.rbc.ru/pif_calculator/calculator.jsp has
> onclick="find(document.getElementById(current + 'List').children,
> searchString.value)"
> 2)  http://bookmark.people.com.cn/index.html has onclick="find()"
>
> These would both obviously get broken by the proposed find() API, unless
> we actually do some sort of workaround for this problem...
>
> -Boris
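
For anyone who wants to run a similar check on their own pages, here is a
minimal sketch in Python. It is not the methodology used for the crawl
above, which has not been described yet. It assumes a bareword is an
identifier in an on* attribute value that is not a property access and is
not inside a string literal. The names OnAttrBarewords,
proposed_name_hits and PROPOSED_NAMES are made up for illustration. The
check matters because an inline handler resolves a bareword against the
element and the document before the global object, so a new find() method
on elements would shadow a page's own global find().

```python
import re
from html.parser import HTMLParser

# Names from the proposed API mentioned in this thread.
PROPOSED_NAMES = {"find", "findAll", "matches"}

# Assumption: a bareword is an identifier that is not preceded by "."
# (a property access) or by another identifier character.
BAREWORD = re.compile(r"(?<![\w$.])[A-Za-z_$][\w$]*")

# Single- or double-quoted JavaScript string literals, removed before
# scanning so that words inside strings are not counted.
STRING_LITERAL = re.compile(r"'(?:\\.|[^'\\])*'|\"(?:\\.|[^\"\\])*\"")


class OnAttrBarewords(HTMLParser):
    """Collects (tag, attribute, bareword) for every on* attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not name.startswith("on") or not value:
                continue
            code = STRING_LITERAL.sub(" ", value)
            # Note: JavaScript keywords such as "return" or "this" are
            # not filtered out in this sketch.
            for word in BAREWORD.findall(code):
                self.hits.append((tag, name, word))


def proposed_name_hits(html):
    """Returns the barewords in html that collide with PROPOSED_NAMES."""
    parser = OnAttrBarewords()
    parser.feed(html)
    parser.close()
    return [hit for hit in parser.hits if hit[2] in PROPOSED_NAMES]
```

Calling proposed_name_hits() on a page's markup returns one tuple per
colliding bareword. A regular expression is only an approximation of
JavaScript parsing, so this kind of scan can produce both false positives
and false negatives (comments and regex literals are not handled, for
instance).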

Simon Pieters
Opera Software
Received on Wednesday, 14 December 2011 08:02:38 UTC
