- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 9 Jan 2009 23:37:30 +0000 (UTC)
On Fri, 9 Jan 2009, Ben Adida wrote:
>
> Is inherent resistance to spam a condition (even a consideration) for
> HTML5?

We have to make sure that whatever we specify in HTML5 actually is going
to be useful for the purpose it is intended for. If a feature intended for
wide-scale automated data extraction is especially susceptible to spamming
attacks, then it is unlikely to be useful for wide-scale automated data
extraction.

> If so, where is the concern around <title>, which is clearly featured in
> search engine results?

Nobody is suggesting that user agents derive any behavior from <title>, so
it doesn't matter if <title> is spammed or not. The only effect would be
some spam in the user's session history. Furthermore, <title> is
page-wide, meaning that the actual page author would have to spam the page
for it to be spammed. It is less likely for a user to intentionally visit
a spammy page than for a user to visit a page that happens to contain
spammy content embedded within it (e.g. in blog comments).

If browsers were expected to crawl all pages for all links and then
populate the browser's interface with the most popular links, then one
would quickly expect everyone's browsers to be advertising Viagra, porn
sites, and the like. However, browsers don't do this kind of processing --
indeed, this kind of processing appears to be exactly what RDFa proponents
are trying to enable (though to what end, I'm still trying to find out,
since nobody has actually replied to all the questions I asked yet [1]).

Note that search engines aren't the problem here -- large operations like
search engines are quite capable of running the massive processing
required to filter spam. The problem is automated processing on the
client, where those resources aren't available.

[1] http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-December/018023.html

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 9 January 2009 15:37:30 UTC