Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages

Hi Tom,
taking our thread public is definitely a good idea.

My point is that, yes, it is tempting to tweak your invisible data to  
polish your ranking.

Example: In the visible HTML, you say the price is 100 USD; in the
RDFa, you say it is 50 USD.

However, from a computational perspective, it is very easy for Google
or anybody else to spot divergences between the visible and the
invisible content, and to punish pages that use such "semantic
black-hat SEO".

My main argument is that structured data also simplifies checking for  
black-hat SEO.

A very simple algorithm would be to flag pages whose visible part does
not contain, anywhere, the sequence of digits found in an invisible
gr:hasCurrencyValue property.
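
As a rough sketch of that check in Python (standard library only): the
names PriceCheckParser and flag_price_mismatches are made up for this
illustration, and the code assumes that the price sits in a "content"
attribute, that invisible markup is hidden with an inline
"display:none" style, and that the HTML is well nested. It is not a
description of what any search engine actually does.

```python
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PriceCheckParser(HTMLParser):
    """Collects visible text and declared gr:hasCurrencyValue values."""

    def __init__(self):
        super().__init__()
        self.visible_text = []
        self.declared_values = []
        self._hidden_depth = 0  # > 0 while inside a hidden subtree

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: the machine-readable price is given via content="...".
        if attrs.get("property") == "gr:hasCurrencyValue" and attrs.get("content"):
            self.declared_values.append(attrs["content"])
        if tag in VOID_TAGS:
            return
        style = (attrs.get("style") or "").replace(" ", "").lower()
        if self._hidden_depth or tag in ("script", "style") or "display:none" in style:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._hidden_depth:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if not self._hidden_depth:
            self.visible_text.append(data)


def flag_price_mismatches(html):
    """Return declared prices whose digit sequence is absent from the visible text."""
    parser = PriceCheckParser()
    parser.feed(html)
    parser.close()
    # Drop separators inside numbers so "1,299.00" and "1299.00" compare equal.
    visible = re.sub(r"(?<=\d)[.,](?=\d)", "", " ".join(parser.visible_text))
    mismatches = []
    for value in parser.declared_values:
        digits = re.sub(r"\D", "", value)
        # Match the digits as a whole number, not as part of a longer one.
        if digits and not re.search(r"(?<!\d)" + digits + r"(?!\d)", visible):
            mismatches.append(value)
    return mismatches
```

Calling flag_price_mismatches() on a page that shows "100 USD" but
declares content="50" returns ["50"]; a page whose markup matches the
visible price returns an empty list. The sketch compares digits only,
so "50" and "50.00" would count as different.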

Real algorithms for such checks will be more complex, but I hope this
gives an idea.

Bottom line: I think the fraud issue is overrated, since Google, Bing,
and Yahoo have a pretty strong instrument at hand: delisting sites
that use black-hat SEO.


Martin

On 07.10.2010, at 17:05, Thomas Steiner wrote:

> Hi Martin,
>
> We have discussed this off-list before, but maybe others would like to
> chime in...
>
>> I don't think it is sad, because using invisible div / span elements
>> nicely decouples the organization of the visual content from the
>> embedded data.
>
> Martin, you never fail to hash-mark your #GoodRelations tweets with
> #SEO. Decoupling triples and content raises an interesting SEO
> problem: state A in the visible content, state B in the invisible
> triples. Now which information do we trust? It's the "white text on a
> white background" search engine fooling of the 21st century. I'm not
> yet sure if it's a real problem, but could imagine that "tweaking"
> price tags might be tempting to some. Opinions?
>
> Thanks,
> Tom
>
> Disclaimer: I work for Google, but I have no insider information at
> all about how, or whether, we deal with this.
>
> -- 
> Thomas Steiner, Research Scientist, Google Inc.
> http://blog.tomayac.com, http://twitter.com/tomayac
>

Received on Thursday, 7 October 2010 19:47:17 UTC