
Re: overstock.com adds GoodRelations in RDFa to 900,000 item pages

From: Martin Hepp <martin.hepp@ebusiness-unibw.org>
Date: Thu, 7 Oct 2010 21:13:32 +0200
Cc: Karl Dubost <karl+w3c@la-grange.net>, public-lod@w3.org, semantic-web@w3.org
Message-Id: <4609A371-266B-4A88-95C9-BC9A90EA61A7@ebusiness-unibw.org>
To: Thomas Steiner <tomac@google.com>

Hi Tom,

Taking our thread public is definitely a good idea.

My point is that, yes, it is tempting to tweak your invisible data to  
polish your ranking.

Example: In HTML, you say the price is 100 USD; in RDFa, you say it is
50 USD.

However, from a computational perspective, it is very easy for Google
or anybody else to spot divergences between the visible and the
invisible content and to punish pages that use such "semantic
black-hat SEO".

My main argument is that structured data also simplifies checking for  
black-hat SEO.

A very simple algorithm would be to flag pages that don't contain the
sequence of digits from an invisible gr:hasCurrencyValue property
anywhere in the visible part of the page.
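
A minimal sketch of that check in Python, just to make the idea
concrete. It assumes the visible text and the gr:hasCurrencyValue
literals have already been extracted from the page; the function names
are hypothetical, and the number handling (decimal point, comma as
thousands separator) is a simplifying assumption:

```python
import re
from typing import Iterable


def price_digits(value: str) -> str:
    """Return the digits of the integer part, e.g. '1,299.00' -> '1299'."""
    # Assumption: '.' is the decimal separator in the markup value.
    integer_part = value.split(".", 1)[0]
    return re.sub(r"\D", "", integer_part)


def is_suspicious(visible_text: str, currency_values: Iterable[str]) -> bool:
    """True if any markup price has no matching digit sequence in the visible text."""
    # Drop commas between digits so a visible '1,299' matches '1299'.
    normalized = re.sub(r"(?<=\d),(?=\d)", "", visible_text)
    for value in currency_values:
        digits = price_digits(value)
        if digits and digits not in normalized:
            return True
    return False
```

For the example above, is_suspicious("Price: 100 USD", ["50.00"])
flags the page, while the same call with ["100.00"] does not.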

Real algorithms for such checks will be more complex, but I hope this
gives a hint.

Bottom line: I think the fraud issue is overrated, since Google, Bing,
and Yahoo have the pretty strong instrument of delisting sites that
use black-hat SEO.


Martin

On 07.10.2010, at 17:05, Thomas Steiner wrote:

> Hi Martin,
>
> We have discussed this off-list before, but maybe others would like to
> chime in...
>
>> I don't think it is sad; because using invisible div / span elements
>> nicely decouple the organization of the visual content from the
>> embedded data.
>
> Martin, you never fail to hash-mark your #GoodRelations tweets with
> #SEO. Decoupling triples and content raises an interesting SEO
> problem: state A in the visible content, state B in the invisible
> triples. Now which information do we trust? It's the "white text on a
> white background" search engine fooling of the 21st century. I'm not
> yet sure if it's a real problem, but could imagine that "tweaking"
> price tags might be tempting to some. Opinions?
>
> Thanks,
> Tom
>
> Disclaimer: I work for Google, but I have no insider information at
> all how/if we deal with this.
>
> -- 
> Thomas Steiner, Research Scientist, Google Inc.
> http://blog.tomayac.com, http://twitter.com/tomayac
>
Received on Thursday, 7 October 2010 19:47:10 UTC
