Re: spam

Melvin Carvalho wrote:
>
>
> On 17 July 2010 17:00, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
>     Nathan wrote:
>
>         So, after seeing this question on stack overflow...
>
>         '''Geiitng Adresses of Contact Us page of any web site
>         I want to capture address given on the contact us page. Is
>         there any php script to do so. I am struck coz of it. my
>         client want to store adresse of the web sites given on contact
>         us page. I am able to get content from contact us page. but i
>         am quite confuse how to get only address from this page.'''
>
>         .. and preceded by seeing 50-100+ projects advertised (daily)
>         to scrape contact details and create databases for spam
>         purposes on most freelance websites - I quickly came to the
>         realisation that with Linked Data, the tasks of these people
>         just became a whole lot easier, indeed the data is all MRD for
>         them and linked up to more.
>
>         Thus, in addition to nudging at general awareness of these
>         issues, I do wonder who (if anybody) is working on spam (and
>         unethical usage) solutions for the web of data?
>
>
> Agreed, it's quite easy to make a very effective spam filter using LOD
>
> Entity -> Spam Rank
>
> Unknown Entity -> Spam Rank = 0%
>
> Has WebID -> Spam Rank = 50%
>
> Has WebID in LOD (e.g. sindice) -> Spam Rank = 75%
>
> WebID is 3 links away from you -> Spam Rank = 85%
>
> WebID is 2 links away from you -> Spam Rank = 90%
>
> WebID is one link away from you -> Spam Rank = 95%
>
> It's not perfect, but you can go from zero to very good, in under a 
> day ...
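
The tiers quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anyone's actual filter: the graph is a plain dict of WebID -> linked WebIDs (for instance built from foaf:knows statements), `lod_index` stands in for a lookup against an index such as Sindice, and the names `link_distance` and `spam_rank` are made up for this example. Note that the score as quoted rises as the sender gets closer to you, so it reads more like a trust score; the original "Spam Rank" label is kept. The case where sender and viewer are the same WebID is not covered by the tiers and simply falls through to the lower ones here.

```python
from collections import deque
from typing import Dict, Optional, Set

# Hypothetical in-memory graph: WebID -> WebIDs it links to (e.g. foaf:knows).
Graph = Dict[str, Set[str]]

# Percentages copied from the quoted tiers; key is link distance from the viewer.
RANK_BY_DISTANCE = {1: 95, 2: 90, 3: 85}


def link_distance(graph: Graph, start: str, target: str,
                  max_depth: int = 3) -> Optional[int]:
    """Breadth-first search: hops from start to target, or None if more than max_depth."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the furthest tier we care about
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return depth + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return None


def spam_rank(sender: Optional[str], viewer: str,
              graph: Graph, lod_index: Set[str]) -> int:
    """Return the quoted 'Spam Rank' percentage for a sender, as seen by viewer."""
    if sender is None:                 # unknown entity, no WebID at all
        return 0
    distance = link_distance(graph, viewer, sender)
    if distance in RANK_BY_DISTANCE:   # 1, 2 or 3 links away from the viewer
        return RANK_BY_DISTANCE[distance]
    if sender in lod_index:            # WebID is known to a LOD index
        return 75
    return 50                          # has a WebID, nothing else known
```

Because the distance is measured from the viewer's own WebID, the same sender can score differently for different people.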

Most important of all, the subjectivity of spam ranking is catered for 
when you have WebIDs and Linked Data in the mix.

One person's Spam is another's Ham :-)

Kingsley
>  
>
>
>         Best,
>
>         Nathan
>
>
>
>     Nathan,
>
>     SPAM busting is something Linked Data will handle very well. A
>     few moons ago when TimBL put out the GGG post [1], we had a little
>     experiment whereby you could only comment if you were within
>     one degree of separation of an individual in his FOAF file. The
>     post has zillions of readers and not a single SPAM comment :-)
>     Sadly, the platform went down and the sole comment (mine) was lost :-(
>
>     The WebID protocol emergence marks the beginning of the end for
>     easy SPAM.
>
>     Make note of this re. WebID protocol use cases as we continue our
>     development of use-case collateral for the protocol :-)
>
>     Links:
>
>     1. http://dig.csail.mit.edu/breadcrumbs/node/215 -- this post used
>     to have a single comment, platform upgrade lost the comment
>
>     -- 
>
>     Regards,
>
>     Kingsley Idehen
>     President & CEO
>     OpenLink Software
>     Web: http://www.openlinksw.com
>     Weblog: http://www.openlinksw.com/blog/~kidehen
>     Twitter/Identi.ca: kidehen
>


-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 

Received on Monday, 19 July 2010 12:11:13 UTC