Re: spam

On 17 July 2010 17:00, Kingsley Idehen <kidehen@openlinksw.com> wrote:

> Nathan wrote:
>
>> So, after seeing this question on stack overflow...
>>
>> '''Geiitng Adresses of Contact Us page of any web site
>> I want to capture address given on the contact us page. Is there any php
>> script to do so. I am struck coz of it. my client want to store adresse of
>> the web sites given on contact us page. I am able to get content from
>> contact us page. but i am quite confuse how to get only address from this
>> page.'''
>>
>> .. and preceded by seeing 50-100+ projects advertised (daily) to scrape
>> contact details and create databases for spam purposes on most freelance
>> websites - I quickly came to the realisation that with Linked Data, the
>> tasks of these people just became a whole lot easier, indeed the data is all
>> MRD for them and linked up to more.
>>
>> Thus, in addition to nudging at general awareness of these issues, I do
>> wonder who (if anybody) is working on spam (and unethical usage) solutions
>> for the web of data?
>>
>
Agreed. It's quite easy to make a very effective spam filter using LOD:

Entity -> Spam Rank

Unknown Entity -> Spam Rank = 0%

Has WebID -> Spam Rank = 50%

Has WebID in LOD (e.g. sindice) -> Spam Rank = 75%

WebID is 3 links away from you -> Spam Rank = 85%

WebID is 2 links away from you -> Spam Rank = 90%

WebID is one link away from you -> Spam Rank = 95%

It's not perfect, but you can go from zero to very good, in under a day ...
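
A minimal sketch of the table above in Python, in case it helps. Everything
named here is an assumption for illustration: the social graph is a plain dict
of WebID -> linked WebIDs (e.g. built from foaf:knows), "indexed" is a set of
WebIDs already found in an LOD index such as sindice, and link_distance and
spam_rank are hypothetical helpers, not part of any WebID library. Note that,
as in the table, a higher rank means a more trusted sender.

```python
from collections import deque


def link_distance(graph, start, target, max_hops=3):
    """Breadth-first search; returns the hop count from start to target,
    or None if target is not reachable within max_hops links."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not follow links beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def spam_rank(webid, me, graph, indexed):
    """Return the rank (in percent) from the table above.

    webid   -- the sender's WebID, or None for an unknown entity
    me      -- your own WebID
    graph   -- dict mapping a WebID to the WebIDs it links to (assumed input)
    indexed -- set of WebIDs known to an LOD index (assumed input)
    """
    if webid is None:
        return 0  # unknown entity
    by_distance = {1: 95, 2: 90, 3: 85}
    distance = link_distance(graph, me, webid)
    if distance in by_distance:
        return by_distance[distance]
    if webid in indexed:
        return 75  # has a WebID that also appears in LOD
    return 50  # has a WebID, nothing else known
```

The links are followed outward from your own WebID only, so this treats the
graph as directed; whether links should count in both directions is left open
here.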


>
>> Best,
>>
>> Nathan
>>
>>
>>
> Nathan,
>
> SPAM busting is something Linked Data will handle very well. A few moons
> ago when TimBL put out the GGG post [1], we had a little experiment whereby
> you could only comment if you were at least one degree of separation from
> an individual in his FOAF file. The post has zillions of readers and not a
> single SPAM comment :-) Sadly, the platform went down and the sole comment
> (mine) was lost :-(
>
> The emergence of the WebID protocol marks the beginning of the end for easy
> SPAM.
>
> Make note of this re. WebID protocol use cases as we continue our
> development of use case collateral for the protocol :-)
>
> Links:
>
> 1. http://dig.csail.mit.edu/breadcrumbs/node/215 -- this post used to have
> a single comment, platform upgrade lost the comment
>
> --
>
> Regards,
>
> Kingsley Idehen
> President & CEO OpenLink Software
> Web: http://www.openlinksw.com
> Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca: kidehen
>

Received on Monday, 19 July 2010 11:30:22 UTC