- From: Kingsley Idehen <kidehen@openlinksw.com>
- Date: Mon, 19 Jul 2010 08:10:41 -0400
- To: Melvin Carvalho <melvincarvalho@gmail.com>
- CC: nathan@webr3.org, Linked Data community <public-lod@w3.org>
Melvin Carvalho wrote:
> On 17 July 2010 17:00, Kingsley Idehen <kidehen@openlinksw.com> wrote:
>
> > Nathan wrote:
> >
> > > So, after seeing this question on stack overflow...
> > >
> > > '''Geiitng Adresses of Contact Us page of any web site
> > > I want to capture address given on the contact us page. Is
> > > there any php script to do so. I am struck coz of it. my
> > > client want to store adresse of the web sites given on contact
> > > us page. I am able to get content from contact us page. but i
> > > am quite confuse how to get only address from this page.'''
> > >
> > > .. and preceded by seeing 50-100+ projects advertised (daily)
> > > to scrape contact details and create databases for spam
> > > purposes on most freelance websites - I quickly come to the
> > > realisation that with Linked Data, the tasks of these people
> > > just became a whole lot easier, indeed the data is all MRD for
> > > them and linked up to more.
> > >
> > > Thus, in addition to nudging at general awareness of these
> > > issues, I do wonder who (if anybody) is working on spam (and
> > > unethical usage) solutions for the web of data?
>
> Agreed, it's quite easy to make a very effective spam filter using LOD
>
> Entity -> Spam Rank
>
> Unknown Entity -> Spam Rank = 0%
> Has WebID -> Spam Rank = 50%
> Has WebID in LOD (e.g. sindice) -> Spam Rank = 75%
> WebID is 3 links away from you -> Spam Rank = 85%
> WebID is 2 links away from you -> Spam Rank = 90%
> WebID is one link away from you -> Spam Rank = 95%
>
> It's not perfect, but you can go from zero to very good, in under a
> day ...

Most important of all, the subjectivity of spam ranking is catered for
when you have WebIDs and Linked Data in the mix. One person's Spam is
another's Ham :-)

Kingsley

> > > Best,
> > >
> > > Nathan
> >
> > Nathan,
> >
> > SPAM busting is something Linked Data will handle very well. A
> > few moons ago when TimBL put out the GGG post [1], we had a little
> > experiment whereby you could only comment if you were at least
> > one degree of separation from an individual in his FOAF file. The
> > post has zillions of readers and not a single SPAM comment :-)
> > Sadly, the platform went down and the sole comment (mine) was lost :-(
> >
> > The WebID protocol emergence marks the beginning of the end for
> > easy SPAM.
> >
> > Make note of this re. WebID protocol usecases as we continue our
> > development of usecase collateral for the protocol :-)
> >
> > Links:
> >
> > 1. http://dig.csail.mit.edu/breadcrumbs/node/215 -- this post used
> >    to have a single comment, platform upgrade lost the comment
> >
> > --
> > Regards,
> >
> > Kingsley Idehen
> > President & CEO
> > OpenLink Software
> > Web: http://www.openlinksw.com
> > Weblog: http://www.openlinksw.com/blog/~kidehen
> > Twitter/Identi.ca: kidehen

--
Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
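
As a rough illustration of the ranking Melvin outlines in the quoted text, here is a minimal Python sketch. It is not taken from the thread: the function names, the dictionary-based "knows" graph, and the `indexed` set (standing in for a lookup in an index such as sindice) are all assumptions made for the example. Note that the percentages in the list behave like a trust score, since a closer WebID gets a higher number.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set

# Scores from the list in the thread, keyed by link distance from "you".
RANK_BY_DISTANCE = {1: 95, 2: 90, 3: 85}


def link_distance(graph: Dict[str, Iterable[str]], start: str,
                  target: str, max_hops: int = 3) -> Optional[int]:
    """Breadth-first search over 'knows' links (hypothetical graph shape).

    Returns the number of hops from start to target, or None if the
    target is not reachable within max_hops.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def spam_rank(webid: Optional[str], me: str,
              graph: Dict[str, Iterable[str]], indexed: Set[str]) -> int:
    """Map an entity to the percentage given in the thread's list."""
    if webid is None:
        return 0  # unknown entity, no WebID at all
    distance = link_distance(graph, me, webid)
    if distance in RANK_BY_DISTANCE:
        return RANK_BY_DISTANCE[distance]
    if webid in indexed:
        return 75  # WebID is known to a Linked Data index
    return 50  # has a WebID, nothing else known
```

For example, with `graph = {"me": ["a"], "a": ["b"]}`, a sender whose WebID is `"b"` is two links away and would be scored 90, while a sender with no WebID at all would be scored 0. Because the graph is walked from your own WebID, two recipients can score the same sender differently, which is the subjectivity point made in the reply.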
Received on Monday, 19 July 2010 12:11:13 UTC