Re: [Semediawiki-user] status and problems on sematicweb.org

From: Yury Katkov <katkov.juriy@gmail.com>
Date: Fri, 13 Jan 2012 12:06:26 +0400
Message-ID: <CAAT7DEHLVZ3aUhLt2f4-1Qo2gP2XP=zTgGWJKD4CmyVm0_9MLg@mail.gmail.com>
To: Lonny <lonny@appropedia.org>
Cc: Markus Krötzsch <markus.kroetzsch@cs.ox.ac.uk>, Semantic Web <semantic-web@w3.org>
Hi everyone!

Thanks for your attention, it's great to see that people really care about
the subject.

Based on my own experience with my wikis and the extensive experience of
the Wikipedia and Wikia folks, I can say that MediaWiki has great tools for
combating spam. These tools range from external-link blacklists to
machine-learning spam and vandalism detectors running on wiki bots [1,2,3,7].

1) A 10-hour waiting period is very effective against spambots, whose typical
behavior is to register and immediately write something.
2) A non-standard sign-up form with an additional required field sometimes
works: we have the Semantic Signup extension and a beta of the Social
Profile extension for that.
3) Adding the 'nofollow' attribute to all external links once worked
extremely well on Wikipedia, but it may not be a good idea in our case, since
Semantic Web services, papers and projects can and should be promoted
through the semanticweb.org wiki.
4) The cleanup.php script [4], coupled with SpamBlacklist [5] and
secretaribot [6], can help to clean up the wiki.
5) I haven't tried AbuseFilter yet, but I have heard that it's effective too.
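
To make the blacklist idea concrete, here is a minimal Python sketch of the
general technique: scan the text of an edit for external links and flag the
ones that match a list of regular expressions. This is only an illustration
and not the SpamBlacklist code; the patterns, the URL regex and the function
name are all assumptions of mine.

```python
import re

# Hypothetical patterns; a real blacklist is maintained by the wiki admins.
BLACKLIST_PATTERNS = [r"cheap-pills\.example", r"casino[0-9]*\.example"]

# Rough URL matcher, good enough for a sketch (not a full URL grammar).
URL_RE = re.compile(r"https?://[^\s\]<>\"]+", re.IGNORECASE)


def find_blacklisted_links(text, patterns=BLACKLIST_PATTERNS):
    """Return the external links in `text` that match any blacklist pattern."""
    if not patterns:
        # An empty blacklist must not flag anything.
        return []
    combined = re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    return [url for url in URL_RE.findall(text) if combined.search(url)]
```

An edit would be rejected (or queued for review) whenever the returned list
is non-empty.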

User rights tuning can also help:

1) Give blocking and maybe deletion privileges to real users. This may give
more people a reason to get involved.
2) There is currently no way to restrict adding URLs for particular groups
of users (for example, to deny inserting external links to anonymous users
and users who haven't confirmed their e-mail), but it doesn't seem hard to
write an extension for that.
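
The rule such an extension would enforce could look roughly like the Python
sketch below. It is not MediaWiki code: the User class, its fields and the
function name are hypothetical, and the thresholds (1 edit, 10 hours) are
simply the ones Lonny reported for Appropedia.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds taken from Lonny's description; adjust to taste.
WAITING_PERIOD = timedelta(hours=10)
MIN_EDITS = 1


@dataclass
class User:
    """Hypothetical user record; registered_at is None for anonymous users."""
    registered_at: Optional[datetime]
    email_confirmed: bool = False
    edit_count: int = 0


def may_add_external_links(user: User, now: datetime) -> bool:
    """Decide whether an edit by `user` may contain new external links."""
    if user.registered_at is None:
        return False  # anonymous users
    if not user.email_confirmed:
        return False  # e-mail not confirmed
    if user.edit_count < MIN_EDITS:
        return False  # no earlier edits yet
    # The account must be older than the waiting period.
    return now - user.registered_at >= WAITING_PERIOD
```

A spambot that registers and posts links right away fails the last check,
while a regular contributor is not affected after the first ten hours.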


References:

[1] http://www.mediawiki.org/wiki/Manual:Combating_spam
[2] http://www.mediawiki.org/wiki/Anti-spam_features
[3] http://www.mediawiki.org/wiki/Spam_Filter
[4] http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/SpamBlacklist/cleanup.php
[5] http://www.mediawiki.org/wiki/Extension:SpamBlacklist
[6] https://github.com/dannyob/secretaribot
[7] http://help.wikia.com/wiki/Help:Spam

Sincerely yours,
Yury Katkov




On Fri, Jan 13, 2012 at 6:26 AM, Lonny <lonny@appropedia.org> wrote:

> Hi All,
>
> Appropedia has found that the following steps have stopped the vast
> majority of our previously incessant spam:
> * requiring captcha for anons coupled
> * a 1 edit, 10 hour waiting period for new users to post external links
> * a few settings in AbuseFilter
>
> Please let me know if you would like more details of our current spam
> fighting setup.
>
> Good luck,
> Lonny
>
> PS as you know, you can see the extensions at
> http://www.appropedia.org/Special:Version
>  On Jan 12, 2012 11:36 AM, "Markus Krötzsch" <markus.kroetzsch@cs.ox.ac.uk>
> wrote:
>
>> Hi Yuri,
>>
>> let us take this to one mailing list semantic-web@w3.org, as this is the
>> list that is most involved (please drop the others when you reply).
>>
>> As the technical maintainer of the site, I largely agree with your
>> assessment. In spite of the very high visibility of the site (and
>> perceived authority), the active editing community is not big. This is a
>> problem especially given the significant and continued spam attacks that
>> the site is under due to its high visibility (I just recently changed
>> the captcha system and rolled back thousands of edits, yet it seems they
>> are already breaking through again, though in smaller numbers).
>>
>> I do not want to blame anybody for the state of affairs: most of us do
>> not have the time to contribute significant content to such sites.
>> However, given the extraordinary visibility of the site, we should all
>> perceive this as a major problem (to the extent that we attach our work
>> to the label "semantic web" in any way).
>>
>> So what can be done?
>>
>> (1) Freeze the wiki. A weaker version of this is: allow users only to
>> edit after they were manually added to a group of trusted users (all
>> humans welcome). This would require somebody to manage these permissions
>> but would allow existing projects/communities to continue to use the site.
>>
>> (2) Reinforce spam protection on the wiki. Maybe this could be done,
>> but the site is targeted pretty heavily. Standard captchas like
>> ReCaptcha are thus getting broken (spammers do have an effective
>> infrastructure for this), but maybe non-standard captchas could work
>> better. This is a task for the technical maintainers (i.e., me and the
>> folks at AIFB Karlsruhe where the site is hosted).
>>
>> (3) Clean the wiki. Whether frozen or not, there is a lot of spam
>> already. Something needs to be done to get rid of it. This requires
>> (easy but tedious) manual effort. Some stakeholders need to be found to
>> provide basic workforce (e.g., by hiring a student to help with spam
>> deletion).
>>
>> (4) Restore the wiki. Update the main pages (about technologies and
>> active projects) to reflect a current and/or timeless state that we
>> would like new readers to see. This again needs somebody to push it, and
>> for writing pages about topics like SPARQL one would need some
>> expertise. This is a challenge for the community.
>>
>> I am willing to invest /some/ time here to help with the above, but (3)
>> and (4) require support from more people. On the other hand, there are
>> probably hardly more than 20 or 30 *essential* content pages that we are
>> talking about here, plus many pages about projects and people that one
>> should ask the stakeholders to review. So one might be able to make this
>> into a shining entry point to the semantic web in a week of work ...
>> together with (1) and (2) above, the invested work would remain valuable
>> for a long time.
>>
>> Cheers
>>
>> Markus
>>
>>
>>
>> On 12/01/12 10:43, Yury Katkov wrote:
>> > Hi everyone!
>> >
>> > What is the current status of the semanticweb.org
>> > <http://semanticweb.org> website? It used to be the main wiki about the
>> > semantic web, it has a lot of cool and useful information about
>> > everything. But now it seems abandoned. I mean, there are about 30 real
>> > writers who update the information about their projects and write
>> > articles, but they make something like 30% of the changes. The other
>> > 70% is spam!
>> >
>> > Are there guys who support the website?
>> > Who manages the community, are there any plans of creating projects and
>> > articles about SW? Is there community at all?
>> >
>> > In my opinion, if this great website is supposed to stay alive, the
>> > first goal is to find volunteers who'll help the administrator combat
>> > spam (with bots, extensions and editing policies) and support new
>> > activities and projects on the wiki. (I'm ready to be one of them.)
>> > If this wiki only thrived in the past, when there was big hype around
>> > Semantic Web topics, and now, without big funding, nobody wants to use
>> > it, wouldn't it be better to freeze it?
>> >
>> > I appreciate and admire people who started up the wiki. Please, don't
>> > let it be the rotting memorial to the past of the Semantic Web.
>> > -----
>> > Sincerely yours,
>> > Yury Katkov, WikiVote llc
>> >
>> >
>>
>>
>> --
>> Dr. Markus Kroetzsch
>> Department of Computer Science, University of Oxford
>> Room 306, Parks Road, OX1 3QD Oxford, United Kingdom
>> +44 (0)1865 283529               http://korrekt.org/
>>
>>
>>
>
Received on Friday, 13 January 2012 11:56:04 GMT
