Re: Should dbpedia have stuff in that is not from wikipedia - was: Re: A URI(Web ID) for the semantic web community as a foaf:Group

From: Dan Brickley <danbri@danbri.org>
Date: Sat, 27 Mar 2010 12:02:51 +0100
Message-ID: <eb19f3361003270402n29852adqd0e33f7d8613de55@mail.gmail.com>
To: Hugh Glaser <hg@ecs.soton.ac.uk>
Cc: Kingsley Idehen <kidehen@openlinksw.com>, Tom Heath <tom.heath@talis.com>, "KangHao Lu (Kenny)" <kennyluck@csail.mit.edu>, "public-lod@w3.org" <public-lod@w3.org>

Couple of almost-independent points -

Re DBpedia, I share a concern that the "Wikipedia turned into a
database" product remain fairly clearly defined, even though the
RDFization naturally includes a bit of creativity. However even that
has subtleties - there are the different language variants for
example, plus outlying members of the Wikipedia family (wiktionary

However I think we as a community should be prepared for an
interesting trend, hopefully one that'll move faster with things like
openid and RDF helping: I believe Wiki federation and
cross-referencing will become a major trend over the next few years. The
stress and trauma that the Wikipedia community are currently feeling
re scoping, ie. the Deletionism debate -
http://meta.wikimedia.org/wiki/Deletionism -  can only really be
resolved by accepting that we'll have a Web of useful and overlapping
wikis, treating various topics in more or less detail. Using common
URIs (grounded in the central Wikipedia) makes this possible. And this
means that, using DBpedia's extraction technology or the Semantic
MediaWiki addons, we can expect a lot more RDF data from other
wikis over the coming years. It wouldn't be unreasonable for the
DBpedia project to offer some aggregate of all this, if they chose

Also re SWIG, considered as an entity in the W3C world and as a larger
vaguer community. Some W3C Interest Groups have enumerated
memberships; traditionally RDF IG and its successor, this SemWeb IG,
didn't. There is no master list, just a collection of SWIG-related
mailing lists and other channels. I wonder sometimes about changing
that, so we had a stronger sense of who the members of W3C SWIG
actually are (ie. who has committed to the group's charter; also
db-backed profile pages at w3.org, etc.). There are also data sources
like the mail archives and #swig IRC logs (see
http://swig.xmlhack.com/), Twitter/Identi.ca etc that offer some sense
of who the active members of the community are. Also I made some
experiments in http://danbri.org/words/2009/10/25/504 with exposing
lists of OpenIDs from Wordpress, MediaWiki etc to show who is actively
participating at some site. I think this evidence-driven approach is a
stronger way of defining a network of overlapping foaf:Group
descriptions, rather than having a single central list. I might for
example want to see who was on the www-rdf-logic or www-rdf-rules
lists and, via their microblog posts, which amongst them were in the
Netherlands. Or find microblog posts from the people who are actively
contributing to the FOAF or ESW wikis.

There are lots of overlapping communities; being 'in the Semantic Web
community' isn't a simple boolean flag. So I'd rather surface the
underlying data and allow people to compose views into it that suit
particular use cases - "find me things bookmarked by ontologists";
"what have members of public-lod been saying on Twitter this week?",
"Find me DOAP descriptions of software associated with members of the
#swig IRC channel", "conferences with 2 or more editors of W3C SemWeb
specs on the steering committee", etc etc...
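
A minimal sketch of what composing such views could look like, assuming
the evidence has already been harvested into per-source sets of
identifiers (OpenIDs here). The source names, the example.org-style
identifiers and the view() helper are all hypothetical, invented purely
for illustration:

```python
# Hypothetical evidence: each source maps to the set of identifiers
# (e.g. OpenIDs) seen participating there. All values are made up.
evidence = {
    "public-lod": {"http://alice.example/", "http://bob.example/"},
    "swig-irc": {"http://bob.example/", "http://carol.example/"},
    "foaf-wiki": {"http://alice.example/", "http://carol.example/"},
}


def view(all_of=(), any_of=()):
    """Identifiers seen in every `all_of` source and, when `any_of`
    is given, in at least one of the `any_of` sources."""
    result = None
    for source in all_of:
        members = evidence.get(source, set())
        result = set(members) if result is None else result & members
    if any_of:
        union = set().union(*(evidence.get(s, set()) for s in any_of))
        result = union if result is None else result & union
    return result if result is not None else set()


# "People on public-lod who also hang out in #swig"
both = view(all_of=["public-lod", "swig-irc"])
```

No single set here is "the community"; each view is just an
intersection or union over whatever evidence happens to be available.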

To relate these two points, I have started documenting bits of SemWeb
history in the FOAF Wiki, since I really can't be bothered to fight
deletionism wars on Wikipedia's main site.

For example http://wiki.foaf-project.org/w/MCF describes Meta Content
Framework (and yep the CSS image right alignment has gone wrong there -
help welcomed!). The FOAF wiki has OpenID support, and Semantic
MediaWiki installed, so edits can be associated with OpenIDs. I would love
to know how best to configure SMW so that we could figure out that
http://wiki.foaf-project.org/w/MCF is talking about the same thing as
http://en.wikipedia.org/wiki/Meta_Content_Framework so that folk who
express their interest in the topic using either URI can be linked.
What's the markup to put into the FOAF wiki entry which would express
the appropriate sameAs?
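
For reference, the RDF statement itself (as opposed to the SMW markup
that would produce it, which this sketch does not attempt to answer)
could be written out as a single N-Triples line. This assumes the two
page URIs are used directly as identifiers for the topic, which is a
simplification; a DBpedia resource URI might be the better target. The
same_as_triple() helper is hypothetical:

```python
# owl:sameAs property URI from the OWL vocabulary.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(subject_uri, object_uri):
    """Serialize one owl:sameAs statement as an N-Triples line.
    Assumes both arguments are plain absolute URIs needing no escaping."""
    return f"<{subject_uri}> <{OWL_SAME_AS}> <{object_uri}> ."


line = same_as_triple(
    "http://wiki.foaf-project.org/w/MCF",
    "http://en.wikipedia.org/wiki/Meta_Content_Framework",
)
print(line)
```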

Also of note, the FOAF Wiki is currently configured to consume a list
of OpenIDs and add them to a MediaWiki trust group, "Bureaucrat".
http://wiki.foaf-project.org/w/FOAF_Wiki:Bureaucrats ... it currently
gets this list just from my blog, ie. anyone who I have trusted enough
to comment in my blog gets added to this group. In future I would
like to tune this to use more sources and more subtlety. Getting this
kind of trust syndication in place I think will be a big part of
helping smaller Wikis flourish, to connect back to the original


