Looking for pedagogically useful data sets

Hello all,

I am looking for some RDF data sets to use in a short presentation on
RDF and SPARQL. I want to do a short demo, and since RDF and SPARQL will
be new to this audience, I was hoping for something where the predicates
would be easy to understand.

I was hoping that the LOGD data from RPI/TWC would be suitable, but
once I found the old web site (the new one is down) and manually fixed the
broken download link, I found that the predicates looked like

<http://data-gov.tw.rpi.edu/vocab/p/1525/v96>

and the only documentation I could find for them (maybe I wasn't looking in
the right place) was that this predicate has an rdfs:label of "V96".

Note that an alphanumeric code is good enough for Wikidata, and it is
certainly concise, but I don't want :v96 to be the first thing that these
people see.

Something I like about this particular data set is that it is about 1
million triples, which is big enough to be interesting but also small enough
that I can load it in a few seconds, so that performance issues are not a
distraction.

The vocabulary in DBpedia is closer to what I want (and if I write the
queries, most of the distracting things about the vocabulary are a
non-issue), but then data quality issues are the distraction.

So what I am looking for is something around 1 million triples in size (to
within an order of magnitude) where there are no distractions due to opaque
vocabulary or data quality issues. It would be exceptionally cool if there
were two data sets that fit the bill, so that I could load them into the
triple store together to demonstrate "mashability".
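
To make concrete what I mean by "mashability", here is a minimal Python
sketch. It is only an illustration, not a real triple store: the two "data
sets" are plain sets of (subject, predicate, object) tuples, and every URI,
name, and number in it is made up for the example.

```python
# Hypothetical namespace; none of these URIs refer to real data.
EX = "http://example.org/"

# Data set 1 (invented): organizations and where they are located.
companies = {
    (EX + "orgA", EX + "name", "Org A"),
    (EX + "orgA", EX + "locatedIn", EX + "townX"),
    (EX + "orgB", EX + "name", "Org B"),
    (EX + "orgB", EX + "locatedIn", EX + "townY"),
}

# Data set 2 (invented): facts about places, using the same place URIs.
places = {
    (EX + "townX", EX + "population", 1000),
    (EX + "townY", EX + "population", 2500),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def company_populations(graph):
    """Join across sources: organization name -> population of its location."""
    result = {}
    for company, _, place in match(graph, p=EX + "locatedIn"):
        names = match(graph, s=company, p=EX + "name")
        pops = match(graph, s=place, p=EX + "population")
        if names and pops:
            result[names[0][2]] = pops[0][2]
    return result


# "Loading both data sets into one store" is just a set union here.
merged = companies | places
print(company_populations(merged))
```

The join only works because both data sets use the same URIs for the places,
which is the point I would want the audience to take away. In the actual demo
the same join would be a SPARQL graph pattern sharing one variable between
the two sources.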

Any suggestions?

-- 
Paul Houle
(607) 539 6254    paul.houle on Skype   ontology2@gmail.com
http://legalentityidentifier.info/lei/lookup

Received on Wednesday, 11 March 2015 23:22:15 UTC