How do I find the data I need in LD? Algorithm and questions.

Hi!

What do you usually do when you want to find a dataset for your needs?
I'm preparing a short tutorial on this topic for students and would
like to ask you to share your experience.
My typical algorithm is the following:
0) Define the topic. I have to know precisely what kind of data I need.
1) Look at the Linked Data cloud diagram and other visualizations to
make sure that the data I need is available somewhere. For example, if
I want to improve Mendeley or Zotero, I look at these visualizations
and search for publication data.
2) Search for the needed properties and classes with Sindice, Sig.ma and Swoogle.
3) Look at the CKAN description of the dataset, its XML sitemap and VoID metadata.
4) Explore the datasets found in the previous step with some
simple SPARQL queries like these:

SELECT DISTINCT ?p WHERE {
?s ?p ?o
}

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?class WHERE {
{ ?class a rdfs:Class . }
UNION
{ ?class a owl:Class . }
}

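PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?label WHERE {
{ ?a rdfs:label ?label }
UNION
{ ?a dc:title ?label }
# and possibly some more branches to search for foaf:name and so on
}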

I can also use COUNTing and GROUPing BY to get some quick statistics
about the datasets.
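
As a rough sketch of the kind of statistics query I mean, the following
counts how often each predicate is used. It assumes the endpoint supports
SPARQL 1.1 aggregates (not every endpoint does), and the LIMIT of 50 is
an arbitrary choice to keep the result small:

```sparql
# Count how often each predicate occurs, most frequent first.
# Assumption: the endpoint implements SPARQL 1.1 aggregates.
SELECT ?p (COUNT(*) AS ?uses) WHERE {
  ?s ?p ?o
}
GROUP BY ?p
ORDER BY DESC(?uses)
LIMIT 50
```

Replacing the triple pattern with `?s a ?class` and grouping by ?class
would give the number of instances per class in the same way.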
5) When I find some interesting URIs, I use the semantic web browsers
Marbles and Sig.ma to navigate through the dataset.
6) Ask the smart people on the Semantic Web mailing list and the Public
LOD mailing list. Possibly go to semanticoverflow and ask for help there
as well.
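
For step 3, the VoID metadata can itself be inspected with SPARQL once it
is loaded into a store. This is only a sketch: it assumes the publisher
describes the dataset with the VoID vocabulary and uses dcterms:title for
the name, which will not hold for every dataset.

```sparql
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX dcterms: <http://purl.org/dc/terms/>

# List described datasets with their title, SPARQL endpoint and the
# vocabularies they declare. Each property is optional because VoID
# descriptions are often incomplete.
SELECT ?dataset ?title ?endpoint ?vocabulary WHERE {
  ?dataset a void:Dataset .
  OPTIONAL { ?dataset dcterms:title ?title }
  OPTIONAL { ?dataset void:sparqlEndpoint ?endpoint }
  OPTIONAL { ?dataset void:vocabulary ?vocabulary }
}
```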
======================
Here are my questions:

1) What else do you typically do to find a dataset?
2) Is there a resource where I can find a brief description of a
dataset in terms of the properties and classes that are used in it? And
those cool arrows in Richard Cyganiak's diagram: is there a resource
where I can find information about the relationships between a given
dataset and the rest of the world?
3) I have a similar algorithm for searching vocabularies. Can resources
like Schemapedia help me in searching for a dataset?
4) Do you know any other SPARQL queries that can be handy when
I search for something in a dataset?

Sincerely yours,
-----
Yury Katkov

Received on Friday, 16 March 2012 04:59:05 UTC