Re: How to find the data I need in LD? algorithm and questions.

Hey all,

I'm working on a generic Linked Data/SPARQL endpoint browser.
Given either an RDF document URI or a SPARQL endpoint URI, it
creates a web interface for it.
The source code is at https://github.com/Graphity/graphity-browser
and a paper about the core framework (the PHP version of it) is here:
http://www.w3.org/2011/09/LinkedData/ledp2011_submission_1.pdf
I'd be happy to answer any questions.

Martynas
graphity.org

On Fri, Mar 16, 2012 at 9:15 PM, Hugh Glaser <hg@ecs.soton.ac.uk> wrote:
> Hi Yury
> Well I am sorry to see you have had no response, but it is not so surprising, really.
> You will find that essentially there are very few people doing what you are trying to do.
> The Semantic Web and Linked Data world is made up of people who publish, and rarely consume.
> It is almost unheard of for someone to consume someone else's data, unless they know the publisher.
> Everyone is shouting, but not many listening.
> OK, I might not be in a great mood today, but I'm not far wrong.
>
> To your problem.
> Your steps seem reasonable.
> I would, however, add the use of VoiD (http://www.w3.org/TR/void/, http://semanticweb.org/wiki/VoiD).
> VoiD is designed to deliver what you want, I think (if it doesn't, then it should be made to).
> Some sites do publish VoiD descriptions, and these can often be located automatically by looking in the sitemap, which can in turn be discovered by looking in robots.txt.
> Keith Alexander has a store of collected VoiD descriptions (http://kwijibo.talis.com/voiD/), as do we (http://void.rkbexplorer.com).
> I would also suggest that my own site, http://sameas.org might lead from interesting URIs to other related URIs, and hence interesting stores.
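The robots.txt to sitemap step described above can be sketched in Python. This is only a minimal sketch under a few assumptions: the function name sitemap_urls and the example.org URLs are made up for illustration, the robots.txt body has already been fetched by some other means, and the sketch stops at listing the sitemap URLs. Locating a VoiD description inside the sitemap itself is not covered.

```python
from urllib.parse import urljoin


def sitemap_urls(robots_txt: str, base_url: str) -> list[str]:
    """Return the sitemap URLs declared in a robots.txt body.

    Hypothetical helper: it only reads "Sitemap:" lines and does no
    network access. A '#' starts a comment, so a sitemap URL that
    itself contains '#' would be cut short by this sketch.
    """
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value" at the first colon.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            # Resolve relative references against the site root, just in case.
            urls.append(urljoin(base_url, value.strip()))
    return urls
```

Calling sitemap_urls(text, "http://example.org/") on the text of http://example.org/robots.txt gives the candidate sitemaps to inspect next; an empty list means the site does not advertise one there.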
>
> Hope that helps.
> Best
> Hugh
>
> On 16 Mar 2012, at 04:58, Yury Katkov wrote:
>
>> Hi!
>>
>> What do you usually do when you want to find a dataset for your needs?
>> I'm preparing a tiny tutorial on this topic for students and would
>> like to ask you to share your experience.
>> My typical algorithm is the following:
>> 0) Define the topic. I have to know precisely what kind of data I need.
>> 1) Look at the Linked Data cloud diagram and other visualizations to
>> check that the data I need is available somewhere. If, for example,
>> I want to improve Mendeley or Zotero, I look at these visualizations
>> and search for publication data.
>> 2) Search for the needed properties and classes with Sindice, Sig.ma and Swoogle.
>> 3) Look at the CKAN description of the dataset, its XML sitemap and VoiD metadata.
>> 4) Explore the datasets found in the previous step with some
>> simple SPARQL queries like these:
>>
>> SELECT DISTINCT ?p WHERE {
>> ?s ?p ?o
>> }
>>
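>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> PREFIX owl: <http://www.w3.org/2002/07/owl#>
>> SELECT DISTINCT ?class WHERE {
>> { ?class a rdfs:Class . }
>> UNION
>> { ?class a owl:Class . }
>> }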
>>
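>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> PREFIX dc: <http://purl.org/dc/elements/1.1/>
>> SELECT DISTINCT ?label WHERE {
>> { ?a rdfs:label ?label }
>> UNION
>> { ?a dc:title ?label }
>> # and possibly some more patterns to match foaf:name and so on
>> }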
>>
>> I can also use COUNT and GROUP BY to get some quick statistics
>> about the datasets.
>> 5) When I find some interesting URIs, I use the Semantic Web browsers
>> Marbles and Sig.ma to navigate through the dataset.
>> 6) Ask these smart guys on the Semantic Web mailing list and the Public
>> LOD mailing list. Probably go to semanticoverflow and ask for help
>> there as well.
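The COUNT and GROUP BY statistics mentioned in step 4 could look like the sketch below. It assumes the endpoint supports SPARQL 1.1 aggregates, which not every store does. The names PREDICATE_USAGE_QUERY and predicate_usage are made up for illustration; the Python function only mirrors what the query computes on a small in-memory list of triples, so the idea can be tried without an endpoint.

```python
from collections import Counter

# SPARQL 1.1 aggregate query: how often each predicate is used.
# Assumption: the target endpoint implements COUNT and GROUP BY.
PREDICATE_USAGE_QUERY = """
SELECT ?p (COUNT(*) AS ?uses) WHERE {
  ?s ?p ?o
}
GROUP BY ?p
ORDER BY DESC(?uses)
"""


def predicate_usage(triples):
    """Compute the same statistic as PREDICATE_USAGE_QUERY over an
    in-memory iterable of (subject, predicate, object) tuples."""
    counts = Counter(p for _s, p, _o in triples)
    # most_common() sorts by descending count, like ORDER BY DESC(?uses).
    return counts.most_common()
```

The most frequent predicates usually give a quick impression of what a dataset is about. Swapping ?p for the object of rdf:type statements gives the same kind of overview for classes.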
>> ======================
>> Here are my questions:
>>
>> 1) What else do you typically do to find a dataset?
>> 2) Is there a resource where I can find a brief description of a
>> dataset in terms of the properties and classes that are used in it?
>> And these cool arrows in Richard Cyganiak's diagram: is there a
>> resource where I can find information about the relationships between
>> a given dataset and the rest of the world?
>> 3) I have a similar algorithm for searching for vocabularies. Can
>> resources like Schemapedia help me in searching for datasets?
>> 4) Do you know any other interesting SPARQL queries that can be handy
>> when I search for something in a dataset?
>>
>> Sincerely yours,
>> -----
>> Yury Katkov
>>
>
> --
> Hugh Glaser,
>             Web and Internet Science
>             Electronics and Computer Science,
>             University of Southampton,
>             Southampton SO17 1BJ
> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
> Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
> http://www.ecs.soton.ac.uk/~hg/
>
>

Received on Friday, 16 March 2012 21:04:48 UTC