RE: How do you explore a SPARQL Endpoint? from Mark Wallace on 2015-01-22 (semantic-web@w3.org from January 2015)

From: Mark Wallace <mwallace@modusoperandi.com>
Date: Thu, 22 Jan 2015 18:56:33 +0000
To: Nandana Mihindukulasooriya <nmihindu@fi.upm.es>, Bernard Vatant <bernard.vatant@mondeca.com>
CC: Semantic Web <semantic-web@w3.org>, public-lod public <public-lod@w3.org>
Message-ID: <DM2PR08MB398BA60B7490DCFC60D8632C3490@DM2PR08MB398.namprd08.prod.outlook.com>

I agree with most of these as good exploratory queries, but be very careful about using the bottom-up approaches on endpoints that you don’t own. They can be very hard on the endpoint’s triple store. I like Bernard Vatant’s approach of trying to find a declared data model first. If a model is there, such queries should tax the endpoint’s triple store far less than the large scans that the bottom-up queries can trigger.
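
One low-cost way to check for a declared model before running anything heavier is an ASK query, which only needs a single match. A minimal sketch, assuming the model (if there is one) is typed with owl:Class or rdfs:Class:

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# True if at least one declared class exists in the queried dataset.
ASK { { ?class a owl:Class } UNION { ?class a rdfs:Class } }
```

If this returns false, adding a small LIMIT to the bottom-up queries may be a gentler next step than running them unbounded, though how much it helps depends on the store.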

--
Mark Wallace
Principal Engineer, Semantic Applications
MODUS OPERANDI, INC.


From: Nandana Mihindukulasooriya [mailto:nmihindu@fi.upm.es]
Sent: Thursday, January 22, 2015 11:01 AM
To: Bernard Vatant
Cc: Semantic Web; public-lod public
Subject: Re: How do you explore a SPARQL Endpoint?

Maybe not just looking at the classes and properties, but also looking at their frequencies using counts, can give a better idea of what sort of data is exposed. If there is VoID information, it certainly helps. Tools such as http://data.aalto.fi/visu also help. A similar approach is described here [1].
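
A frequency query along these lines could look like the sketch below. It assumes the endpoint supports SPARQL 1.1 aggregates; the LIMIT of 50 is an arbitrary choice for illustration.

```sparql
# Count instances per class, most frequent classes first.
SELECT ?class (COUNT(?s) AS ?instances)
WHERE { ?s a ?class }
GROUP BY ?class
ORDER BY DESC(?instances)
LIMIT 50
```

Note that this still has to visit every rdf:type triple, so it can be expensive on a large dataset.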

Best Regards,
Nandana

[1] - http://ceur-ws.org/Vol-782/PresuttiEtAl_COLD2011.pdf


On Thu, Jan 22, 2015 at 4:25 PM, Bernard Vatant <bernard.vatant@mondeca.com> wrote:
Interesting to note that the answers so far are converging towards looking first for types and predicates bottom-up from the data, rather than querying for a declared model layer using RDFS or OWL, e.g.:

SELECT DISTINCT ?class
WHERE { {?class a owl:Class} UNION {?class a rdfs:Class} }

SELECT DISTINCT ?property ?domain ?range
WHERE { {?property rdfs:domain ?domain} UNION {?property rdfs:range ?range} }

Which means that, overall, you don't expect the SPARQL endpoint to expose a formal model along with the data.
That said, if the model is exposed with the data, the values of rdf:type will contain e.g., rdfs:Class and owl:Class ...
Of course in the ideal situation where you have an ontology, the following would bring its elements.
SELECT DISTINCT ?o ?x ?type
WHERE { ?x rdf:type ?type .
        ?x rdfs:isDefinedBy ?o .
        ?o a owl:Ontology }
It's worth trying, because if the dataset you query is really big, it will be faster to look first for a declared model than to ask for all distinct rdf:type values.


2015-01-22 15:23 GMT+01:00 Alfredo Serafini <seralf@gmail.com>:
Hi

the most basic query is the usual query for concepts, something like:

SELECT DISTINCT ?concept
WHERE {
?uri a ?concept.
}

then, given a specific concept, you can infer from the data which predicates/properties are used with it:
SELECT DISTINCT ?prp
WHERE {
[] ?prp <a-concept>.
}
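
The pattern above matches predicates whose object is the concept itself. To list the predicates used on instances of the concept instead, a variant could look like this sketch, where <a-concept> is the same placeholder and the LIMIT is an arbitrary illustrative value:

```sparql
# Predicates that appear on resources typed as the given concept.
SELECT DISTINCT ?prp
WHERE {
  ?s a <a-concept> ;
     ?prp ?o .
}
LIMIT 100
```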

and so on...

Apart from other, more complex queries (here we are of course omitting a lot of important things), these two "patterns" are usually the most useful as a starting point, for me.



2015-01-22 15:09 GMT+01:00 Juan Sequeda <juanfederico@gmail.com>:
Assume you are given a URL for a SPARQL endpoint. You have no idea what data is being exposed.

What do you do to explore that endpoint? What queries do you write?

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com



--
Bernard Vatant
Vocabularies & Data Engineering
Tel :  + 33 (0)9 71 48 84 59
Skype : bernard.vatant
http://google.com/+BernardVatant

--------------------------------------------------------
Mondeca
35 boulevard de Strasbourg 75010 Paris
www.mondeca.com
Follow us on Twitter : @mondecanews
----------------------------------------------------------

Received on Thursday, 22 January 2015 18:57:12 UTC