Re: How do you explore a SPARQL Endpoint? from Kingsley Idehen on 2015-01-23 (public-lod@w3.org from January 2015)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Fri, 23 Jan 2015 09:05:48 -0500
To: public-lod@w3.org
Message-ID: <54C2553C.3080600@openlinksw.com>
On 1/23/15 4:37 AM, Pavel Klinov wrote:
> Alright, so this isn't an answer and I might be saying something
> totally silly (since I'm not a Linked Data person, really).
>
> If I re-phrase this question as the following: "how do I extract a
> schema from a SPARQL endpoint?", then it seems to pop up quite often
> (see, e.g., [1]). I understand that the original question is a bit
> more general but it's fair to say that knowing the schema is a huge
> help for writing meaningful queries.
>
> As an outsider, I'm quite surprised that there's still no commonly
> accepted (I'm avoiding "standard" here) way of doing this. People
> either hope that something like VoID or LOV vocabularies are being
> used, or use third-party tools, or write all sorts of ad hoc SPARQL
> queries themselves, looking for types, object properties,
> domains/ranges, etc. There are also papers written on this subject.
>
> At the same time, the database engines which host datasets often (not
> always) manage the schema separately from the data. There are good
> reasons for that. One reason, for example, is to be able to support
> basic reasoning over the data, or integrity validation. Although in
> RDF the schema language and the data language are the same, so
> schema and data triples can be interleaved, they need not be (and
> often are not) managed that way.
>
> Yet, there's no standard way of requesting the schema from the
> endpoint, and I don't quite understand why. There's the SPARQL 1.1
> Service Description, which could, in theory, cover it, but it doesn't.
> Servicing such schema extraction requests doesn't have to be mandatory
> so the endpoints which don't have their schemas right there don't have
> to sift through the data. Also, schemas are typically quite small.
>
> I guess there's some problem with this which I'm missing...

To cut a long story short, you are seeking an experience from one realm 
(SQL RDBMS Relational Tables) in another (RDF Relational 
Property/Predicate Graphs).

I'll try to break this issue down a little, as this problem has 
everything to do with poor and deteriorating narratives with regard to 
the nature of:

1. RDBMS Applications
2. Database Documents
3. Relations.

First off, an RDBMS [1] and a Database [2] are two distinct things. 
Contrary to the marketing-driven misinformation from SQL RDBMS [3] 
vendors (spanning 20 years), a Database is a Document. It isn't a 
conflation of RDBMS application (which provides interaction services) 
and Database Documents.

A Database is a document composed of Data.

Data is basically sets of tuples (values) representing entity 
relationships that are grouped by relationship types (a/k/a relations).

Relations can be represented as Records in a Table which is what you 
have in a SQL RDBMS. They can also be represented as Property/Predicate 
graphs.
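
To make the two representations concrete, here is a minimal sketch in 
Python. The "worksFor" relation, the column names, and the sample values 
are all made up for illustration; nothing here comes from a real dataset.

```python
from collections import defaultdict

# Hypothetical sample data: one relation ("worksFor") held as table records.
works_for_table = [
    {"person": "alice", "company": "acme"},
    {"person": "bob", "company": "acme"},
]


def table_to_triples(rows, relation, subject_col, object_col):
    """Restate table records as (subject, predicate, object) triples."""
    return [(row[subject_col], relation, row[object_col]) for row in rows]


def group_by_predicate(triples):
    """Group triples by predicate, giving one set of tuples per relation."""
    relations = defaultdict(set)
    for s, p, o in triples:
        relations[p].add((s, o))
    return dict(relations)


# The same facts, first as a graph of triples, then regrouped by relation.
triples = table_to_triples(works_for_table, "worksFor", "person", "company")
relations = group_by_predicate(triples)
```

Grouping the triples by predicate gives back one set of tuples per 
relation, which is the same information the table held.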

A predicate is a sentence-forming-relation [4]. This is basically what 
RDF is all about, hence the special role of rdf:Property [5] in this 
particular Language (system of signs, syntax, and 
entity-relationship-role-semantics -- for encoding and decoding 
information [data in some context] ).

So back to your fundamental quest, you want to interrogate an RDF RDBMS 
via SPARQL queries. That quest boils down to the following:

1. systematically determining the nature of entity relationships 
represented by RDF relations -- managed by a given RDBMS instance
2. using information obtained from step 1 to find instances of items of 
interest (as already outlined by Bernard Vatant's response [6] ).
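
The two steps above could be sketched roughly as follows, using only the 
Python standard library. This is one possible approach, not a recipe from 
Bernard's message. The function names, the LIMIT values, and any endpoint 
URL you pass in are assumptions, and the aggregate queries may be slow or 
time out against a large endpoint.

```python
import json
import urllib.parse
import urllib.request

# Step 1a: which relation types (predicates) are in use, and how often.
PREDICATES_QUERY = """
SELECT ?p (COUNT(*) AS ?uses)
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?uses)
LIMIT 100
"""

# Step 1b: which entity types are in use, and how many instances each has.
CLASSES_QUERY = """
SELECT ?type (COUNT(?s) AS ?instances)
WHERE { ?s a ?type }
GROUP BY ?type
ORDER BY DESC(?instances)
LIMIT 100
"""


def instances_query(type_iri, limit=25):
    """Step 2: list statements about instances of a type found in step 1."""
    return (
        f"SELECT ?s ?p ?o WHERE {{ ?s a <{type_iri}> ; ?p ?o }} "
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Build a SPARQL Protocol GET URL (assumes endpoint has no query string)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(endpoint, query, timeout=30):
    """Send the query and return the result bindings as a list of dicts."""
    request = urllib.request.Request(
        build_request_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

A typical session would call run_query with PREDICATES_QUERY and 
CLASSES_QUERY first, pick a type IRI from the results, and then pass 
instances_query(that_iri) to run_query to look at actual instances.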

I hope this helps.  In my strong personal opinion, SQL RDBMS vendor 
marketing has actually done the world a disservice over the years. 
Luckily, the emergence of the World Wide Web -- and the Linked Open Data 
cloud it has facilitated -- lays the foundation for fixing the 
aforementioned disservice. Naturally, we'll have to live with marketing 
gobbledegook like "Big Data" and other silliness for a while, but the 
truth (in the form of facts) will eventually bubble to the top!


Links:

[1] 
http://www.openlinksw.com/data/turtle/general/GlossaryOfTerms.ttl#RDBMS 
-- RDBMS
[2] 
http://www.openlinksw.com/data/turtle/general/GlossaryOfTerms.ttl#Database 
-- Database
[3] 
http://www.openlinksw.com/data/turtle/general/GlossaryOfTerms.ttl#SQLDBMS -- 
SQL RDBMS
[4] 
http://54.183.42.206:8080/sigma/Browse.jsp?lang=EnglishLanguage&flang=SUO-KIF&kb=SUMO&term=Predicate 
-- Predicate
[5] http://linkeddata.uriburner.com/c/8CCIWQ -- RDF Property
[6] https://lists.w3.org/Archives/Public/public-lod/2015Jan/0105.html -- 
Bernard Vatant response in prior thread
[7] 
http://kidehen.blogspot.com/2015/01/loosely-coupling-database-document.html 
-- Part 1 of a post I am working on about Data that starts with a CSV 
document.

-- 
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
Received on Friday, 23 January 2015 14:06:11 UTC