Re: W3C Data Cube spec uses SPARQL too

Dave,

I think your experience with SPARQL is very informative and 
representative. The benefits of using SPARQL are that it is a W3C 
standard, has well-defined semantics, and is executable. These 
characteristics make it a good foundation for other specifications since 
they enable the faster development of reference implementations. Reference 
implementations must be correct, but they are not expected either to 
perform well or to scale to large data volumes. Of course, specifications 
should also take into consideration performance and scalability 
requirements; otherwise they may not be of much practical importance. So as 
long as a specification has an acceptable computational complexity, the 
work of developing high-performance, scalable implementations becomes a 
software engineering exercise.

Regards, 
___________________________________________________________________________
Arthur Ryman, PhD

Chief Data Officer, Rational
Chief Architect, Portfolio & Strategy Management
Distinguished Engineer | Master Inventor | Academy of Technology

Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)





From:   Dave Reynolds <dave.e.reynolds@gmail.com>
To:     public-rdf-shapes@w3.org, 
Date:   08/06/2014 06:41 AM
Subject:        Re: W3C Data Cube spec uses SPARQL too



Actually, as I noted in my talk at the validation workshop, using SPARQL 
here was a mixed blessing.

On the plus side:

1. We were able to express the checks that we wanted to express.

2. Spec users are able to execute the checks using existing 
standards-compliant SPARQL engines (mostly, see below).

On the minus side:

a. Strictly, SPARQL wasn't sufficient on its own without some surrounding 
control machinery. Rules IC-20 and IC-21 are SPARQL templates - you run 
one query then for each result from that query you instantiate the 
template, then run all the resulting template instances.

b. This is not a human-readable way to communicate the constraints. Even 
the simple cases require thought to interpret, and IC-12 and IC-17 are 
seriously confusing [1].

c. There are some performance issues. The no-duplicates check IC-12 is 
quadratic (in the number of observations) as expressed in SPARQL and 
problematic for big data sets. Actual cube validators (at least the one 
the GLD WG put up) use custom code to implement this check near 
linearly; they don't use the SPARQL.
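
To illustrate point (c), here is a minimal Python sketch of the two ways 
a no-duplicates check can be structured. It is not the GLD WG validator's 
code. The dict layout, the "id" key, and the function names are 
assumptions made for illustration, and all observations are assumed to 
belong to a single data set.

```python
from collections import defaultdict
from itertools import combinations


def duplicates_pairwise(observations, dimensions):
    """Quadratic check: compare every pair of observations."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(observations, 2)
        if all(a[d] == b[d] for d in dimensions)
    ]


def duplicates_hashed(observations, dimensions):
    """Near-linear check: group observations by their dimension values."""
    groups = defaultdict(list)
    for obs in observations:
        # Hypothetical layout: each observation is a dict keyed by dimension name.
        key = tuple(obs[d] for d in dimensions)
        groups[key].append(obs["id"])
    # Any group with more than one member shares all dimension values.
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample data: o1 and o3 agree on every dimension.
observations = [
    {"id": "o1", "area": "UK", "year": 2013, "value": 10},
    {"id": "o2", "area": "UK", "year": 2014, "value": 12},
    {"id": "o3", "area": "UK", "year": 2013, "value": 11},
]
dimensions = ["area", "year"]

pairwise_result = duplicates_pairwise(observations, dimensions)
hashed_result = duplicates_hashed(observations, dimensions)
```

The pairwise version mirrors a query that joins every observation 
against every other one. The hashed version makes one pass and relies on 
a hash table, which is roughly what "near linearly" suggests.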


Closed-world OWL would not be able to express all of these 
constraints, but that does not rule it out of the picture.

I wouldn't make "ability to express the Data Cube integrity constraints" 
a requirement for the outcome of this WG (should it ever happen). As 
noted in earlier emails, communication of simple constraints (to both 
human and machine agents) is[2] the primary goal, ability to express 
arcane constraints is a minor "also ran".

Dave

[1] At least they confuse me every time I come back to them and I wrote 
them.

[2] "is" in the sense that that seemed to be the outcome of the workshop 
and accords with experience in Government linked data (e.g. as already 
expressed by Paul to this group).


On 06/08/14 10:27, Richard Cyganiak wrote:
> Holger,
>
> You're right. There wasn't really any alternative, given that currently
> there is no other W3C-recommended technology that can be used for RDF
> validation. SPARQL has turned out to be a reasonably good fit for this
> particular use case.
>
> The Data Cube experience has left me thinking that a W3C-recommended RDF
> validation technology should have SPARQL semantics, but that a more
> compact and idiomatic syntax may be beneficial.
>
> There was no attempt to write these constraints in closed-world OWL at
> the time, because that's not a W3C standard, but it would be interesting
> to see if/how the constraints could be expressed that way.
>
> Best,
> Richard
>
>
>> On 6 Aug 2014, at 08:23, Holger Knublauch <holger@topquadrant.com> wrote:
>>
>> To those who are critical of using SPARQL as a way of expressing
>> constraints, or as a way of writing specifications, please take a look at
>>
>>     http://www.w3.org/TR/vocab-data-cube/#wf-rules
>>
>> The Data Cube Vocabulary is a W3C Recommendation that defines its
>> constraints in SPARQL. This is instantly executable.
>>
>> HTH
>> Holger
>>
>>
>

Received on Wednesday, 6 August 2014 13:14:36 UTC