Re: W3C Data Cube spec uses SPARQL too

Actually, as I noted in my talk at the validation workshop, using SPARQL 
here was a mixed blessing.

On the plus side:

1. We were able to express the checks that we wanted to express.

2. Spec users are able to execute the checks using existing 
standards-compliant SPARQL engines (mostly, see below).

On the minus side:

a. Strictly speaking, SPARQL wasn't sufficient on its own without some 
surrounding control machinery. Rules IC-20 and IC-21 are SPARQL 
templates: you run one query, then for each result from that query you 
instantiate the template, then run all the resulting template instances.

b. This is not a human-readable way to communicate the constraints. Even 
the simple cases require thought to interpret, and IC-12 and IC-17 are 
seriously confusing [1].

c. There are some performance issues. The no-duplicates check IC-12 is 
quadratic (in the number of observations) as expressed in SPARQL and 
problematic for big data sets. Actual cube validators (at least the one 
the GLD WG put up) use custom code to implement this check in near-linear 
time; they don't use the SPARQL.
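
To make the near-linear alternative to IC-12 concrete, here is a minimal 
sketch of a hash-based duplicate check. It is not the code of any actual 
validator. The function name, the input shape (observation id, data set 
id, dimension values) and the single shared list of dimensions are all 
assumptions made for illustration.

```python
from collections import defaultdict


def find_duplicate_observations(observations, dimensions):
    """Group observations by data set and dimension values in one pass.

    Hypothetical input shape (an assumption, not taken from any validator):
    `observations` is an iterable of (obs_id, dataset_id, values), where
    `values` maps each dimension name to its value. `dimensions` is the
    list of dimension names to compare.

    Returns the groups of observation ids that share a data set and agree
    on every dimension, i.e. the duplicates.
    """
    groups = defaultdict(list)
    for obs_id, dataset_id, values in observations:
        # The key is hashable, so each observation costs one dict lookup
        # instead of a comparison against every other observation.
        key = (dataset_id, tuple(values.get(d) for d in dimensions))
        groups[key].append(obs_id)
    return [ids for ids in groups.values() if len(ids) > 1]


observations = [
    ("o1", "ds1", {"area": "A", "year": 2013}),
    ("o2", "ds1", {"area": "A", "year": 2014}),
    ("o3", "ds1", {"area": "A", "year": 2013}),  # same key as o1
    ("o4", "ds2", {"area": "A", "year": 2013}),  # different data set
]
duplicates = find_duplicate_observations(observations, ["area", "year"])
```

Each observation is touched once, so the expected cost grows roughly 
linearly with the number of observations, whereas a pairwise comparison 
grows quadratically.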

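The control machinery described in (a) can be sketched along the 
following lines. This is only an illustration under stated assumptions: 
`select` and `ask` are hypothetical callables standing in for whatever 
SPARQL engine is used, the `$p` placeholder is substituted by plain 
string templating, and the two query strings are simplified stand-ins, 
not the text of IC-20 or IC-21.

```python
from string import Template


def run_templated_check(select, ask, driver_query, template_text):
    """Run a driver query, then one template instance per result.

    `select(query)` is assumed to return a list of dicts binding "p";
    `ask(query)` is assumed to return a bool. Both are hypothetical
    wrappers around a SPARQL engine, supplied by the caller.
    """
    template = Template(template_text)
    outcomes = {}
    for binding in select(driver_query):
        # Instantiate the template by replacing $p with the bound value.
        instance = template.substitute(p=binding["p"])
        outcomes[binding["p"]] = ask(instance)
    return outcomes


# Simplified stand-ins for the real queries (assumptions for illustration).
DRIVER = "SELECT ?p WHERE { ?h qb:parentChildProperty ?p }"
TEMPLATE = "ASK { ?list qb:hierarchyRoot/<$p>* ?v }"

executed = []


def fake_select(query):
    # Pretend the store has a single parent/child property.
    return [{"p": "http://example.org/narrower"}]


def fake_ask(query):
    # Record the instantiated query instead of contacting an engine.
    executed.append(query)
    return False


outcomes = run_templated_check(fake_select, fake_ask, DRIVER, TEMPLATE)
```

The loop is the part that plain SPARQL cannot express by itself: the 
second query's text depends on the results of the first.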

Closed world OWL would not be able to express all of these 
constraints, but that does not rule it out of the picture.

I wouldn't make "ability to express the Data Cube integrity constraints" 
a requirement for the outcome of this WG (should it ever happen). As 
noted in earlier emails, communication of simple constraints (to both 
human and machine agents) is[2] the primary goal; the ability to express 
arcane constraints is a minor "also ran".

Dave

[1] At least they confuse me every time I come back to them and I wrote 
them.

[2] "is" in the sense that that seemed to be the outcome of the workshop 
and accords with experience in Government linked data (e.g. as already 
expressed by Paul to this group).


On 06/08/14 10:27, Richard Cyganiak wrote:
> Holger,
>
> You're right. There wasn't really any alternative, given that currently there is no other W3C-recommended technology that can be used for RDF validation. SPARQL has turned out to be a reasonably good fit for this particular use case.
>
> The Data Cube experience has left me thinking that a W3C-recommended RDF validation technology should have SPARQL semantics, but that a more compact and idiomatic syntax may be beneficial.
>
> There was no attempt to write these constraints in closed-world OWL at the time, because that's not a W3C standard, but it would be interesting to see if/how the constraints could be expressed that way.
>
> Best,
> Richard
>
>
>> On 6 Aug 2014, at 08:23, Holger Knublauch <holger@topquadrant.com> wrote:
>>
>> To those who are critical of using SPARQL as a way of expressing constraints, or as a way of writing specifications, please take a look at
>>
>>     http://www.w3.org/TR/vocab-data-cube/#wf-rules
>>
>> The Data Cube Vocabulary is a W3C Recommendation that defines its constraints in SPARQL. This is instantly executable.
>>
>> HTH
>> Holger
>>
>>
>

Received on Wednesday, 6 August 2014 10:40:58 UTC