Re: shapes-ISSUE-30 (shape-and-data-graphs): Are shapes and data in the same graph? [SHACL Spec] from Holger Knublauch on 2015-03-29 (public-data-shapes-wg@w3.org from March 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Sun, 29 Mar 2015 17:19:20 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <5517A778.5020701@topquadrant.com>

On 3/29/15 8:17 AM, Peter F. Patel-Schneider wrote:
>
> On 03/28/2015 02:59 PM, Richard Cyganiak wrote:
>> Peter,
>>
>>> On 28 Mar 2015, at 20:31, Peter F. Patel-Schneider
>>> <pfpschneider@gmail.com> wrote:
>>>
>> I am uncertain as to what access to the shape graph when validating the
>> data graph is supposed to mean.
>>
>> On one hand, it seems to me that if the shape graph is inaccessible then
>> there is no way that the shapes can be accessed and so no way to
>> validate the shapes, so it appears to me that all proposals need this.
>> On the other hand, I am unaware that any proposal needs the shape graph
>> to be part of the data graph.
>>
>>> The treatment of sh:allowedValues in the SHACL draft [1][2] requires
>>> that shapes and data be in the same graph, as it relies on querying the
>>> sh:member triples.
>>> An alternative would be to pass the allowed values in as pre-bound
>>> variables, as done for other constructs. But there is the complication
>>> that it’s more than one.
> I think that a treatment that put the allowed values in the SPARQL query
> would be better.

Generating something like a SPARQL IN filter would only perform well for
small enumerations. I also chose an sh:Set instead of an rdf:List to get
set-based lookup performance. Finally, pointing to an sh:Set simplifies
sharing enumerations and adding new values to existing sets. Think of
large reference data code lists.
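
As a rough illustration of the difference (a toy Python sketch, not the
mechanism in the SHACL draft): the graph is modelled as a plain set of
triples, and ex:Colors, in_filter and is_member are hypothetical names.
The generated IN filter grows with the enumeration, while the sh:Set
check is a single sh:member triple lookup.

```python
# Toy model: a graph is a set of (subject, predicate, object) string triples.
SH_MEMBER = "sh:member"
EX_COLORS = "ex:Colors"  # hypothetical sh:Set holding the allowed values

shapes_graph = {
    (EX_COLORS, SH_MEMBER, "ex:Red"),
    (EX_COLORS, SH_MEMBER, "ex:Green"),
    (EX_COLORS, SH_MEMBER, "ex:Blue"),
}


def in_filter(graph, set_node, var="?value"):
    """Build a SPARQL IN filter listing every member; the text grows with the set."""
    members = sorted(o for s, p, o in graph if s == set_node and p == SH_MEMBER)
    return f"FILTER ({var} IN ({', '.join(members)}))"


def is_member(graph, set_node, value):
    """Check one sh:member triple; a hash lookup regardless of set size."""
    return (set_node, SH_MEMBER, value) in graph


print(in_filter(shapes_graph, EX_COLORS))
print(is_member(shapes_graph, EX_COLORS, "ex:Red"))
```

Adding a value to the shared set is one extra triple, and every
constraint pointing to ex:Colors sees it without any query text being
regenerated.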

Another aspect is that having to produce and parse new SPARQL strings 
repeatedly is very slow. Pre-binding a variable in an already parsed 
SPARQL query object is usually much faster.
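
A minimal sketch of the parse-once, bind-many idea, assuming a toy
stand-in for a SPARQL engine: parse_pattern and evaluate are
hypothetical helpers that handle a single triple pattern, not a real
SPARQL API. The pattern is parsed once and then evaluated with a
different pre-bound ?this on each call.

```python
def parse_pattern(text):
    """Stand-in for a SPARQL parser: split one triple pattern into terms."""
    parse_pattern.calls += 1
    subject, predicate, obj = text.split()
    return (subject, predicate, obj)


parse_pattern.calls = 0  # counts how often the (expensive) parse step runs


def evaluate(pattern, graph, bindings):
    """Match a parsed pattern against the graph, honouring pre-bound variables."""
    # Substitute each pre-bound variable; unbound variables stay as wildcards.
    terms = [bindings.get(t, t) for t in pattern]
    return [
        triple
        for triple in graph
        if all(t.startswith("?") or t == v for t, v in zip(terms, triple))
    ]


data_graph = {
    ("ex:a", "ex:color", "ex:Red"),
    ("ex:b", "ex:color", "ex:Pink"),
}

# Parse once, then reuse the parsed object with a different binding per call.
parsed = parse_pattern("?this ex:color ?value")
for focus in ("ex:a", "ex:b"):
    print(focus, evaluate(parsed, data_graph, {"?this": focus}))
```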

In my current design, templates can set a flag to indicate whether they 
also need to see the graph containing the constraint definitions in 
their WHERE clause. This means that most operations can be very fast, 
and only access the query graph. Other operations may need to see the 
constraints graph too, and these may be slower.
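
The flag could look roughly like this. It is a toy Python sketch with
graphs as sets of triples; Template, needs_shapes_graph and run are
assumed names for illustration, not the actual design. Only templates
that set the flag pay for seeing the constraints graph as well.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]


@dataclass
class Template:
    """Hypothetical constraint template with a check over the visible triples."""
    name: str
    check: Callable[[Graph, str], bool]
    needs_shapes_graph: bool = False  # assumed name for the flag


def run(template: Template, data: Graph, shapes: Graph, focus: str) -> bool:
    """Expose the shapes graph to the check only when the template asks for it."""
    visible = data | shapes if template.needs_shapes_graph else data
    return template.check(visible, focus)


data = {("ex:a", "ex:color", "ex:Red")}
shapes = {("ex:Colors", "sh:member", "ex:Red")}

# Needs only the data: does the focus node have any ex:color value?
has_color = Template(
    "hasColor",
    lambda g, f: any(s == f and p == "ex:color" for s, p, _ in g),
)

# Needs the shapes graph too: is every ex:color value a member of ex:Colors?
allowed = Template(
    "allowedColor",
    lambda g, f: all(
        ("ex:Colors", "sh:member", o) in g
        for s, p, o in g
        if s == f and p == "ex:color"
    ),
    needs_shapes_graph=True,
)

print(run(has_color, data, shapes, "ex:a"))
print(run(allowed, data, shapes, "ex:a"))
```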

Holger

Received on Sunday, 29 March 2015 07:19:53 UTC