Re: ISSUE-95: Template Simplifications from Holger Knublauch on 2015-10-29 (public-data-shapes-wg@w3.org from October 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 30 Oct 2015 09:13:39 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <5632A823.3090904@topquadrant.com>

On 10/30/2015 5:30, Peter F. Patel-Schneider wrote:
> On 10/28/2015 10:29 PM, Holger Knublauch wrote:
>> On 10/29/2015 14:14, Peter F. Patel-Schneider wrote:
> [...]
>>> 8.1
>>>
>>> The SPARQL queries linked to a scope via sh:sparql must be of the query form
>>> SELECT, or a fragment that produces a valid SELECT query if wrapped by
>>> SELECT ?this WHERE { ... }. The SELECT queries must project to the result
>>> variable ?this.
>>> The SELECT queries must also be executable when converted to an ASK query
>>> and with a pre-bound value for ?this. The set of bindings for ?this that
>>> return true for such ASK queries must be identical to the set produced by
>>> the SELECT query. This constraint makes sure that engines can validate
>>> whether a given shape applies to a given focus node as part of the
>>> validateNode operation.
>>> ->
>>> The SPARQL queries linked to a scope via sh:sparql must be of the query form
>>> SELECT ?this WHERE { ... }.
>> Ok, I could live without allowing the fragments, for simplification purposes.
>>
>> The reason for the second paragraph (on the pre-bound variable for ?this) is
>> the validation of individual nodes. For example, when someone has a shape with
>> a custom scope and you have ex:MyInstance, then the algorithm to determine
>> whether the shape applies to the instance can be much more efficient than
>> having to evaluate the whole scope and check whether the result set contains
>> ex:MyInstance. The latter would become prohibitively slow for large databases.
>>
>> Do you have examples of scopes where that restriction would be an obstacle?
>> The (few) examples of custom scopes that I have seen were easily convertible
>> into ASK queries without changing the WHERE clause.
>>
>> An alternative design to dropping this bidirectionalism would be to have an
>> optional second property sh:inverseSPARQL that can be put to a scope for those
>> cases where the original scope query cannot be converted to ASK. I would be OK
>> with that.
> It seems to me that the slowness is only a real problem when looping through
> the values for included constraints.  However then it is only a problem
> because of a particular implementation.  There are other ways to do the
> control of included constraints that would not require conversion to ASK and
> running the query multiple times.

I am not sure whether we are talking about the same thing. Let's take an 
example. Suppose the scope of ex:MyShape includes all instances that have 
ex:bornIn ex:USA:

ex:MyShape
     sh:scope [
         sh:sparql """
                 SELECT ?this
                 WHERE {
                     ?this ex:bornIn ex:USA .
                 }
             """
     ] .

Now, assume you have a given instance ex:John and want to validate all 
constraints relevant to him. Without the pre-binding requirement, we 
would need to evaluate the SELECT query completely and then check 
whether ex:John appears among the bindings for ?this. Only then does 
ex:MyShape apply. If you have millions of instances with ex:bornIn 
ex:USA, this is not workable. What I propose is that the query should 
also be evaluable with ?this pre-bound to ex:John, so that it 
essentially becomes

ASK {
     ex:John ex:bornIn ex:USA .
}

which is a much faster SPARQL query. There may be cases where this 
conversion is not simple, but I would like to see real-world examples 
that would break this convention.
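
To make the conversion concrete, here is a minimal Python sketch of the 
rewriting for the simple case above. It is only an illustration under 
stated assumptions: the function name select_to_ask is hypothetical, the 
scope query is assumed to have exactly the form SELECT ?this WHERE { ... }, 
and ?this is substituted textually instead of being pre-bound by a SPARQL 
engine. A real engine would work on the parsed query, and plain text 
substitution can give wrong results for queries with subqueries, BIND, or 
string literals that contain "?this".

```python
import re


def select_to_ask(select_query: str, focus_node: str) -> str:
    """Naively turn 'SELECT ?this WHERE { ... }' into 'ASK { ... }'.

    Every occurrence of the variable ?this in the WHERE clause is replaced
    by focus_node (an IRI or prefixed name, e.g. 'ex:John').
    This is a textual sketch, not a SPARQL parser.
    """
    match = re.fullmatch(
        r"\s*SELECT\s+\?this\s+WHERE\s*(\{.*\})\s*",
        select_query,
        flags=re.DOTALL | re.IGNORECASE,
    )
    if match is None:
        raise ValueError("expected a query of the form SELECT ?this WHERE { ... }")
    where_clause = match.group(1)
    # Replace ?this only as a whole variable name, so ?thisOther is untouched.
    # A function is used as the replacement so that backslashes in
    # focus_node are not interpreted as regex escapes.
    bound_clause = re.sub(r"\?this\b", lambda _: focus_node, where_clause)
    return "ASK " + bound_clause


scope_query = """
    SELECT ?this
    WHERE {
        ?this ex:bornIn ex:USA .
    }
"""
print(select_to_ask(scope_query, "ex:John"))
```

The printed query asks only whether ex:John ex:bornIn ex:USA holds, so 
the engine does not have to enumerate every instance in the scope first.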

Holger

>
> I don't know whether the conversion is going to be all that difficult.
> However, SPARQL is a complex language and I think that requiring even partial
> SPARQL processing is something that should not be required if not absolutely
> necessary.
>
> peter
Received on Thursday, 29 October 2015 23:14:18 UTC