- From: Holger Knublauch <holger@topquadrant.com>
- Date: Fri, 10 Apr 2015 16:45:21 +1000
- To: public-data-shapes-wg <public-data-shapes-wg@w3.org>
- Message-ID: <55277181.9070907@topquadrant.com>
On 4/10/15 4:35 PM, Dimitris Kontokostas wrote:
>
>
> On Fri, Apr 10, 2015 at 8:19 AM, Holger Knublauch
> <holger@topquadrant.com> wrote:
>
> On 4/10/2015 15:12, Dimitris Kontokostas wrote:
>
>
> I think you are referring to sh:valueShape and the
> sh:hasShape(?shape) function, right? I don't see any other case
> that could be problematic.
>
>
> Also sh:OrConstraint (or any similar template that we or users may
> want to add, such as negation and intersection).
>
>
> Why can't we move these into the validation engine? e.g. (SPARQL Q1)
> or/xor/... (SPARQL Q2)
>
> And sh:allowedValues (which takes a list or set of values, and
> those must reside somewhere, I guess they should reside with the
> shapes) - more generally, any template that takes rdf:List arguments
> that need to be walked at runtime.
>
>
> These should indeed reside in the shapes graph(s). Implementations
> could either pre-build the queries or build them at run-time.
> When we are working on immutable datasets (i.e. endpoints)
> pre-building the values in the queries would be the only option.
> Implementations with other use cases could optimize this.
Yes, we (as spec writers) can hard-code all these things. And we can
insert all kinds of flags and options on how the generated SPARQL
queries would work. But can we do better than this, and try to stay
generic so that end users have the same power? If we make the
assumption that sh:hasShape and GRAPH sh:ShapesGraph { ... } exist then
these issues are solved generically. We would need to compare the
relative costs of these approaches.
>
>
> In this case, I was waiting for some clear definition for
> recursion in order to make a proposal but I think we have many
> options to go with.
> For example: If the data and the constraints are in the same
> graph we can use the sh:hasShape() function you propose,
> otherwise use algorithm X to execute the ShEx validation in
> multiple steps or Algorithm Y to convert the ShEx shape into a
> (giant) SPARQL query similar to the ShEx 2 SPARQL [1].
>
>
> I don't think we should limit ourselves to the hard-coded
> built-ins of "ShEx" here - this should work with any user-defined
> template/macro too.
>
> If recursion is forbidden, things get much simpler and maybe -
> I need to work on this first to say for sure - ShEx shapes
> could be just treated as class shapes with an extra SPARQL filter.
>
> We need to have a clear definition of the ShEx shapes to see
> our options and we shouldn't limit the language design in advance.
>
> Proposed resolution: Shapes and data are expected to exist in
> different graphs unless specified otherwise
>
>
> Agreed. In some cases the graph called the shapes graph could be
> identical with the data graph though - it would just be accessed
> via a magic named graph name or GRAPH ?variable.
>
>
> Indeed, the user could specify that they are identical in many cases
> and implementations can optimize execution in these cases.
> But I think 'GRAPH ?variable' is an implementation detail; the spec
> should assume that the data graph cannot access the shapes graph, or
> provide alternative(s).
The engine that dispatches all these SPARQL queries could maybe be smart
enough to detect GRAPH sh:ShapesGraph { ... } and then somehow split the
original query into two. Or templates that require access to the shapes
graph could consist of two SPARQL queries to begin with, and pass the
results from the shapes query into the second "data" query. There are
all kinds of work-arounds like this, but I have not worked on such
things yet. Maybe we could create a solution that solves sh:shape,
sh:OrConstraint and sh:allowedValues and then assume that other problems
are of the same kind.
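
A rough sketch of the two-query variant, for the sh:allowedValues case
only. This is Python and purely illustrative: the function name, the
?this/?value variables and the example.org IRIs are hypothetical and
nothing here is defined by the WG. It assumes the first ("shapes") query
has already returned the allowed values as well-formed IRIs, and it only
builds the text of the second ("data") query with those values inlined.

```python
from typing import Iterable


def build_allowed_values_query(property_iri: str, allowed_iris: Iterable[str]) -> str:
    """Build the "data" query text from the results of a "shapes" query.

    allowed_iris stands in for the rows a shapes query returned, e.g. the
    members of an sh:allowedValues list. Hypothetical helper; it assumes
    every IRI is well-formed and does no escaping.
    """
    # De-duplicate and sort so the generated query text is deterministic.
    values = sorted(set(allowed_iris))
    in_list = ", ".join(f"<{iri}>" for iri in values)
    # Select every value of the property that is not among the allowed ones.
    return (
        "SELECT ?this ?value WHERE {\n"
        f"  ?this <{property_iri}> ?value .\n"
        f"  FILTER (?value NOT IN ({in_list}))\n"
        "}"
    )


if __name__ == "__main__":
    print(build_allowed_values_query(
        "http://example.org/status",
        ["http://example.org/Open", "http://example.org/Closed"],
    ))
```

The generated query never mentions the shapes graph, so it could be sent
to an endpoint that only holds the data. The price is that the values are
fixed when the query is built rather than walked at runtime.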
Holger
Received on Friday, 10 April 2015 06:45:55 UTC