Re: Selected problems with Proposal 4 from Peter F. Patel-Schneider on 2016-04-14 (public-data-shapes-wg@w3.org from April 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Thu, 14 Apr 2016 16:57:29 -0700
To: Holger Knublauch <holger@topquadrant.com>, "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
Message-ID: <57102E69.3070403@gmail.com>
Some more responses:

> 7) Paths can already be handled (in a very controlled form) using
> sh:valueShape and derived values.

Some aspects of paths can indeed be handled using nested property
constraints.  Some other aspects of paths can also be handled using the
derived values extension mechanism.  However, many simple things that one
might want to do cannot be handled this way.  For example, as far as I can
tell, counting up grandchildren is not possible without paths.
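
To make the grandchildren example concrete, here is a minimal sketch in
plain Python (no RDF library) of counting the values reached through a
two-step path.  The toy triples, the ex: names, and the helpers follow_path
and count_values are made up for illustration; they are not part of any
proposal or implementation.

```python
# Toy data graph as a set of (subject, predicate, object) triples.
# All names here are hypothetical, for illustration only.
DATA = {
    ("ex:alice", "ex:child", "ex:bob"),
    ("ex:alice", "ex:child", "ex:carol"),
    ("ex:bob", "ex:child", "ex:dave"),
    ("ex:carol", "ex:child", "ex:erin"),
    ("ex:carol", "ex:child", "ex:frank"),
}

def follow_path(triples, start, path):
    """Return the set of nodes reached from start by following the
    predicates in path, one step at a time."""
    nodes = {start}
    for predicate in path:
        nodes = {o for (s, p, o) in triples if s in nodes and p == predicate}
    return nodes

def count_values(triples, focus, path):
    """Number of distinct value nodes for focus along path."""
    return len(follow_path(triples, focus, path))

# Grandchildren are the values of the two-step path ex:child/ex:child.
grandchildren = count_values(DATA, "ex:alice", ["ex:child", "ex:child"])
```

A cardinality check on grandchildren would then compare this count against
its bound, which is exactly what a count component applied to a path does.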

> 9) Path expressions cause a lot of new complexity, computationally,
> syntactically, for SPARQL generation etc.

My implementation of paths amounts to very little more than the following two
functions:

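# Assumed setup: rdflib, with SH bound to the SHACL namespace.
from rdflib import Literal, Namespace, RDF

SH = Namespace("http://www.w3.org/ns/shacl#")

# listElements(g, node) is a helper (not shown here) that returns the
# members of the RDF list starting at node.

def parttoSPARQL(g, part):
    # One step of a path: ^p for an inverse, otherwise the property itself.
    if (part, SH.inverse, None) in g:
        return "^" + g.value(part, SH.inverse).n3()
    return part.n3()

def pathtoSPARQL(g, value):
    # Translate a path (a property, an inverse, or a list of these).
    if value == RDF.nil:
        print("EMPTY PATH")
        return ""
    if (value, RDF.rest, None) in g:
        path = [parttoSPARQL(g, part) for part in listElements(g, value)]
        return Literal("/".join(path))
    elif (value, SH.inverse, None) in g:
        return Literal(parttoSPARQL(g, value))
    else:
        return value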

> 10) Path expressions make static analysis (for things like form generation and
> structural checking of a shapes model) almost impossible. If an arbitrary path
> can show up where we previously only had simple predicates, then a lot of
> extra checking and branching needs to happen to make sense of the situation.

To validate shapes graphs, I have a metamodel.  The path part of this
metamodel is

shmm:pathShape a sh:Shape ;
  sh:or ( shmm:pathPartShape
          [ a sh:Shape ; sh:list shmm:pathPartShape ] ) .
shmm:pathPartShape a sh:Shape ;
 sh:or ( shmm:inverseShape         # inverse of a property
         [ a sh:Shape ; sh:nodeKind sh:IRI ] ) . # property
shmm:inverseShape a sh:Shape ;
 sh:propValues ( sh:inverse [ a sh:Shape ; sh:nodeKind sh:IRI ;
                  sh:minCount 1 ; sh:maxCount 1 ] ) .

For form generation, a minimal implementation would check whether the path
is a single property or a single inverse property and generate a form field
only in that case.  This would involve only a tiny amount of code.  Handling
longer paths in forms does require more work, of course, but I don't see
that it can be much more work.
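
A minimal sketch of that check, in plain Python.  The in-memory path
representation (a string for a property, a dict with the single key
"inverse" for an inverse property, a list for a longer path) and the name
simple_form_field are assumptions made for this example only.

```python
def simple_form_field(path):
    """Return (property, direction) when path is a single property or a
    single inverse property; return None for anything more complex."""
    if isinstance(path, str):
        return (path, "forward")
    if isinstance(path, dict) and set(path) == {"inverse"}:
        return (path["inverse"], "inverse")
    # Longer paths: a minimal form generator simply skips them.
    return None
```

A form generator built this way behaves exactly as it would without paths
whenever the path is a plain property, and ignores everything else.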

> 12) Some constraint types require different SPARQL queries (or JavaScript or
> whatever) depending on the direction of a property (or even worse, for an
> arbitrary path). For example sh:minCount needs to count subjects versus
> objects. Proposal 4 does not even talk about this and no example of SPARQL
> generation is given. Not all constraint types are of the simple allValuesFrom
> pattern implemented by NodeValidationFunctions.

In this proposal components do not behave differently depending on whether
they are in the analogue of a node constraint, a property constraint, or an
inverse property constraint.  A simple examination of the informal semantics
is sufficient to see this.

An implementation that works via a translation to SPARQL does need to have
the path in the query generated for a component.  My implementations handle
this by making the path BGP (and some other stuff) available during query
generation.  Simple components (like sh:classIn, for example) can just
produce a filter or a pattern that is combined with the path BGP in a
standard way.  Other components (like sh:minCount) do need to do something
more complex, but all that they need to do is to put the path BGP, whatever
it is, in a particular spot in the query that they generate.
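
The following is a rough sketch of that idea, not the actual implementation.
It assumes the path has already been translated to a SPARQL property path
string, and uses a hypothetical scope_bgp argument to stand in for the
"other stuff" that selects focus nodes.  The function names and the query
shapes are illustrative assumptions; prefix declarations are omitted.

```python
def path_bgp(path, value_var="?value"):
    """Pattern binding value_var to each value of ?this along path."""
    return "?this %s %s ." % (path, value_var)

def filter_query(scope_bgp, path, filter_expr):
    """Simple component: combine its filter with the path BGP in a
    standard way, reporting values that fail the filter."""
    return (
        "SELECT ?this ?value\n"
        "WHERE {\n"
        "  %s\n"
        "  %s\n"
        "  FILTER (!(%s))\n"
        "}"
    ) % (scope_bgp, path_bgp(path), filter_expr)

def min_count_query(scope_bgp, path, min_count):
    """Counting component: the path BGP goes inside an OPTIONAL so that
    focus nodes with no values at all are still counted (as zero)."""
    return (
        "SELECT ?this (COUNT(DISTINCT ?value) AS ?count)\n"
        "WHERE {\n"
        "  %s\n"
        "  OPTIONAL { %s }\n"
        "}\n"
        "GROUP BY ?this\n"
        "HAVING (COUNT(DISTINCT ?value) < %d)"
    ) % (scope_bgp, path_bgp(path), min_count)

# The same generator works for a property, an inverse, or a longer path.
forward = min_count_query("?this a ex:Person .", "ex:child", 1)
inverse = min_count_query("?this a ex:Person .", "^ex:parent", 1)
longer = min_count_query("?this a ex:Person .", "ex:child/ex:child", 2)
```

Neither generator inspects the path; each only decides where the path
pattern is placed in the query.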

> 15) With single-parameter constraint types, and the need to use reified
> objects or list parameters whenever you need to pass in multiple values
> instead, the labeltemplate and sh:message templates become useless as there is
> no general mechanism to access the nested parameter values. They just become
> random objects and lists.

When the parameter to a component is an arbitrary complex object it can be
picked apart using paths. For example, my template implementation uses

  sh:propValues ( ( rdf:rest rdf:first )
                  [ a sh:Shape ; sh:argumentName "shape" ] ) ;

to access the shape of a list-based sh:propValues component.

If the parameter is just a node with properties then the current mechanism can
be used.  Instead of getting the property value from the constraint, get the
property value from the parameter.

> 16) If multiple parameters are needed, the problem of defining and using them
> is just shifted by one level. For example, proposal 3 has a uniform and
> integrated syntax to define parameters. If you just point at an object then
> you need to talk (elsewhere) about the constraints on those objects. This is
> inconsistent, verbose, unmaintainable and not user friendly at all.

If the parameter is more than just a single node then there can be a need to
access pieces of it analogous to the need to access values of different
properties in a constraint.  The uniform way to do this is the same as shown
above.

Here is the portion of my template implementation of list-based
sh:propValues that defines the syntax of sh:propValues and extracts the
pieces of it.  This is the most syntactically complex core component of them
all, I believe.

sh:propValues
  sh:list [ a sh:Shape ] ;
  sh:propValues ( rdf:first
                  [ a sh:Shape ; sh:shape shmm:pathShape ;
                    sh:argumentName "path" ] ) ;
  sh:propValues ( ( rdf:rest rdf:first )
                  [ a sh:Shape ; sh:class sh:Shape ;
                    sh:argumentName "shape" ] ) ;
  sh:propValues ( ( rdf:rest rdf:rest ) [ a sh:Shape ; sh:hasValue rdf:nil ] ) ;
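
As an illustration of what this extraction amounts to, here is a minimal
sketch in plain Python over a toy set of triples.  The blank node labels,
the ex: names, and the helpers values_along and extract_arguments are
hypothetical; only the three paths (rdf:first, rdf:rest rdf:first, and
rdf:rest rdf:rest) come from the definition above.

```python
# A two-element list ( ex:child ex:PersonShape ) as plain triples.
SHAPES = {
    ("_:l1", "rdf:first", "ex:child"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", "ex:PersonShape"),
    ("_:l2", "rdf:rest", "rdf:nil"),
}

# Argument name -> path from the parameter node to the argument value.
ARGUMENT_PATHS = {
    "path": ["rdf:first"],
    "shape": ["rdf:rest", "rdf:first"],
}

def values_along(triples, node, path):
    """Nodes reached from node by following the predicates in path."""
    nodes = {node}
    for predicate in path:
        nodes = {o for (s, p, o) in triples if s in nodes and p == predicate}
    return nodes

def extract_arguments(triples, parameter):
    """Pick the named arguments out of a list-valued parameter."""
    arguments = {}
    for name, path in ARGUMENT_PATHS.items():
        found = values_along(triples, parameter, path)
        if len(found) != 1:
            raise ValueError("expected exactly one value for %r" % name)
        arguments[name] = next(iter(found))
    # The list must end after its second element.
    if values_along(triples, parameter, ["rdf:rest", "rdf:rest"]) != {"rdf:nil"}:
        raise ValueError("parameter is not a two-element list")
    return arguments

arguments = extract_arguments(SHAPES, "_:l1")
```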

> 19) If you need parameter objects to pass in multiple values, every SPARQL
> implementation of such a constraint type will first need to start with a block
> to retrieve all the real parameters that are nested in the object or list.
> Compare:
>
> WHERE {
>     GRAPH $shapesGraph {
>         $myParam ex:value1 ?value1 .
>         OPTIONAL {
>             $myParam ex:value2 ?value2 .
>         }
>     }
>     $this $predicate ?object .
>     FILTER (doSomething(?object, ?value1) || (bound(?value2) &&
> soSomethingElse(?object, ?value2))
> }
>
> versus the current syntax:
>
> WHERE {
>     $this $predicate ?object .
>     FILTER (doSomething(?object, $value1) || (bound(?value2) &&
> soSomethingElse(?object, $value2))
> }

All that is needed is a way to access pieces of the parameter at query
generation time.  If the parameter is a node with property values then the
mechanisms currently in SHACL, retargeted to the parameter, will suffice.
If the parameter is a different kind of structure, the mechanisms that I have
described above are adequate.  Here is the complete generation template for
sh:disjoint taking a list of two properties:

  sh:propValues ( rdf:first [ a sh:Shape ; sh:argumentName "property1" ] ) ;
  sh:propValues ( ( rdf:rest rdf:first )
                  [ a sh:Shape ; sh:argumentName "property2" ] ) ;
  sh:templatePattern
    """?this [property1] ?value1 . ?this [property2] ?value1 .""" .
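
How the bracketed argument names get filled in at query generation time is
not spelled out here.  One plausible reading, sketched below in plain
Python, is a plain textual substitution of each [name] by the extracted
argument.  The function instantiate and the ex: property names are
assumptions for illustration, not the actual implementation.

```python
import re

def instantiate(template, arguments):
    """Replace each [name] in template with arguments[name].
    An unknown name raises KeyError rather than being left in place."""
    def substitute(match):
        return arguments[match.group(1)]
    return re.sub(r"\[(\w+)\]", substitute, template)

pattern = "?this [property1] ?value1 . ?this [property2] ?value1 ."
bgp = instantiate(pattern, {"property1": "ex:father",
                            "property2": "ex:mother"})
# bgp == "?this ex:father ?value1 . ?this ex:mother ?value1 ."
```

Because both triple patterns share ?value1, the instantiated pattern matches
exactly when the two properties have a value in common, which is the
situation sh:disjoint has to detect.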

> 21) Proposal 4 separates the "shape" of a constraint type from its actual
> definition. This is verbose and harder to maintain. Proposal 3 handles this
> much more elegantly, where the constraint type itself doubles as a shape, and
> sh:parameter is basically a property constraint (pending the choice of various
> options). No need for separate shapes.

Template constraints can have very simple definitions.  Here is the
entirety of the template implementation for sh:directType, including a shape
(shmm:directTypeShape) that checks whether uses of sh:directType have the
correct syntax.  The template property here plays multiple roles: it is the
property that is used in shapes, it is the shape that defines what a
correct use of the component looks like, it has the SPARQL code, and it has
the information needed to generate messages, etc.

sh:directType a rdf:Property ; a sh:Shape ;
  rdfs:domain sh:Shape ; rdfs:range rdfs:Class ;
  sh:nodeKind sh:IRI .
shmm:directTypeShape a sh:Shape ; sh:scopeClass sh:Shape ;
  sh:propValues ( sh:directType sh:directType ) .
sh:directType a sh:ComponentTemplate ;
  sh:templateMessage "Does not have required direct type [argument]"@en ;
  sh:message "Classes need to be IRIs"@en ;
  sh:templateFilter """EXISTS { ?this rdf:type [argument] }""" .
Received on Thursday, 14 April 2016 23:57:59 UTC