Re: ISSUE-105: Proposal based on sh:prefix

sh:prefix seems like a reasonable compromise, but I have some concerns 
with this approach. First, it commingles semantic language content with 
syntax. sh:prefix introduces a SHACL RDF assertion for something that is 
typically handled syntactically, and that is represented differently in 
each serialization format. These seem like very different concerns.

Second, sh:prefix is biased towards extension languages that use XML-style 
namespaces and prefixes. There are other approaches to namespace 
management in other languages that might be candidates for a SHACL 
extension mechanism, depending especially on implementations.  These 
extension languages could use other, more semantically meaningful 
approaches like packages. Having SHACL do anything with these string 
literals seems dangerous and limiting.

Third, it's unclear what the scope of an sh:prefix declaration might be. 
Although it might often be the case that a number of prefixes apply to 
all string literals in a SHACL resource that represent SPARQL queries, 
they may not apply to all of them. Each query could need its own 
special prefixes based on the domains being queried. So this could lead to 
overrides, and the need to put prefix declarations in multiple places 
anyway.

Finally, I don't think SHACL should be too concerned about optimizing hand 
editing of a specific syntax such as Turtle. 

I don't know if these issues/risks are sufficient to motivate removing 
sh:prefix, but my conservative approach to design has always been "when in 
doubt, leave it out", especially if there's a straightforward solution that 
is specific to the embedded literal syntax and independent of the rest of 
SHACL.




Jim Amsden, Senior Technical Staff Member
OSLC and Linked Lifecycle Data
919-525-6575




From:   Holger Knublauch <holger@topquadrant.com>
To:     "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
Date:   04/21/2016 11:32 PM
Subject:        ISSUE-105: Proposal based on sh:prefix



As discussed today, I have worked on a proposal to use sh:prefix as a 
compromise between the various viewpoints. It can be found at the 
beginning of section 6.1:

http://w3c.github.io/data-shapes/shacl/#sparql-constraints-syntax

(Please don't interpret this as an attempt to smuggle something into the 
spec :) I have put this into the draft with the intention of making a 
readable proposal only. The section that I wrote has merely replaced an 
even more controversial passage based on the prefixes in the shapes 
graph. It is also clearly marked as unfinished.)

Reflecting on the discussion today, I was really surprised by the broad 
pushback against this sh:prefix property. From a practical viewpoint, it 
appears clear to me that people would not want to be forced either to 
spell out the full namespace IRI each time

         sh:sparql """
             SELECT $this ($this AS ?subject)
                    (<http://example.com/ns#germanLabel> AS ?predicate)
                    (?value AS ?object)
             WHERE {
                 $this <http://example.com/ns#germanLabel> ?value .
                 FILTER (!isLiteral(?value) ||
                         !langMatches(lang(?value), "de"))
             }
             """ ;

or to repeat the same PREFIX declaration in each sh:sparql:

         sh:sparql """
             PREFIX ex: <http://example.com/ns#>
             SELECT $this ($this AS ?subject)
                    (ex:germanLabel AS ?predicate)
                    (?value AS ?object)
             WHERE {
                 $this ex:germanLabel ?value .
                 FILTER (!isLiteral(?value) ||
                         !langMatches(lang(?value), "de"))
             }
             """ ;

Having to repeat declarations or spell out namespaces not only bloats 
the documents, but also makes them more error-prone and harder to 
maintain. Sure, visual tools could generate them, but this would not 
help those hand-editing Turtle files.

One counterargument today was that this would open the door to 
conflicts caused by prefix clashes. Sure, this is similar to the 
existing situation with Turtle and other formats. But:
1) These potential conflicts are easy to detect and would produce an 
invalid shapes graph.
2) Shapes graph authors can always shield themselves from conflicting 
sh:prefix statements by putting PREFIXes into their query strings.
3) In my ten years at TQ I have barely ever seen such conflicts.
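
To make points 1) and 2) concrete, here is a rough Python sketch of how a 
tool might treat shared prefix declarations. It is only an illustration 
under my own assumptions, not what the draft specifies: the declarations 
are modelled as plain (prefix, namespace) pairs rather than read from a 
shapes graph, the names collect_prefixes, with_prefixes and 
PrefixConflict are made up for this example, and local PREFIX lines are 
found with a simple regular expression instead of a real SPARQL parser 
(so a PREFIX-like string inside a literal or comment would fool it).

```python
import re

# Matches a SPARQL prologue declaration such as: PREFIX ex: <http://example.com/ns#>
# Simplification: a regex, not a SPARQL parser.
_LOCAL_PREFIX = re.compile(r"\bPREFIX\s+([A-Za-z][\w\-]*)?:\s*<[^>]*>", re.IGNORECASE)


class PrefixConflict(ValueError):
    """Raised when one prefix is declared with two different namespaces."""


def collect_prefixes(declarations):
    """Build a prefix -> namespace map from (prefix, namespace) pairs.

    Repeating the same pair is harmless; the same prefix with a
    different namespace is reported as a conflict (point 1).
    """
    prefixes = {}
    for prefix, namespace in declarations:
        existing = prefixes.setdefault(prefix, namespace)
        if existing != namespace:
            raise PrefixConflict(
                f"prefix {prefix!r} maps to both <{existing}> and <{namespace}>"
            )
    return prefixes


def with_prefixes(query, prefixes):
    """Prepend PREFIX lines for shared prefixes the query does not declare.

    A prefix declared inside the query string is left alone, so the
    query author's own declaration takes precedence (point 2).
    """
    local = {m.group(1) or "" for m in _LOCAL_PREFIX.finditer(query)}
    header = [
        f"PREFIX {p}: <{ns}>"
        for p, ns in sorted(prefixes.items())
        if p not in local
    ]
    return "\n".join(header + [query])
```

With shared = collect_prefixes([("ex", "http://example.com/ns#")]), a 
query that uses ex:germanLabel without declaring ex gets the PREFIX line 
prepended, while a query that carries its own PREFIX ex: line is passed 
through unchanged.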

The other argument I remember was that having fully parsable sh:sparql 
strings in the file would simplify copy and paste for testing. Again, 
nobody is forced to use sh:prefix triples, so you are free to follow the 
coding convention and workflow of your choice.

The other option, not doing anything, will almost certainly lead to a 
situation in which some tools will just use the prefixes from Turtle 
files and others won't, creating incompatible files.

I believe sh:prefix has little cost and is IMHO crucial for getting 
SHACL's SPARQL extension mechanism adopted. Given that we are exploring 
something new here (SPIN didn't use sh:sparql but a completely different 
RDF syntax for SPARQL), I propose to leave it in the spec for now and 
let user feedback decide. As a further step towards an acceptable 
compromise, I have added a statement that it is recommended to only use 
sh:prefix in closed and controlled environments or for well-established 
prefixes.

Thanks,
Holger

Received on Friday, 22 April 2016 13:49:44 UTC