Re: ISSUE-140: Suggestion to close from Karen Coyle on 2016-11-01 (public-data-shapes-wg@w3.org from November 2016)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Mon, 31 Oct 2016 19:07:52 -0700
To: public-data-shapes-wg@w3.org
Message-ID: <e0f419d7-cdb2-f9b7-99ee-1890caf7a948@kcoyle.net>
On 10/31/16 4:26 PM, Holger Knublauch wrote:
>
>
> On 31/10/2016 23:46, Dimitris Kontokostas wrote:
>> I also feel that the core of Eric's issue is solved but if we go with
>> this PR as is, we move from one edge to another.
>> However, I also agree that targetNode can be an antipattern that we do
>> not want to promote so much with the spec
>> (binding shapes with specific nodes in advance)
>
> Yes, the main use case of sh:targetNode seems to be in test cases,
> examples and in scenarios where shapes graph = data graph. I believe
> sh:targetClass will be (by far) the most commonly used target type, and
> I'd prefer to switch to that wherever possible. Most data instances also
> have rdf:type triples, so it will be natural for people coming from
> RDFS/OWL to understand this.

I disagree that "most data instances will have rdf:type triples", or at
least that those triples, where they exist, will be useful. I've just
munged around in some big datasets whose types are not ones that I would
use for my purposes, nor would they be useful to me in validating.
Because RDF allows data to be defined with more than one set of type
declarations, types are the least predictable aspect of RDF data UNLESS
you are using data whose input you controlled. My point is that the
controlled data will be relatively easy to validate, but that my systems
need to be able to process data from literally thousands of sources
whose quality is not guaranteed. So leaning heavily on class
declarations for non-enterprise systems does not seem feasible to me. I
would like the document to give at least equal weight to solutions that
make use of rdf:type and those that do not.
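
To make the difference concrete, here is a minimal sketch in plain
Python. It is not taken from the spec or from any SHACL engine: the data,
the ex: names and both helper functions are hypothetical, and subclass
inference is ignored. It only contrasts selecting focus nodes by class
(in the spirit of sh:targetClass) with selecting them by a property they
carry (in the spirit of a property-based target such as
sh:targetSubjectsOf).

```python
# Hypothetical illustration only; names and data are invented for this sketch.
RDF_TYPE = "rdf:type"

# A tiny data graph as a set of (subject, predicate, object) triples.
data_graph = {
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", RDF_TYPE, "schema:Thing"),  # typed, but not with the expected class
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),      # no rdf:type triple at all
}


def focus_nodes_by_class(graph, cls):
    """Select subjects that have an rdf:type triple pointing at cls."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}


def focus_nodes_by_subjects_of(graph, predicate):
    """Select subjects that have at least one triple with the given predicate."""
    return {s for (s, p, o) in graph if p == predicate}


by_class = focus_nodes_by_class(data_graph, "ex:Person")
by_property = focus_nodes_by_subjects_of(data_graph, "ex:name")

print(sorted(by_class))
print(sorted(by_property))
```

In this made-up graph the class-based selection reaches only the one
node that happens to carry the expected type, while the property-based
selection reaches all three named nodes, which is the situation I run
into with data from sources I do not control.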

>
>> Maybe we could try a mix of different target schemes as well as some
>> examples without targets.
>> But we should somehow differentiate the target-based validation with
>> the targetless/ShEx validation.
>
> My opinion remains that the examples should be self-contained. We do
> label these as "shapes graph" and "data graph". It should be possible to
> understand what is being validated by looking at the shapes graph. If we
> take out the targets then people will ask us, well why are those nodes
> being validated only, why not for example the objects or predicates of
> the data graph.
>
> By deleting the targets we *only* explain the targetless case. By
> keeping the targets we explain *both* cases.

The cases with targets are well explained in the section on targets. 
What examples now in section 4 would cause people confusion? I think 
it's easy to see what validates. A few examples with both targets and 
constraints should clarify any questions.

kc

>
> Holger
>
>
>>
>> On Mon, Oct 31, 2016 at 2:41 AM, Holger Knublauch
>> <holger@topquadrant.com> wrote:
>>
>>
>>
>>     On 31/10/2016 17:54, Eric Prud'hommeaux wrote:
>>
>>         * Holger Knublauch <holger@topquadrant.com> [2016-10-31 09:29+1000]
>>
>>             Thanks for your work on the results tables, Eric. I have
>>             seen your pull
>>             request but I disagree with deleting the sh:targetXY
>>             triples from the
>>             examples. These need to be restored IMHO.
>>
>>         I think this gets to the heart of the issue. In earlier
>>         discussions,
>>         several of us said that dedicating a schema to a specific
>>         dataset is
>>         an antipattern. targetNode is particularly problematic in that
>>         respect
>>         but even the rest of target* leave open questions. Most of your
>>         examples use targetClass which requires a specific type arc.
>>         If the
>>         data serves multiple purposes (e.g. an ex:SalesContact and an
>>         ex:User), you need discriminating type arcs for all the roles
>>         it may
>>         play.
>>
>>
>>     ISSUE-140 originally was about clarifying that *in addition to
>>     graph-based validation using targets* SHACL engines should support
>>     an interface to validate individual nodes by other means. Targets
>>     are part of SHACL. By leaving them out of the examples you may get
>>     closer to your (controversial) viewpoint, but it doesn't help to
>>     explain SHACL's graph-based mode of operation. The boxes are
>>     labeled "shapes graph" and "data graph", so it's fair to assume
>>     that these are meant to be consistently used as explained. We have
>>     various sections that explain how the targets are used. It's
>>     valuable to have consistency, and examples of targets have been
>>     requested multiple times.
>>
>>
>>         Is TopQuadrant's use case addressed by the target* section as it
>>         stands in my proposal?
>>
>>
>>     I don't think your branch has made changes to the target sections
>>     from the main branch? But yes, the current design addresses the
>>     use cases that we have for SHACL.
>>
>>     Holger
>>
>>
>>
>>
>>
>>             (See https://github.com/w3c/data-shapes/pull/22/files)
>>
>>             Holger
>>
>>
>>             On 26/10/2016 22:01, Eric Prud'hommeaux wrote:
>>
>>                 * Holger Knublauch <holger@topquadrant.com> [2016-10-07 10:59+1000]
>>
>>                     We are down to 14 open issues right now, and I am
>>                     keen on making further
>>                     progress. My take is the sooner we have the formal
>>                     list of open issues down,
>>                     the earlier we can focus on the informal issues
>>                     raised from the outside.
>>
>>                     ISSUE-140 was last discussed
>>
>>                     https://www.w3.org/2016/09/27-shapes-minutes.html#item08
>>
>>                     but I have to confess I did not quite understand
>>                     what problem Eric was
>>                     referring to. It seems that Eric was merely
>>                     pointing out that validation can
>>                     be defined independently from specific node
>>                     selection (i.e. target)
>>                     mechanisms. I of course agree with that. Could you
>>                     clarify?
>>
>>                 I've forked the spec and gone through about half of
>>                 the examples (up
>>                 to sh:and) and added tabular summaries:
>>
>>                 https://ericprud.github.io/data-shapes/shacl/
>>
>>                 I believe this helps readers and addresses this issue.
>>
>>
>>                     Ted seemed to request some more detail in the spec
>>                     about how the validation
>>                     of individual nodes is supposed to happen. We
>>                     already have one such
>>                     interface, the sh:hasShape function, which can be
>>                     invoked to trigger the
>>                     validation of a given node against a given shape.
>>                     We have no such interface
>>                     for the case in which only a node is given. But we
>>                     also don't formally
>>                     define how the validation is triggered in the
>>                     general, whole-graph case. We
>>                     could potentially add a function
>>                     sh:validateNode(?node) that validates the
>>                     given node against all shapes with matching
>>                     targets. But then people will
>>                     likely complain that we are adding yet another
>>                     SPARQL implementation
>>                     requirement. Alternatively, Ted, could you clarify
>>                     how else we can meet your
>>                     requirement?
>>
>>                     Thanks,
>>                     Holger
>>
>>
>>
>>
>>                     On 23/09/2016 10:11, Holger Knublauch wrote:
>>
>>                         I had raised
>>                         https://www.w3.org/2014/data-shapes/track/issues/140
>>                         myself,
>>                         primarily as a reminder that validation of
>>                         individual nodes should be
>>                         mentioned in the spec. I have meanwhile added
>>                         a sentence which IMHO
>>                         addresses this need.
>>
>>                         PROPOSAL: Close ISSUE-140 as addressed by
>>                         https://github.com/w3c/data-shapes/commit/2046305962be7cd47400e7a2b51cd2841dca398c
>>
>>                         Holger
>>
>>
>>
>>
>>
>>
>>
>> --
>> Dimitris Kontokostas
>> Department of Computer Science, University of Leipzig & DBpedia
>> Association
>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>> http://aligned-project.eu
>> Homepage: http://aksw.org/DimitrisKontokostas
>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Tuesday, 1 November 2016 02:08:27 UTC