Re: Pragmatic Proposal for the Structure of the SHACL Spec from Arthur Ryman on 2015-03-19 (public-data-shapes-wg@w3.org from March 2015)

From: Arthur Ryman <arthur.ryman@gmail.com>
Date: Thu, 19 Mar 2015 12:05:06 -0400
To: Holger Knublauch <holger@topquadrant.com>
Cc: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
Message-ID: <CAApBiOkAn41rfgUoX0nmwo3-KGCGgb3pEKvigvN3aQPNACV=HQ@mail.gmail.com>
Holger,

I've read the March 19 version. Let me answer your second question first.

You asked:

>> how could anyone define Part 2 (templates and SPARQL) without also covering Part 3 (Which is supposed to be Section 11 "Supported Operations")?

I am proposing that we divide the content based on the target audience.

Part 2 is for people who want to write templates (and functions and
whatever else we allow to be added) by writing code in some execution
language such as SPARQL or JavaScript. The language in Part 2 should
be natural language that is precise enough so that template authors
can write templates. We should define the language binding for SPARQL
and give normative SPARQL definitions of the constraints introduced in
Part 1.

Part 3 is for people who want to implement SHACL engines. The language
in Part 3 should be in some precise formalism that allows us to
specify the control behavior. This is independent of SPARQL.

You asked:

>> Where does it not follow the division that you propose? What would you do differently?

First let me say that your organization is close to what I am
proposing. Sections 1-6 are mainly Part 1 content. Sections 7-10, 12,
13, and A-C are Part 2. Section 11 is Part 3, although it needs to
be formalized more precisely.

However, there is some intermingling of content. I'd characterize your
approach as saying that SHACL is a template engine and we've defined a
set of core templates. However, the fact that the SHACL core
constraints are implemented as SPARQL templates is an implementation
detail from the point of view of the Part 1 audience. I'd therefore
remove mention of templates and SPARQL from Part 1. Here is a list of
specific changes:

1.1 Overview and Terminology

- Defer discussions of templates to Part 2. The following text is an
implementation detail wrt Part 1 readers:
"The validation of each constraint is formalized with one or more
execution languages. This version of SHACL supports SPARQL as an
execution language, but other languages may be supported in future
versions or by other communities. Each constraint needs to be backed
by at least one executable body in SPARQL, and any alternative bodies
need to follow the same semantics as the SPARQL queries. Constraints
may either directly define such an executable body or point to a
template. Constraints that directly include an executable body are
called native constraints. A template serves as a parameterizable
macro that wraps an executable body. Constraints that rely on a
template are called template constraints. The SHACL vocabulary
includes a small library of such templates for common constraint
patterns, but anyone can add their own template libraries. Templates
can be grouped into so-called profiles. Some SHACL engines may decide
to only support certain profiles and implement them differently than
the provided (SPARQL) bodies."

- Defer the following to Part 3:
"One of the operations that SHACL engines should support verifies that
a given RDF node matches a given shape. This operation can be invoked
based on any control logic, i.e. applications can pick their own
mapping between RDF nodes and their shapes. SHACL also provides two
mapping mechanisms based on the RDF triples in the graph being
validated. Current proposals for these mechanisms include selection
based on sh:nodeShape and rdf:type triples. Based on such in-graph
mappings, SHACL supports constraint validation over a complete graph."

3.3 Shape Constraints

- Defer the entirety of section 3.3 to Part 2.

4.1 Property Constraints

- Defer the following to Part 2:
"Technically, sh:PropertyConstraint is also a Template (introduced in
a later section). However, for the purpose of this section, it
suffices to understand that each property constraint can have one or
more facet properties such as sh:minCount that have a pre-defined
meaning to a SHACL processor."

- Remove the "Template" column from both tables.

4.1.4 sh:nodeType

- Remove the "SPARQL Expression" column from the table.

-- Arthur


On Wed, Mar 18, 2015 at 11:46 PM, Holger Knublauch
<holger@topquadrant.com> wrote:
> Arthur,
>
> did you have a chance to skim through the current spec draft? Where does it
> not follow the division that you propose? What would you do differently?
>
> You also did not address my question: how could anyone define Part 2
> (templates and SPARQL) without also covering Part 3 (Which is supposed to be
> Section 11 "Supported Operations")?
>
> Thanks,
> Holger
>
>
>
> On 3/19/2015 12:54, Arthur Ryman wrote:
>>
>> Holger,
>>
>> I understand that this division of content is not what you have been
>> advocating. I disagree that specs are only for implementers. I
>> frequently consult the SPARQL spec when I am writing queries, but I
>> have never implemented a SPARQL engine. Who among us has implemented a
>> SPARQL engine? Specs are normative and precise. Primers focus on
>> progressive disclosure and are pedagogical.
>>
>> Part 1 is not a primer. It covers the part of SHACL that most users of
>> SHACL need to understand (approximately the same level of material
>> covered by OSLC Shapes). That part of SHACL can be used without
>> requiring any knowledge of SPARQL. That part of SHACL could be
>> implemented in other technologies.
>>
>> Part 2 defines the extension mechanism and uses SPARQL to define the
>> precise meaning of constraints expressible in Part 1. Only people who
>> understand SPARQL would benefit from reading it.
>>
>> Part 3 defines the semantics more abstractly and precisely defines the
>> aspects of Part 2 that cannot be expressed directly in SPARQL, e.g.
>> the higher level control structures. In theory, an implementer who did
>> not understand SPARQL would be able to understand SHACL by reading
>> Part 3.
>>
>> We definitely need a primer for Part 1. We probably need a primer for
>> some of Part 2. The audience for Part 3 is beyond the need for
>> primers.
>>
>> -- Arthur
>>
>> On Wed, Mar 18, 2015 at 6:43 PM, Holger Knublauch
>> <holger@topquadrant.com> wrote:
>>>
>>> Hi Arthur,
>>>
>>> I think your message is mixing up Primer and Spec material. The Spec is
>>> meant for implementers. Others can read Primers, examples, mailing lists,
>>> Stack Overflow etc. For now the WG should focus on the Spec.
>>>
>>> Having said this, I believe my current draft addresses your separation of
>>> Part 1 (sections 2-6) and Part 2 (sections 7 onwards) already. I do not
>>> see how to separate your Part 3 from Part 2 - as soon as you talk about
>>> templates and sh:sparql, you also need to have a precise definition of
>>> how these are implemented. This is covered by the Operations section of
>>> my draft.
>>>
>>> Holger
>>>
>>>
>>>
>>> On 3/19/2015 8:29, Arthur Ryman wrote:
>>>>
>>>> At present we are witnessing a burst of creative activity. It is great
>>>> to see such energy in a WG. However, the result is that we have too
>>>> many specs and I doubt that most members can give all these documents
>>>> adequate review. We need to focus our resources on fewer specs.
>>>>
>>>> There has also been extended discussion on the role of SPARQL, on
>>>> profiles of SHACL, and on who the target audience is. I'd like to
>>>> propose a pragmatic structure that will enable us to package the
>>>> material in a way that will address our audiences, enable us to divide
>>>> the work better, and create a sequence of deliverables.
>>>>
>>>> I propose three levels of content. I'll refer to these as Parts 1, 2,
>>>> and 3 in order to defer word-smithing of the titles.
>>>>
>>>> 1. SHACL Part 1. The audience is people who want to use a simple,
>>>> high-level language that can express common constraints. This document
>>>> should use precise, natural language to describe the semantics of
>>>> those common constraints that can be expressed using the high-level
>>>> vocabulary of SHACL. It should also include simple examples to
>>>> illustrate concepts. It should be readable by people who do not know
>>>> SPARQL. It should not refer to SPARQL. It should not define formal
>>>> semantics. It should be possible for this part of SHACL to be readily
>>>> implemented in SPARQL or JavaScript. We therefore need to limit the
>>>> expressive power of this part of SHACL.
>>>>
>>>> 2. SHACL Part 2. The audience is people who want to write custom
>>>> constraints using an executable language. This part defines the
>>>> template/macro mechanism. It also provides normative SPARQL
>>>> implementations of the high-level SHACL language introduced in Part 1.
>>>> This part should not contain other formal specifications. The SPARQL
>>>> implementations can be regarded as executable formal specifications.
>>>>
>>>> 3. SHACL Part 3. The audience is people who want to implement SHACL.
>>>> This part should contain a formal specification. We can defer the
>>>> choice of formalism. If we have multiple candidates and willing
>>>> authors, let's do a bake-off.
>>>>
>>>> -- Arthur
>>>>
>>>
>
>
Received on Thursday, 19 March 2015 16:05:34 UTC