Re: Implementation feasibility (was: Re: Pragmatic Proposal for the Structure of the SHACL Spec)

On Fri, Mar 20, 2015 at 5:54 PM, Dimitris Kontokostas <
kontokostas@informatik.uni-leipzig.de> wrote:

>
> On Mar 20, 2015 10:13 AM, "Dimitris Kontokostas" <
> kontokostas@informatik.uni-leipzig.de> wrote:
> >
> >
> >
> > On Thu, Mar 19, 2015 at 10:46 PM, Holger Knublauch <
> holger@topquadrant.com> wrote:
> >>
> >> On 3/20/15 2:52 AM, Arnaud Le Hors wrote:
> >>>
> >>> I have to differ on the point that the organization of our documents
> is up to the editors. This is very much a WG decision and there are actually
> very practical consequences that we need to consider and that may influence how
> we want to organize our documents.
> >>>
> >>> One of them is that to become a Recommendation a specification needs
> to successfully go through Candidate Recommendation and gather enough
> implementation support (what exactly "enough" means is to be defined by the
> WG but it's at least 2 implementations). If everything is in one document,
> we need the whole document to gather enough implementation support to
> become a Recommendation. If we split the document in smaller pieces,
> various pieces might move along at a different pace.
> >>>
> >>> For instance, we might be able to get the higher level set that
> doesn't require support for SPARQL extensions as a REC sooner.
> >>>
> >>> This is also something to consider when thinking about the
> dependencies between the different layers. With the bottom up approach, if
> we end up with the higher level having a normative reference to the lower
> level it won't be able to get to REC until the lower level is. This is also
> true if the higher level is defined as a "profile" (i.e., a specification
> that defines a subset of a full spec) of the main spec.
> >>
> >>
> >> The argument about implementation support is not strong enough. As
> stated TopQuadrant will provide an open source implementation of the full
> spec that is tracking the progress as soon as we decide to go to FPWD
> state. In the IRC log, Dimitris has also announced that he will work on
> another implementation. Finally, while I cannot speak for them, we have at
> least one SPARQL database vendor in our group (I am not sure what happened
> to Arthur Keen, I don't see him on the list anymore). This sounds like at
> least 3 implementations from within the group already, and we still have
> 1.5 years to go.
>

Holger, you only mention SPARQL-based implementations, which contradicts
the assertion that it will be possible to have non-SPARQL-based
implementations.

At this moment there are already some implementations showing that
non-SPARQL-based implementations of the core language are feasible.

> >> Speaking personally and strictly in terms of implementation
> feasibility, I think I could more easily implement the current 'SPARQL
> extension mechanism' than the currently suggested 'core' language.
>

It depends on which parts you are planning to implement and which parts
you are planning to leverage. If you are planning to implement something on
top of SPARQL, as SPIN does, then it is clearly more feasible to follow the
same road and build another SPIN-like implementation.

But what we are proposing is not to require implementations to be built on
top of SPARQL, and to allow other implementations that can be done, for
example, in plain JavaScript. In that case you might consider it much more
complicated to have to implement a full SPARQL engine just to support the
core language.

That's why I advocate having a core language rich enough to cover most of
the validation use cases, plus an optional extensibility mechanism, similar
to ShEx semantic actions, that calls an external processor (be it SPARQL,
JavaScript, or whatever appears in the future). The benefit is that shapes
defined in that high-level language will be portable across different
implementations.

If users want the extra functionality that SPARQL provides they can still
use it, but they will know that in that case their shapes depend on those
external processors.
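
To make it concrete (a sketch only; none of the terms below are agreed
SHACL syntax, they are placeholders), the same requirement could be written
with core constructs alone, or with a SPARQL-based extension that ties it
to processors supporting that extension:

    # Hypothetical sketch -- placeholder terms, not agreed SHACL syntax.
    @prefix ex: <http://example.org/> .
    @prefix sh: <http://www.w3.org/ns/shacl#> .

    # Core-only shape: any processor (SPARQL-based, JavaScript, ...) can
    # evaluate it by inspecting the data graph directly.
    ex:PersonShape
        a sh:Shape ;
        sh:property [
            sh:predicate ex:name ;
            sh:minCount 1 ;
        ] .

    # Shape using the SPARQL extension: only usable on processors that
    # implement that extension mechanism.
    ex:PersonSparqlShape
        a sh:Shape ;
        sh:constraint [
            # ?this is assumed to be pre-bound to each focus node.
            sh:sparql """SELECT ?this
                         WHERE { FILTER NOT EXISTS { ?this ex:name ?n } }""" ;
        ] .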

> >
> > The current spec defines templates and functions, and RDFUnit already has
> a templating mechanism. It is different and more low-level than the one
> Holger suggests but I could easily implement it, which leaves only
> functions, which are of similar complexity.
> >
> > On the other hand, the core language has features that are harder to
> implement (depending on the final details of course)
> > 1) disjunction
> > 2) recursive shapes
> > 3) closed shapes
>
The problem here is more about identifying the desired semantics than
about how to implement them.
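
For example (placeholder syntax again), a recursive shape is trivial to
write down, but its meaning still has to be pinned down before anyone can
implement it:

    # Hypothetical sketch -- placeholder terms.
    ex:PersonShape
        a sh:Shape ;
        sh:property [
            sh:predicate ex:knows ;
            sh:valueShape ex:PersonShape ;  # values must themselves match PersonShape
        ] .

Whether a cycle like ex:alice ex:knows ex:bob / ex:bob ex:knows ex:alice
conforms depends on which fixed-point (or other) semantics we pick; that
choice, not the coding, is the hard part.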

> >
> > Looking at the ShEx implementations I also see *possible* trouble in
> implementing other features in the 'core' language that are already
> implemented in SPIN, RDFUnit & probably ICV
> > 1) type validation (in general but also in terms of disjunction,
> recursive & closed)
>
What do you mean by "type" validation? Do you refer to validation by
"rdf:type"?

> > 2) Implementing the Constraint violation vocabulary
>
Why? I really don't see why it should be a problem...

> > 2.1) Human readable messages
>
From my point of view, human-readable messages are difficult to
incorporate into the Recommendation because they will depend on a lot of
different factors. Of course, the easy solution is to let a SPARQL
CONSTRUCT query generate any kind of data structure, but I think the point
is to have some mechanism to ensure that the messages generated by
different SHACL processors are the same.
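
For instance (a sketch; the violation vocabulary terms below are
placeholders), a SPARQL-based processor can emit a violation with a message
through a CONSTRUCT query, but nothing in that approach guarantees that two
processors produce the same text:

    # Hypothetical sketch -- placeholder terms.
    PREFIX sh: <http://www.w3.org/ns/shacl#>
    PREFIX ex: <http://example.org/>

    CONSTRUCT {
        _:v a sh:ConstraintViolation ;
            sh:root ?this ;
            sh:message "Person is missing a name" .
    }
    WHERE {
        ?this a ex:Person .
        FILTER NOT EXISTS { ?this ex:name ?name }
    }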


> > 2.2) severity levels
>
I also don't see any problem with adding some metadata to the shapes to
signal their severity level.
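
Something as simple as the following would be enough (placeholder property
names):

    # Hypothetical sketch -- placeholder terms.
    ex:PersonShape sh:property [
        sh:predicate ex:name ;
        sh:minCount 1 ;
        sh:severity sh:Warning ;  # instead of a default sh:Error
    ] .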


> > 3) macro functions (let's see what happens to this one)
>
I don't oppose adding a language construct that allows macro functions. To
me it looks like a feature that may complicate the language, but I would
not oppose including it.

> > 4) complex constraints
>
Complex constraints can be handled by an extension mechanism similar to the
semantic actions of ShEx.

> >
> > In the current situation I would (personally) be more worried about
> implementations for the core language than for the SPARQL extension
> mechanism. Unless of course we remove more parts from 'core', but then we
> would need a third profile.
>
> Just to make it clear, I'm not suggesting a third profile. Just stating
> that the 'core' profile is not necessarily easier to implement than the
> 'SPARQL extension'.
>
I think that's because you are assuming that you are building your
implementation on top of SPARQL.

But if you wanted to implement it from scratch, you would prefer a
high-level language with a simple, self-contained semantics document over a
set of SPARQL queries that you can only understand after understanding the
SPARQL spec.

My point is that although the WG can produce the semantics of the
high-level language in terms of SPARQL, it should also provide a formal
semantics of that high-level language that is independent of SPARQL.
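
As a concrete example of what I mean (all syntax hypothetical): a core
construct like a minimum cardinality can be published together with the
SPARQL query it maps to, while its normative definition stays SPARQL-free,
so a JavaScript processor can implement it by plain graph traversal:

    # Hypothetical sketch -- placeholder terms.
    # Core construct: every focus node must have at least one ex:name.
    ex:PersonShape sh:property [
        sh:predicate ex:name ;
        sh:minCount 1 ;
    ] .

    # Non-normative SPARQL reading (?this pre-bound to the focus node;
    # the query reports the violating nodes):
    #   SELECT ?this WHERE { FILTER NOT EXISTS { ?this ex:name ?v } }
    #
    # SPARQL-independent reading: count the triples (focusNode, ex:name, *)
    # in the data graph; the focus node violates the constraint when that
    # count is less than 1.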

Best regards, Jose Labra

>
> >
> > To sum up and quoting Richard, before deciding to split documents or not
> we should all work together & collaborate.
> > Personally I would be in favor of keeping everything in one place but
> I'm not dogmatic about this. What I suggest is for now to keep everything in
> one document to keep it coherent and defer the splitting decision for later.
> >
> > Best,
> > Dimitris
> >
> >
> >>
> >> Holger
> >>
> >>
> >>
> >>> --
> >>> Arnaud  Le Hors - Senior Technical Staff Member, Open Web Technologies
> - IBM Software Group
> >>>
> >>>
> >>> Arthur Ryman <arthur.ryman@gmail.com> wrote on 03/19/2015 09:15:39 AM:
> >>>
> >>> > From: Arthur Ryman <arthur.ryman@gmail.com>
> >>> > To: Richard Cyganiak <richard@cyganiak.de>
> >>> > Cc: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
> >>> > Date: 03/19/2015 09:16 AM
> >>> > Subject: Re: Pragmatic Proposal for the Structure of the SHACL Spec
> >>> >
> >>> > Richard,
> >>> >
> >>> > I am fine with these parts being in one document if that in fact
> >>> > simplifies maintenance. That decision should be delegated to the
> >>> > editors. However, I think we could stabilize Part 1 soon, publish it,
> >>> > show a heartbeat, and get some feedback.
> >>> >
> >>> > I hope we don't have warring editors. Putting a stake in the ground
> by
> >>> > publishing Part 1 might improve our shared understanding of SHACL.
> >>> >
> >>> > -- Arthur
> >>> >
> >>> > On Thu, Mar 19, 2015 at 8:02 AM, Richard Cyganiak <
> richard@cyganiak.de> wrote:
> >>> > > Arthur,
> >>> > >
> >>> > > I agree with your analysis regarding different audiences. I am
> >>> > sympathetic to the notion that there should be three parts.
> >>> > >
> >>> > > However, I disagree with your conclusion that there should be
> >>> > three documents.
> >>> > >
> >>> > > Keeping multiple documents in sync is a significant burden on a
> >>> > working group. It makes sense if the documents describe loosely
> >>> > coupled components of an overall framework, but that is not the case
> >>> > here; your proposed split would leave material related to a single
> >>> > language feature often distributed over three different documents
> >>> > (and perhaps maintained by three warring editors).
> >>> > >
> >>> > > I don’t see why a single specification cannot adequately address
> >>> > the needs of different target audiences.
> >>> > >
> >>> > > The SPARQL 1.1 spec is an example of a specification that delivers
> >>> > a primer, a thorough language reference, precise semantics, and
> >>> > guidance for implementers in a single document.
> >>> > >
> >>> > > Best,
> >>> > > Richard
> >>> > >
> >>> > >
> >>> > >
> >>> > >> On 18 Mar 2015, at 22:29, Arthur Ryman <arthur.ryman@gmail.com>
> wrote:
> >>> > >>
> >>> > >> At present we are witnessing a burst of creative activity. It is
> great
> >>> > >> to see such energy in a WG. However, the result is that we have
> too
> >>> > >> many specs and I doubt that most members can give all these
> documents
> >>> > >> adequate review. We need to focus our resources on fewer specs.
> >>> > >>
> >>> > >> There has also been extended discussion on the role of SPARQL, on
> >>> > >> profiles of SHACL, and on who the target audience is. I'd like to
> >>> > >> propose a pragmatic structure that will enable us to package the
> >>> > >> material in a way that will address our audiences, enable us to
> divide
> >>> > >> the work better, and create a sequence of deliverables.
> >>> > >>
> >>> > >> I propose three levels of content. I'll refer to these as Parts
> 1, 2,
> >>> > >> and 3 in order to defer word-smithing of the titles.
> >>> > >>
> >>> > >> 1. SHACL Part 1. The audience is people who want to use a simple,
> >>> > >> high-level language that can express common constraints. This
> document
> >>> > >> should use precise, natural language to describe the semantics of
> >>> > >> those common constraints that can be expressed using the
> high-level
> >>> > >> vocabulary of SHACL. It should also include simple examples to
> >>> > >> illustrate concepts. It should be readable by people who do not
> know
> >>> > >> SPARQL. It should not refer to SPARQL. It should not define formal
> >>> > >> semantics. It should be possible for this part of SHACL to be
> readily
> >>> > >> implemented in SPARQL or Javascript. We therefore need to limit
> the
> >>> > >> expressive power of this part of SHACL.
> >>> > >>
> >>> > >> 2. SHACL Part 2. The audience is people who want to write custom
> >>> > >> constraints using an executable language. This part defines the
> >>> > >> template/macro mechanism. It also provides normative SPARQL
> >>> > >> implementations of the high-level SHACL language introduced in
> Part 1.
> >>> > >> This part should not contain other formal specifications. The
> SPARQL
> >>> > >> implementations can be regarded as executable formal
> specifications.
> >>> > >>
> >>> > >> 3. SHACL Part 3. The audience is people who want to implement
> SHACL.
> >>> > >> This part should contain a formal specification. We can defer the
> >>> > >> choice of formalism. If we have multiple candidates and willing
> >>> > >> authors, let's do a bake-off.
> >>> > >>
> >>> > >> -- Arthur
> >>> > >>
> >>> > >
> >>> >
> >>
> >>
> >
> >
> >
> > --
> > Dimitris Kontokostas
> > Department of Computer Science, University of Leipzig
> > Research Group: http://aksw.org
> > Homepage:http://aksw.org/DimitrisKontokostas
>
>


-- 
-- Jose Labra

Received on Friday, 20 March 2015 17:48:51 UTC