
Re: Implementation feasibility (was: Re: Pragmatic Proposal for the Structure of the SHACL Spec)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Fri, 20 Mar 2015 21:14:55 +0200
Message-ID: <CA+u4+a3vi-tyCDBLDE7thpY5QsnObbPyieXLTXVwpU5oA+B9hA@mail.gmail.com>
To: Jose Emilio Labra Gayo <jelabra@gmail.com>
Cc: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg <public-data-shapes-wg@w3.org>
On Fri, Mar 20, 2015 at 7:48 PM, Jose Emilio Labra Gayo <jelabra@gmail.com>
wrote:

> On Fri, Mar 20, 2015 at 5:54 PM, Dimitris Kontokostas <
> kontokostas@informatik.uni-leipzig.de> wrote:
>
>>
>> On Mar 20, 2015 10:13 AM, "Dimitris Kontokostas" <
>> kontokostas@informatik.uni-leipzig.de> wrote:
>> >
>> >
>> >
>> > On Thu, Mar 19, 2015 at 10:46 PM, Holger Knublauch <
>> holger@topquadrant.com> wrote:
>> >>
>> >> On 3/20/15 2:52 AM, Arnaud Le Hors wrote:
>> >>>
>> >>> I have to differ on the point that the organization of our documents
>> is up to the editors. This is very much a WG decision and there are actually
>> very practical consequences that we need to consider and may influence how
>> we want to organize our documents.
>> >>>
>> >>> One of them is that to become a Recommendation a specification needs
>> to successfully go through Candidate Recommendation and gather enough
>> implementation support (what exactly "enough" means is to be defined by the
>> WG but it's at least 2 implementations). If everything is in one document,
>> we need that whole document to get enough implementation support to
>> become a Recommendation. If we split the document in smaller pieces,
>> various pieces might move along at a different pace.
>> >>>
>> >>> For instance, we might be able to get the higher level set that
>> doesn't require support for SPARQL extensions as a REC sooner.
>> >>>
>> >>> This is also something to consider when thinking about the
>> dependencies between the different layers. With the bottom up approach, if
>> we end up with the higher level having a normative reference to the lower
>> level it won't be able to get to REC until the lower level is. This is also
>> true if the higher level is defined as a "profile" (i.e., a specification
>> that defines a subset of a full spec) of the main spec.
>> >>
>> >>
>> >> The argument about implementation support is not strong enough. As
>> stated, TopQuadrant will provide an open source implementation of the full
>> spec, tracking its progress, as soon as we decide to go to FPWD
>> state. In the IRC log, Dimitris has also announced that he will work on
>> another implementation. Finally, while I cannot speak for them, we have at
>> least one SPARQL database vendor in our group (I am not sure what happened
>> to Arthur Keen, I don't see him on the list anymore). This sounds like at
>> least 3 implementations from within the group already, and we still have
>> 1.5 years to go.
>>
>
> Holger,
>

It is Dimitris :)


> you only mention SPARQL-based implementations... this contradicts the
> assertion that it will be possible to have non-SPARQL-based
> implementations.
>

The point of this mail was that implementing the 'SPARQL extension part' is
not harder than implementing the 'core part' of the language. See inline
for some comments.


>
> At this moment, there are already some implementations that show that
> non-SPARQL based implementations of the core language are feasible.
>
>> >> Speaking personally and strictly in terms of implementation
>> feasibility, I think I could more easily implement the current 'SPARQL
>> extension mechanism' than the currently suggested 'core' language.
>>
>
> It depends on which parts you are planning to implement and which parts
> you are planning to leverage. If you are planning to implement something
> on top of SPARQL, as SPIN does, then it is clearly more feasible to follow
> the same road and implement another SPIN implementation.
>
> But what we are proposing is not to limit the implementations to be
> defined on top of SPARQL and to allow other implementations that can be
> done, for example, in plain javascript. In that case, maybe, you would
> consider it much more complicated to have to implement the full SPARQL
> engine just to define the core language.
>
> That's why I advocate having a core language rich enough to cover most
> of the validation use cases, plus an optional extensibility mechanism
> similar to ShEx semantic actions that calls an external processor (be it
> SPARQL, Javascript, or whatever appears in the future). The benefits are
> that the shapes defined in that high level language will be compatible
> between different implementations.
>
> If the users want the extra functionality that SPARQL provides, they can
> still use it, but they will know that in that case their shapes depend on
> those external processors.
>
>> >
>> > The current spec defines templates and functions, and RDFUnit already has
>> a templating mechanism. It is different and more low-level than the one
>> Holger suggests, but I could easily implement it, which leaves only
>> functions, which are of similar complexity.
>> >
>> > On the other hand, the core language has features that are harder to
>> implement (depending on the final details of course)
>> > 1) disjunction
>> > 2) recursive shapes
>> > 3) closed shapes
>>
> The problem here is more about identifying the desired semantics than
> about how to implement it.
>

The implementation complexity always depends on the defined semantics.
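To illustrate, here is what a recursive shape could look like as a Turtle sketch (sh:valueShape and sh:predicate are assumed property names, not agreed syntax):

```turtle
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ex: <http://example.org/>

# ex:PersonShape refers back to itself via ex:knows; how a validator
# must behave on cycles in the data depends entirely on the chosen
# semantics (fixpoint, well-founded, rejection of cycles, etc.).
ex:PersonShape a sh:Shape ;
  sh:property [
    sh:predicate ex:knows ;
    sh:valueShape ex:PersonShape
  ] .
```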

> >
>> > Looking at the ShEx implementations I also see *possible* trouble in
>> implementing other features in the 'core' language that are already
>> implemented in SPIN, RDFUnit & probably ICV
>> > 1) type validation (in general but also in terms of disjunction,
>> recursive & closed)
>>
> What do you mean by "type" validation? Do you refer to validation by
> "rdf:type" ?
>

yes

> > 2) Implementing the Constraint violation vocabulary
>>
> Why? I really don't see why it should be a problem...
>
>> > 2.1) Human readable messages
>>
> From my point of view, human-readable messages are difficult to
> incorporate in the Recommendation because they will depend on a lot of
> different factors. Of course, the easy solution is to let the SPARQL
> CONSTRUCT query generate any kind of data structure...but I think the
> point is to have some mechanism to ensure that the messages generated by
> the different SHACL processors are the same.
>

For the high-level language we could define in advance the exact displayed
messages so that all implementations display the exact same message. Would
this be acceptable?
e.g. for sh:minCount 2 -> cardinality of {property} is lower than 2
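In a SPARQL-based implementation, such a fixed message could be emitted straight from the CONSTRUCT query; a sketch (the sh: result vocabulary and the example IRIs are assumptions, not agreed syntax):

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ex: <http://example.org/>

# For each ex:Person with fewer than 2 ex:name values, emit a result
# carrying the pre-agreed message, identical across implementations.
CONSTRUCT {
  [] a sh:ValidationResult ;
     sh:focusNode ?this ;
     sh:message "cardinality of ex:name is lower than 2" .
}
WHERE {
  {
    SELECT ?this (COUNT(?value) AS ?count)
    WHERE {
      ?this a ex:Person .
      OPTIONAL { ?this ex:name ?value }
    }
    GROUP BY ?this
  }
  FILTER (?count < 2)
}
```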


> > 2.2) severity levels
>>
> I also don't see any problem with adding some metadata to the shapes to signal
> their severity level.
>

If we want to make SHACL useful, severity levels must be defined at the
property level.
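For example, per-property severity could look like this Turtle sketch (sh:severity, sh:Violation and sh:Warning are assumed names, not agreed vocabulary):

```turtle
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX ex: <http://example.org/>

ex:PersonShape a sh:Shape ;
  sh:property [
    sh:predicate ex:email ;
    sh:minCount 1 ;
    sh:severity sh:Violation   # a missing email invalidates the data
  ] ;
  sh:property [
    sh:predicate ex:homepage ;
    sh:minCount 1 ;
    sh:severity sh:Warning     # a missing homepage is only flagged
  ] .
```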

> > 3) macro functions (let's see what happens to this one)
>>
> I don't oppose adding a language construct that allows macro functions.
> To me, it looks like a feature that may complicate the language, but I
> would not oppose including it.
>
>> > 4) complex constraints
>>
> Complex constraints can be handled by an extension mechanism similar to
> the semantic actions of ShEx.
>

All of this comes back to my point: things are not easier to implement for
the core/high-level language.

> >
>> > In the current situation I would (personally) be more worried about
>> implementations for the core language than for the SPARQL extension
>> mechanism. Unless of course we remove more parts from 'core', but then we
>> would need a third profile.
>>
> Just to make it clear, I'm not suggesting a third profile. Just stating
>> that the 'core' profile is not necessarily easier to implement than the
>> 'SPARQL extension'.
>>
> I think that's because you are assuming that you are leveraging your
> implementation on top of SPARQL.
>
> But if you wanted to implement it from scratch, you would prefer to have a
> high-level language with a simple, self-contained semantics document
> instead of a set of SPARQL queries that require understanding the SPARQL
> spec.
>

I prefer not to get into this loop; each WG member has different priorities
/ use cases in mind, so I guess you can also accept the fact that some
people would prefer to build on standards.


> My point is that although the WG can produce the semantics of the
> high-level language in terms of SPARQL, it should also have another formal
> semantics of that high-level language independent from SPARQL.
>

I already said that I would like to have additional formal semantics for
the core language/profile and I re-quote my proposal:

> Personally I would be in favor of keeping everything in one place but I'm
not dogmatic about this. What I suggest is that for now we keep everything in
one document to keep it coherent and defer the splitting decision for later.

Best,
Dimitris


>
> Best regards, Jose Labra
>
>>
>> >
>> > To sum up and quoting Richard, before deciding to split documents or
>> not we should all work together & collaborate.
>> > Personally I would be in favor of keeping everything in one place but
>> I'm not dogmatic about this. What I suggest is that for now we keep everything
>> in one document to keep it coherent and defer the splitting decision for later.
>> >
>> > Best,
>> > Dimitris
>> >
>> >
>> >>
>> >> Holger
>> >>
>> >>
>> >>
>> >>> --
>> >>> Arnaud  Le Hors - Senior Technical Staff Member, Open Web
>> Technologies - IBM Software Group
>> >>>
>> >>>
>> >>> Arthur Ryman <arthur.ryman@gmail.com> wrote on 03/19/2015 09:15:39
>> AM:
>> >>>
>> >>> > From: Arthur Ryman <arthur.ryman@gmail.com>
>> >>> > To: Richard Cyganiak <richard@cyganiak.de>
>> >>> > Cc: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
>> >>> > Date: 03/19/2015 09:16 AM
>> >>> > Subject: Re: Pragmatic Proposal for the Structure of the SHACL Spec
>> >>> >
>> >>> > Richard,
>> >>> >
>> >>> > I am fine with these parts being in one document if that in fact
>> >>> > simplifies maintenance. That decision should be delegated to the
>> >>> > editors. However, I think we could stabilize Part 1 soon, publish
>> it,
>> >>> > show a heartbeat, and get some feedback.
>> >>> >
>> >>> > I hope we don't have warring editors. Putting a stake in the ground
>> by
>> >>> > publishing Part 1 might improve our shared understanding of SHACL.
>> >>> >
>> >>> > -- Arthur
>> >>> >
>> >>> > On Thu, Mar 19, 2015 at 8:02 AM, Richard Cyganiak <
>> richard@cyganiak.de> wrote:
>> >>> > > Arthur,
>> >>> > >
>> >>> > > I agree with your analysis regarding different audiences. I am
>> >>> > sympathetic to the notion that there should be three parts.
>> >>> > >
>> >>> > > However, I disagree with your conclusion that there should be
>> >>> > three documents.
>> >>> > >
>> >>> > > Keeping multiple documents in sync is a significant burden on a
>> >>> > working group. It makes sense if the documents describe loosely
>> >>> > coupled components of an overall framework, but that is not the case
>> >>> > here; your proposed split would leave material related to a single
>> >>> > language feature often distributed over three different documents
>> >>> > (and perhaps maintained by three warring editors).
>> >>> > >
>> >>> > > I don’t see why a single specification cannot adequately address
>> >>> > the needs of different target audiences.
>> >>> > >
>> >>> > > The SPARQL 1.1 spec is an example of a specification that delivers
>> >>> > a primer, a thorough language reference, precise semantics, and
>> >>> > guidance for implementers in a single document.
>> >>> > >
>> >>> > > Best,
>> >>> > > Richard
>> >>> > >
>> >>> > >
>> >>> > >
>> >>> > >> On 18 Mar 2015, at 22:29, Arthur Ryman <arthur.ryman@gmail.com>
>> wrote:
>> >>> > >>
>> >>> > >> At present we are witnessing a burst of creative activity. It is
>> great
>> >>> > >> to see such energy in a WG. However, the result is that we have
>> too
>> >>> > >> many specs and I doubt that most members can give all these
>> documents
>> >>> > >> adequate review. We need to focus our resources on fewer specs.
>> >>> > >>
>> >>> > >> There has also been extended discussion on the role of SPARQL, on
>> >>> > >> profiles of SHACL, and on who the target audience is. I'd like to
>> >>> > >> propose a pragmatic structure that will enable us to package the
>> >>> > >> material in a way that will address our audiences, enable us to
>> divide
>> >>> > >> the work better, and create a sequence of deliverables.
>> >>> > >>
>> >>> > >> I propose three levels of content. I'll refer to these as Parts
>> 1, 2,
>> >>> > >> and 3 in order to defer word-smithing of the titles.
>> >>> > >>
>> >>> > >> 1. SHACL Part 1. The audience is people who want to use a simple,
>> >>> > >> high-level language that can express common constraints. This
>> document
>> >>> > >> should use precise, natural language to describe the semantics of
>> >>> > >> those common constraints that can be expressed using the
>> high-level
>> >>> > >> vocabulary of SHACL. It should also include simple examples to
>> >>> > >> illustrate concepts. It should be readable by people who do not
>> know
>> >>> > >> SPARQL. It should not refer to SPARQL. It should not define
>> formal
>> >>> > >> semantics. It should be possible for this part of SHACL to be
>> readily
>> >>> > >> implemented in SPARQL or Javascript. We therefore need to limit
>> the
>> >>> > >> expressive power of this part of SHACL.
>> >>> > >>
>> >>> > >> 2. SHACL Part 2. The audience is people who want to write custom
>> >>> > >> constraints using an executable language. This part defines the
>> >>> > >> template/macro mechanism. It also provides normative SPARQL
>> >>> > >> implementations of the high-level SHACL language introduced in
>> Part 1.
>> >>> > >> This part should not contain other formal specifications. The
>> SPARQL
>> >>> > >> implementations can be regarded as executable formal
>> specifications.
>> >>> > >>
>> >>> > >> 3. SHACL Part 3. The audience is people who want to implement
>> SHACL.
>> >>> > >> This part should contain a formal specification. We can defer the
>> >>> > >> choice of formalism. If we have multiple candidates and willing
>> >>> > >> authors, let's do a bake-off.
>> >>> > >>
>> >>> > >> -- Arthur
>> >>> > >>
>> >>> > >
>> >>> >
>> >>
>> >>
>> >
>> >
>> >
>> > --
>> > Dimitris Kontokostas
>> > Department of Computer Science, University of Leipzig
>> > Research Group: http://aksw.org
>> > Homepage: http://aksw.org/DimitrisKontokostas
>>
>>
>
>
> --
> -- Jose Labra
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
Received on Friday, 20 March 2015 19:15:51 UTC
