Re: Implementation feasibility from Jose Emilio Labra Gayo on 2015-03-21 (public-data-shapes-wg@w3.org from March 2015)

From: Jose Emilio Labra Gayo <jelabra@gmail.com>
Date: Sat, 21 Mar 2015 06:16:01 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>, Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CAJadXX+o-ah-BKcHau=B8Dm=a77wc_bwiM1Pc2i4RV95YxpcoQ@mail.gmail.com>

>
> >
> >> There are several implementations for ShEx, which is a similar language
> >> to the one described there.
>
> ShEx has exclusive or, the core has inclusive or.  This is a significant
> difference.
>

It has already been said that the people behind ShEx are also members of
this WG and that we are open to adapting the language. In the case of "or",
I am in favor of having both language constructs: "sh:or" for inclusive
disjunction and "sh:oneOf" for exclusive or.
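
To make the difference between the two constructs concrete, here is a
minimal sketch in Python. The function names sh_or and sh_one_of are
hypothetical and only mirror the proposed construct names; each input is
the list of results obtained by checking a node against the alternative
shapes. This illustrates the intended semantics and is not a definition.

```python
from typing import Iterable


def sh_or(results: Iterable[bool]) -> bool:
    """Inclusive or: at least one alternative is satisfied."""
    return any(results)


def sh_one_of(results: Iterable[bool]) -> bool:
    """Exclusive or: exactly one alternative is satisfied."""
    return sum(1 for r in results if r) == 1


# The two constructs only disagree when more than one alternative holds.
both = [True, True]
print(sh_or(both), sh_one_of(both))  # prints "True False"
```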

What we are discussing at this point is the methodology for proceeding
with the spec. My proposal is precisely to identify the most interesting
language constructs that can be included in the SHACL high-level language.

At this point I would suggest being more inclusive than exclusive:
identify the most interesting language constructs and give them names
(something like the table you created and Eric's proposal) so that we
have a shared understanding of what those constructs are.

It will be important to know how those language constructs interact with
each other. We can then even discourage features whose interaction can
lead to very bad performance or to contradictory shapes. That is what
defining SHACL profiles would allow.


> How well do they work on large RDF graphs?
> >
> >
> >> It depends on what you call "large RDF graphs" and on what you call
> >> "work well".
>
> Let's say tens of millions of triples and validation times for a single
> shape roughly as fast as the equivalent SPARQL query would take.
>

I don't see why the approach to that problem would be any different
whether the SHACL spec is defined in terms of SPARQL or as a high-level
language independent of SPARQL.

On the contrary, if the WG promotes independent implementations of SHACL
that are based on those high-level language constructs and that target
some of the SHACL profiles, I am sure those implementations could be
optimized to perform better than implementations that have to support a
full SPARQL engine.

> >> The WG can promote the appearance of independent implementations which
> >> do not depend on SPARQL or it can prohibit them by saying that in order
> >> to implement SHACL one needs a SPARQL engine.
>
> But no one is saying that to implement the SHACL core one needs a SPARQL
> engine.


At this moment, there is no definition of what "SHACL core" is.

From my point of view, some people are only concerned with imposing a
particular implementation based on SPARQL, while I propose identifying a
high-level language with extensions that make it possible to define
complex constraints based on SPARQL.

> If you want a full SHACL that can be implemented without the
> equivalent of a SPARQL engine then you should be proposing alternative
> extension mechanisms.
>

I have already suggested them in other threads.

From my point of view, an extension mechanism similar to ShEx semantic
actions can be included in the SHACL high-level language.

The mechanism allows the inclusion of an action that has a language
identifier and some code. The language identifier can be SPARQL,
JavaScript, or whatever else, and if the SHACL validator supports that
external language processor, it calls the processor and passes it the
code. You can see some examples in [1].
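
The dispatch just described can be sketched roughly as follows. This is
only an illustration in Python, not the ShEx implementation: the names
SemanticAction, Validator, register and run_action are hypothetical, the
"regex" language is a toy stand-in for a real external processor, and
returning None for an unsupported language is an assumption, since the
behaviour in that case is not fixed here.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SemanticAction:
    language: str  # identifier such as "sparql" or "javascript"
    code: str      # source text handed to the processor unchanged


# A processor receives the action code and the value under validation.
Processor = Callable[[str, str], bool]


class Validator:
    def __init__(self) -> None:
        self.processors: Dict[str, Processor] = {}

    def register(self, language: str, processor: Processor) -> None:
        self.processors[language.lower()] = processor

    def run_action(self, action: SemanticAction, value: str) -> Optional[bool]:
        processor = self.processors.get(action.language.lower())
        if processor is None:
            # Assumption: an unsupported language means "not evaluated".
            return None
        return processor(action.code, value)


validator = Validator()
# Toy processor: the action code is a regular expression the value must match.
validator.register("regex", lambda code, value: re.fullmatch(code, value) is not None)

print(validator.run_action(SemanticAction("regex", r"\d{4}"), "2015"))
print(validator.run_action(SemanticAction("sparql", "ASK { ?s ?p ?o }"), "2015"))
```

The validator itself knows nothing about any particular language; support
for SPARQL or JavaScript would be added by registering another processor,
which is what keeps the high-level language independent of one engine.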

I think Eric didn't add it to his latest proposal because he was just
trying to be conservative and include only the most basic language
constructs. We have found that such a mechanism offers enough flexibility
to handle complex constraints without imposing a particular implementation.

From my perspective, Eric's proposal can be used as a first step towards
identifying the main high-level language constructs, but that doesn't mean
those constructs should be the only ones. That's why I think your table in
the wiki was also a good step towards identifying those language
constructs.

Best regards, Jose Labra

[1] Validating and Describing Linked Data Portals using RDF Shape
Expressions, Jose Emilio Labra Gayo, Eric Prud'hommeaux, Harold Solbrig,
1st Workshop on Linked Data Quality, Sept. 2014, Leipzig, Germany
PDF: http://labra.github.io/ShExcala/papers/ldq2014.pdf
Slides: http://www.slideshare.net/jelabra/linked-dataquality-2014

>
> >> Best regards, Jose Labra
>
> peter



-- 
-- Jose Labra

Received on Saturday, 21 March 2015 05:16:49 UTC