Re: ShEx relation to SPIN/OWL from Jose Emilio Labra Gayo on 2014-07-03 (public-rdf-shapes@w3.org from July 2014)

From: Jose Emilio Labra Gayo <jelabra@gmail.com>
Date: Thu, 3 Jul 2014 11:07:00 +0200
To: "john.walker" <john.walker@semaku.com>
Cc: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>, John Snelson <John.Snelson@marklogic.com>
Message-ID: <CAJadXX+eeU86OFKGcrKoSy2xs6e0nKQ45Hr8j7oOe4X_21iuMQ@mail.gmail.com>
On Thu, Jul 3, 2014 at 9:18 AM, john.walker <john.walker@semaku.com> wrote:

>   Hi John,
>
>  I know many people who would consider SPARQL to be a declarative
> language, albeit not with the specific purpose of validation.
>  Even with a declarative validation language I would expect, in many
> real-world use cases, there is more than one way to skin a cat.
>

>  I'm not sure I understand your last point about an RDF based syntax, do
> you mean RDF/XML specifically here?
>

There is a proposal for an RDF representation of ShEx (which could be
Turtle, RDF/XML, etc.). You can see the proposal here:
https://www.w3.org/2001/sw/wiki/ShEx#SHEX.2FRDF_format
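
As a rough illustration of why an RDF representation is handy, here is a
minimal Python sketch that holds a shape as plain triples and looks up the
predicates it constrains. The ex: terms (ex:IssueShape, ex:property,
ex:predicate, ex:maxCount) and the helper names are invented for this example;
they are not the vocabulary from the wiki proposal.

```python
# Hypothetical triples describing a shape; the ex: terms are invented for
# illustration and are NOT the SHEX/RDF vocabulary from the wiki proposal.
SHAPE_TRIPLES = [
    ("ex:IssueShape", "rdf:type", "ex:Shape"),
    ("ex:IssueShape", "ex:property", "_:c1"),
    ("_:c1", "ex:predicate", "ex:reportedBy"),
    ("_:c1", "ex:maxCount", 1),
    ("ex:IssueShape", "ex:property", "_:c2"),
    ("_:c2", "ex:predicate", "ex:state"),
    ("_:c2", "ex:maxCount", 1),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def constrained_predicates(triples, shape):
    """List the predicates that a shape places constraints on."""
    result = []
    for constraint in objects(triples, shape, "ex:property"):
        result.extend(objects(triples, constraint, "ex:predicate"))
    return sorted(result)


print(constrained_predicates(SHAPE_TRIPLES, "ex:IssueShape"))
```

Because the shape is just data, the same lookup could equally be phrased as a
SPARQL query against a graph store holding these triples.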


>  Personally I think it is pretty cool to have an RDF representation of
> ShEx that could be serialized to any of the concrete RDF syntaxes.
>  Primarily for these reasons:
>  - ShEx could be stored in a graph store
>  - ShEx could be used to validate itself
>  - ShEx could be queried or constructed using SPARQL
>
>  As such it might be useful to think of ShEx language as an extension to
> Turtle (rather than 'inspired by'), similar to how TriG extends Turtle with
> named graphs.
>

I think the main inspirations for ShEx have been: RelaxNG, Turtle and
SPARQL...

Best regards, Jose Labra

>
>  Cheers,
>  John
>
> > On July 2, 2014 at 5:42 PM John Snelson <John.Snelson@marklogic.com> wrote:
> >
> >
> > There's a big difference between a declarative validation language like
> > ShEx and using a more general-purpose language like SPARQL (in SPIN) to
> > validate. By being declarative and stating the validation intent rather
> > than the validation method, the description is available to be used in
> > many different scenarios.
> >
> > As an example, I _could_ write my RDF validation code in Java running
> > against a triple store - but it would be useless in a number of other
> > contexts. Using a declarative validation language would also allow the
> > description to be used:
> >
> > 1) As a description of my RDF format.
> > 2) To perform streaming validation of RDF on the wire.
> > 3) To guide an efficient binary compression algorithm.
> > 4) To validate the RDF in an HTML document containing RDFa markup.
> >
> > There's great value in a declarative schema language like XML's Relax NG
> > over and above something like Schematron, even though Schematron is
> > strictly more expressive.
> >
> > However I do agree that a human readable syntax is vastly preferable to
> > an RDF based syntax, and drawing inspiration from the SPARQL/Turtle
> > syntax is the most obvious starting point for that.
> >
> > John
> >
> > On 02/07/14 15:55, Dimitris Kontokostas wrote:
> > > As discussed on & off the list OWL & SPARQL are sufficient for
> > > validation in a CWA.
> > > > The problem with OWL is the different semantics, so people have to
> > > > rewrite it - most of the time the same things - in another format /
> > > > language such as SPIN / ShEx / SPARQL.
> > >
> > > Some remarks:
> > >
> > > * Everything that includes writing RDF manually is not user friendly,
> > > > even the ShEx / RDF format, however, with the proper interface (e.g.
> > > > TopBraid Composer) the difference is negligible
> > > * I agree that the SPARQL example is misleading, for example RDFUnit
> > > generates automatically 43 different (SPARQL) test cases for this
> > > specific schema.
> > > o I also think this is the way to go for Shex implementations,
> > > huge SPARQL queries tend to fail / timeout in big graphs
> > > * Normally, the RDF you have already has an owl/rdfs schema thus, part
> > > of those declarations will be defined anyway
> > >
> > > > In most cases reusing existing OWL schemas for validation is enough,
> > > > e.g. foaf already defines the domains, ranges and datatypes for all
> > > > its properties
> > > > What we need in the end is tools that translate OWL to SPARQL - or to
> > > > something intermediate like ShEx or SPIN - to get half of the work
> > > > done
> > >
> > > > For all other cases we need SPARQL or something that translates to
> > > > SPARQL.
> > > With a proper interface, anything could do :) but if I had to write
> > > something by hand I'd choose the compact syntax.
> > > > However, the only problem with "things" that translate to SPARQL is
> > > > that they do not have the full SPARQL expressiveness; this applies to
> > > > all of ShEx, SPIN templates and RDFUnit patterns.
> > > Thus, there will always be a case where we'll have to write a manual
> > > SPARQL query.
> > >
> > > just my 2 cents,
> > >
> > > Best,
> > > Dimitris
> > >
> > >
> > >
> > >
> > > > On Wed, Jul 2, 2014 at 4:46 AM, Holger Knublauch
> > > > <holger@topquadrant.com> wrote:
> > >
> > > Hi Eric, John,
> > >
> > >
> > > On 7/1/2014 20:31, Eric Prud'hommeaux wrote:
> > >
> > >> I intended ShEx to be as human readable as possible for the use cases
> > >> in question so I take your challenge as a call to compare it to
> > >> equivalent expressions in SPIN/SPARQL and OWL.
> > >
> > > I am attaching a SPIN version of your challenge. The main motivation
> > > for doing this is to demonstrate that it is very well possible to
> > > create human-readable representations while having a maximum of
> > > expressivity (all of SPARQL) and being compatible with a language
> > > that many people already know.
> > >
> > > To start this off, here is a TopBraid Composer screen rendering of
> > > the spin:constraints defined for the class Issue:
> > >
> > >
> > >
> > > You can see that I am using one SPARQL query and three SPIN template
> > > calls.
> > >
> > >
> > > > http://composing-the-semantic-web.blogspot.com.au/2009/01/understanding-spin-templates.html
> > >
> > > The Turtle source code of such a template call looks like this:
> > >
> > > > :Issue
> > > >     spin:constraint [
> > > >         a spl:ObjectCountPropertyConstraint ;
> > > >         arg:maxCount 1 ;
> > > >         arg:property :reportedBy
> > > >     ] ;
> > >
> > > i.e. it is possible to express a constraint on the maximum
> > > cardinality of a property in just 4 triples (same number as an
> > > owl:Restriction would use).
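
To make concrete what such a maximum-cardinality constraint checks, here is a
minimal Python sketch over an in-memory list of triples. It is only an
illustration of the idea, not the SPARQL body that the SPL template actually
uses; the function name and the sample data (:issue1, :issue2, :alice, :bob)
are made up.

```python
from collections import defaultdict


def max_count_violations(triples, cls, prop, max_count):
    """Return instances of cls that have more than max_count values for prop."""
    instances = {s for s, p, o in triples if p == "rdf:type" and o == cls}
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop and s in instances:
            values[s].add(o)  # distinct values only, as in a set of triples
    return sorted(s for s, vals in values.items() if len(vals) > max_count)


# Hypothetical sample data, invented for this sketch.
DATA = [
    (":issue1", "rdf:type", ":Issue"),
    (":issue1", ":reportedBy", ":alice"),
    (":issue2", "rdf:type", ":Issue"),
    (":issue2", ":reportedBy", ":alice"),
    (":issue2", ":reportedBy", ":bob"),
]

print(max_count_violations(DATA, ":Issue", ":reportedBy", 1))
```

The three arguments after the data correspond to the class the constraint is
attached to, arg:property and arg:maxCount in the template call.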
> > >
> > > At execution time, these template calls are substituted by their
> > > SPARQL implementation (spin:body). Here is how the constraint
> > > template used above is defined:
> > >
> > >
> > >
> > > You can see that the template's body is doing the real work, and is
> > > perfectly reusable across many ontologies. Anyone can create and
> > > > publish their own templates in RDF. A good example of such a library
> > > > is
> > >
> > > http://semwebquality.org/mediawiki/index.php?title=SemWebQuality.org
> > >
> > > and TopBraid also includes their own libraries including the SPL
> > > namespace shown above and attached.
> > >
> > > Here is the constraint checking valid phone numbers:
> > >
> > >
> > >
> > > I copied this regex from the internet so I have no idea whether it
> > > is correct, but you get the idea. Internally, this gets executed
> > > using FILTER regex in SPARQL, but the user only needs to select the
> > > template and then fill in the required arguments (here: the property
> > > and the specific regex string).
> > >
> > > The tricky bit of your example is that it requires inferencing to
> > > run before it can find all violations. One inference that I have
> > > implemented here infers the rdf:type of a resource if it uses a
> > > property with an rdfs:domain:
> > >
> > > > CONSTRUCT {
> > > >     ?instance a ?domain .
> > > > }
> > > > WHERE {
> > > >     ?property rdfs:domain ?domain .
> > > >     ?instance ?property ?anyValue .
> > > >     FILTER NOT EXISTS {
> > > >         ?instance a ?anyType .
> > > >     } .
> > > > }
> > >
> > > There are many other ways of achieving the same result, e.g. using
> > > an out-of-the-box OWL or RDFS inference engine, but I wanted to make
> > > the example self-contained so this is represented as a SPIN rule.
> > >
> > > To run this yourself, you would need TopBraid Composer Free Edition
> > > 4.4.1 and replace the version of spl with the attached one because I
> > > made some changes for (the yet unpublished) 4.5 version. Then run
> > > SPIN inferences so that issue4 gets its rdf:type. Then press the
> > > Refresh and show problems button. Output should be:
> > >
> > >
> > >
> > > For this run above I actually made the tel: value invalid - the
> > > regex didn't complain about it so maybe it really is a valid URL. I
> > > skipped the complication of foaf:Agent which would probably require
> > > > another inference rule. Rest assured that it could be represented
> > > > with similar ease.
> > >
> > > > Also note that the example file contained an error: it used the date
> > > > November 31, which does not exist, so I have corrected that for this demo.
> > >
> > > I am sure it would be possible to fiddle with this example more to
> > > > highlight strengths and weaknesses, but my quick shot was just
> > > meant as an illustration.
> > >
> > >
> > >> = SPARQL =
> > >>
> > >> The ShEx demo also spits out equivalent SPARQL. You can click View as
> > >> <SPARQL query> to see the SPARQL that captures the same semantics. I
> > >> think you'll find it rather daunting to imagine using that as a
> > >> publication format.
> > >
> > > My personal take on your SPARQL example is that nobody would write
> > > such a query. For readability this should be split into multiple
> > > SPARQL queries. SPIN provides a "natural" framework for doing so, by
> > > introducing the concept of attaching rules and constraints to
> > > classes. You will find that the SPIN file looks much less scary than
> > > the SPARQL in your example.
> > >
> > >
> > >> = SPIN =
> > >>
> > >> Spin can add a *this* keyword to the above SPARQL, which would allow
> > >> you to break out the clauses from the SPARQL query produced above. I
> > >> haven't tested an example of this, but perhaps you could provide one
> > >> and we can see what semantics it covers with what syntax.
> > >
> > > Done. I am especially highlighting the importance of SPIN Templates.
> > > There is also a concept of SPIN Functions that allows anyone to
> > > define their own SPARQL functions that encapsulate reusable queries
> > > and produce easier-to-maintain rules and constraints. My arguments
> > > presented in
> > >
> > >
> > > > http://composing-the-semantic-web.blogspot.com.au/2010/04/where-owl-fails.html
> > >
> > > remain valid: it is quite possible to cover most of the
> > > functionality of OWL with SPIN templates, but templates also enable
> > > other medium advanced users to write their own language extensions.
> > > But not everyone will need to do that and they don't even need to
> > > know that SPARQL exists to use template-based SPIN constraints.
> > >
> > > Let me finish by saying that I believe it will be easy to change the
> > > requirements and your example challenge so that other frameworks
> > > than SPIN become severely disadvantaged. SPARQL is very expressive,
> > > so anything involving mathematical operations, string manipulation
> > > etc quickly reaches the limits of other languages.
> > >
> > > Happy to discuss further,
> > > Holger
> > >
> > >
> > >
> > >
> > > --
> > > Dimitris Kontokostas
> > > Department of Computer Science, University of Leipzig
> > > Research Group: http://aksw.org
> > > Homepage:http://aksw.org/DimitrisKontokostas
>



-- 
Saludos, Labra
Received on Thursday, 3 July 2014 09:07:51 UTC