I think that the topic is raising a lot of interest and the activity could lead to very interesting results.

I will be very glad to see progress in the discussion and the work starting with so many different ideas.
I apologize for being out of the office until mid-August, but I will follow the progress of the discussion with much interest.

On 01/07/2014 12:31, Eric Prud'hommeaux wrote:
* john.walker <john.walker@semaku.com> [2014-07-01 11:23+0200]

This is my first message to this list, so I'll give a brief intro.

I'm John Walker and I work at a small (but growing :) company called Semaku [1].
We're working in the area of product information management and publication. We
make extensive use of RDF and related technologies for data integration and all
the other Linked Data goodness.
Welcome to the list; your healthy skepticism about yet-another-syntax
is very much appreciated!

I'm really interested in the topic of validation of RDF 'graph' data and very
happy this working group is tackling the subject.

Up to now much of our work has been on the data integration side, where
typically we take some XML data source and convert it to RDF. In such a case
normally we'd use XSD / DTD / Relax NG to validate the XML and have some
validated transformation that (hopefully) results in the desired RDF. However,
we often see data quality issues that are not necessarily caught by the
schema. Also, quality is often a somewhat gray area open to interpretation
rather than a hard pass/fail.

Additionally we see a shift towards using RDF as the data source whereby an
application is directly manipulating the native RDF, so there is no XML/SQL
schema for the data that can be used for validation. In this case something like
ShEx would be extremely useful.

As a n00b to the group I'd be interested to hear some more about the motivation
to introduce a(nother) new syntax for describing these schemas. In particular, why
not build on existing standards like SPIN and OWL to describe these rules, i.e.
what does ShEx offer that SPIN or OWL does not?

Personally I'm a little skeptical about introducing yet *another* syntax without very
good cause, as it raises the (already high) bar for adoption of RDF even higher.
I intended ShEx to be as human readable as possible for the use cases
in question so I take your challenge as a call to compare it to
equivalent expressions in SPIN/SPARQL and OWL. In preparation for a
workshop on RDF Validation [RVAL], I put together a simple use case to
make sure that folks were trying to solve the same class of problem

= ShExC =

The ShEx demo [DEMO] page has a slightly more detailed example from
[SOTA]. On the left is a (commented) schema in ShEx Compact Syntax
and on the right, some sample data.

= OWL =

Evren Sirin provided a detailed demonstration of how validation can be
done with a unique name assumption in a closed world. There was some
pushback that doing so changed the semantics of OWL so much that it
shouldn't be called OWL, but that doesn't detract from your point about
re-using syntax. You can compare the OWL input [STAR] to the top part of
the ShExC schema.


= SPARQL =
The ShEx demo also spits out equivalent SPARQL. You can click View as
<SPARQL query> to see the SPARQL that captures the same semantics. I
think you'll find it rather daunting to imagine using that as a
publication format.

= SPIN =

SPIN can add a *this* keyword to the above SPARQL, which would allow
you to break out the clauses from the SPARQL query produced above. I
haven't tested an example of this, but perhaps you could provide one
and we can see what semantics it covers with what syntax.

[RVAL] http://www.w3.org/2012/12/rdf-val/
[SOTA] http://www.w3.org/2012/12/rdf-val/SOTA
[DEMO] http://www.w3.org/2013/ShEx/FancyShExDemo?schemaURL=Examples/Issue-simple-annotated.shex&dataURL=test/Issue-pass-date.ttl&colorize=1
[STAR] http://www.w3.org/2012/12/rdf-val/SOTA#file-sota-constraints-ttl



[1] http://semaku.com


dott. Oreste Signore
cell. +39-348-3962627
Skype: orestesignore
home page: http://www.weblab.isti.cnr.it/people/oreste/