- From: Paul Hermans <paul@proxml.be>
- Date: 15 Jul 2014 19:48:30 +0200
- To: Jerven Tjalling Bolleman <jerven.bolleman@isb-sib.ch>
- Cc: <public-rdf-shapes@w3.org>
- Message-Id: <D60CE67C-EDE8-4D46-B282-4280F1986D5A@proxml.be>
Jerven,

>the two widely adopted solutions
>in industry SPIN (SPARQL) and OWL closed worlds

Do you have facts on this?
How many users are we talking about?

Paul

Kind Regards,

Paul Hermans
-------------------------
OpenCube – Linked Open Statistical Data - http://opencube-project.eu/

> On Jul 15, 2014, at 5:35 PM, Jerven Tjalling Bolleman <jerven.bolleman@isb-sib.ch> wrote:
>
> Dear All,
>
> Let me apologize in advance for the rude tone of this e-mail.
>
> I am looking at the current work/direction of the work-group and am
> really worried.
>
> Issues
>
> First of all, you decided not to focus on the problem of validating data
> in RDF but on a solution called shapes. I think you need to go back and
> collect what validation should do first instead of what the solution
> looks like. Because I don't think ShEx/Shapes does enough.
>
> Secondly, I have the feeling that the work-group is confounding the issue
> of syntax and user interfaces as well as ignoring a lot of engineering
> effort out there in the world.
>
> Concerns
>
> My current concerns are mostly based on this document
> http://www.w3.org/2013/ShEx/Primer.
>
> Concern 1.
>
> First of all, it's yet another syntax with strange variations on turtle.
>
> <IssueShape> {
>     ex:state (ex:unassigned ex:assigned),
>     ex:reportedBy @<UserShape>,
>     ex:reportedOn xsd:dateTime,
>     ( ex:reproducedBy @<EmployeeShape>,
>       ex:reproducedOn xsd:dateTime )?,
>     ex:related @<IssueShape>*
> }
>
> Why the brackets and @ for some kind of pointers? Why not make nice and
> simple turtle and do this?
>
> :IssueShape
>     ex:state (ex:unassigned ex:assigned) ;
>     ex:reportedBy [ a :UserShape ] ;
>     ex:reportedOn xsd:dateTime ;
>     ( ex:reproducedBy :EmployeeShape ;
>       ex:reproducedOn xsd:dateTime )?,
>     ex:related :IssueShape
>
> OK, we still have the strange '?' and a collection with a meaning
> different from turtle; let me come to that.
>
> Now change that to
>
> :IssueShape
>     shex:oneOf ( [] ex:state ex:unassigned .
>                  [] ex:state ex:assigned ) ;
>     ex:reportedBy [ a :UserShape ] ;
>     ex:reportedOn xsd:dateTime ;
>     shex:eitherNoneOforAllOf [ ex:reproducedBy :EmployeeShape ;
>                                ex:reproducedOn xsd:dateTime ] ,
>     ex:related :IssueShape
>
> Now is that completely different in readability?
> No, it's not.
> Did you gain a lot of usability by yet another syntax?
> No, you didn't.
> Will you make life difficult for everyone using it because you have yet
> another syntax?
> Yes, you will.
> Did your syntax make life a lot easier for users?
> No, because it's yet another syntax to learn.
>
> Aside:
> Did you notice that your use of the question mark is not consistent
> with any other commonly used syntax, e.g. regex, globs, trinary logic,
> etc.? That is sure to lead to a lot of confusion.
>
> Concern 2.
>
> The second issue is that the work-group seems to have confounded
> user interface with constraint interchange. They have forgotten where
> all the engineering and much of the training effort has gone in the last
> few years. Why is SPARQL 1.1 not the majority of the solution? Why are
> you not building on OWL where it is needed?
>
> ShEx already shows that you can't solve the problems, because you are
> punting to other languages including SPARQL. Meaning that your users
> still need to use SPARQL anyway! A major issue IMHO.
>
> Concern 3.
>
> Shapes is not enough for real world data validation. I have worked for a
> while on Dutch healthcare systems and had to deal with the fact that
> data in the database could be incorrect and data that is provided might
> be correct, and we need to have humans in the loop to figure out the
> truth of it, e.g. two people with the same citizen service number (BSN)
> (due to typo or fraud). ShEx can tell us that we have an issue but it
> can't generate a work item.
>
> That is a thing that, for example, SPIN can do, because SPIN is not just
> a constraint language but also an inference language (e.g. I can infer
> that a manual data intervention is required given two people with the
> same BSN). OWL can do similar things.
>
> Concern 4.
>
> Because data and rules do not have the same syntax or model, it is
> difficult to write rules about your rules. Something that is trivial in
> SPIN and really helps rule maintenance, e.g. checking that all
> predicates mentioned in your rules are present in a limited set of
> ontologies is easy in SPIN. It's hard in ShEx because your model is not
> quite simple to translate to RDF.
>
> Concern 5.
>
> As you disregard SPARQL you disregard SERVICE calls. This means I can't
> easily have validation using data in multiple systems. By looking at
> data as files you process in isolation you have lost a lot of power, as
> well as an easy way to extend the capabilities of the system in standard
> compliant ways (e.g. using a SADI service to compute values needed in
> your validation on the fly).
>
> Conclusions.
>
> ShEx -> SPARQL is fine, places ShEx as a UI not as an interchange standard.
> ShEx -> is not powerful enough to do more than simple validation.
> ShEx -> Should not invent yet another syntax. ShEx should be modeled in
> RDF and use existing syntaxes.
> Workgroup -> you too quickly discarded the two widely adopted solutions
> in industry, SPIN (SPARQL) and OWL closed worlds, on the outcome of a
> single workshop.
> Workgroup -> you don't have a good goal document that states what
> validation needs to do.
>
> I hope you will seriously reconsider your chosen direction because it is
> breaking the first rule of a good standard -> depend on other existing
> standards.
>
> Regards,
> Jerven Bolleman
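
To make the duplicate-BSN case from Concern 3 concrete, here is a minimal sketch in plain Python. It is not SPIN, OWL, or ShEx, and it does not reflect how any of those tools work internally; it only illustrates the difference between flagging a problem and deriving a work item from it. The predicate name ex:bsn, the work item type ex:ManualReview, and the sample data are all hypothetical.

```python
from collections import defaultdict

# Hypothetical predicate name, used only for illustration.
BSN = "ex:bsn"


def duplicate_bsn_work_items(triples):
    """Return one work item per BSN that is attached to more than one person.

    `triples` is an iterable of (subject, predicate, object) tuples, a
    stand-in for RDF data. Each work item is a plain dict; in an RDF-based
    system it would be emitted as new triples instead.
    """
    people_by_bsn = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == BSN:
            people_by_bsn[obj].add(subject)

    # A validator would stop at "this BSN is not unique". Here we go one
    # step further and derive something a human can pick up and resolve.
    return [
        {"type": "ex:ManualReview", "bsn": bsn, "people": sorted(people)}
        for bsn, people in sorted(people_by_bsn.items())
        if len(people) > 1
    ]


# Hypothetical sample data.
triples = [
    ("ex:alice", BSN, "123456782"),
    ("ex:bob", BSN, "123456782"),  # same BSN as ex:alice: typo or fraud
    ("ex:carol", BSN, "987654321"),
]

work_items = duplicate_bsn_work_items(triples)
```

With the sample data above, work_items holds a single entry that names ex:alice and ex:bob as sharing one BSN, and ex:carol produces nothing.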
Received on Tuesday, 15 July 2014 17:49:02 UTC