- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Fri, 11 Mar 2016 09:41:01 -0800
- To: Irene Polikoff <irene@topquadrant.com>, Holger Knublauch <holger@topquadrant.com>, "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
At some point there will have to be a reference implementation, I agree.

peter

On 03/10/2016 04:37 PM, Irene Polikoff wrote:
> I agree that many of these are implementation issues, but then having the implementation is very important - it shows that the proposal is indeed viable; otherwise, it is all a bit hypothetical and hearsay. Invariably, implementation work uncovers issues (some smaller, some larger) that often lead to revisions of the proposal. Such incremental revisions tend to add complexity, and what looked clean and streamlined in the beginning often starts to become considerably more convoluted.
>
> Peter, are you planning to create a reference implementation for this to actually prove the viability of your proposal?
>
> Irene
>
> On 3/10/16, 6:54 PM, "Peter F. Patel-Schneider" <pfpschneider@gmail.com> wrote:
>
>> Here are responses to some of the points that Holger makes.
>>
>> peter
>>
>> 2/ The current SHACL syntax does not nicely handle some common examples.
>>
>> Consider a shape limiting a person's guru to be both a person and a preacher. The simplest current way of doing this is something like
>>   ex:foo a sh:Shape ;
>>     sh:property [ sh:predicate ex:guru ;
>>                   sh:class ex:Person ] ;
>>     sh:property [ sh:predicate ex:guru ;
>>                   sh:class ex:Preacher ] .
>> In my proposal this would be
>>   ex:foo a sh:Shape ;
>>     sh:property ( ex:guru [ sh:class ex:Person ; sh:class ex:Preacher ] ) .
>> The current syntax results in shapes that are harder to analyze by tools.
>>
>> Consider a shape limiting the form of an SSN. Right now this requires something like
>>   ex:foo a sh:Shape ;
>>     sh:property [ sh:predicate ex:ssn ;
>>                   sh:pattern "[0-9]*" ] .
>> My proposal is very similar
>>   ex:foo a sh:Shape ;
>>     sh:property ( ex:ssn [ sh:pattern "[0-9]*" ] ) .
>> However, figuring out what is going on in the current syntax also requires looking for the flags property, again not so simple for tools.
>>
>> 5/ Merging constraints and shapes does not limit the places where severity and other information can be attached.
>>
>> 9,10/ I agree that paths add a lot of complication, both for implementing constraints and for other tools. I added them to see how complex they would be. The proposal does not depend on paths. I will indicate where the changes would be.
>>
>> 11/ Even though RDF requires that subjects of triples are not literals, there is no reason to forbid literal-only constructs in places where literals cannot appear. For conforming RDF graphs these will always be false, but for extended RDF graphs they will do the right thing. SPARQL itself permits literals as the subject of triple patterns so that it works well with extended RDF graphs.
>>
>> 12/ If a construct like sh:minCount needs to know whether it is in the object of a sh:property or an sh:inverseProperty, then that is problematic in the current situation. How is it to know?
>>
>> 13/ In sh:fillers ( ex:property [ sh:minCount 1 ] ), the sh:minCount does work alone, without knowledge about its context. All that it is saying is that there is at least one of whatever. If an implementation needs to know about the context, then that is something to be fixed or worked around.
>>
>> 17/ Optional parameters complicate matters no matter how they are set up. Putting the parameters together requires unpacking them. Having two properties requires finding them both.
>>
>> 19,20/ These appear to be implementation issues.
>>
>> 22/ I believe that Arthur was against using metaclasses in the metamodel, i.e., against a template being a metaclass and a constraint being a class. In this proposal templates are classes, not metaclasses. Templates are also properties, so the IRI of the template can be directly used in shapes. This has the added benefit of tying the template property to the template. In the current setup, separate bits of the template are instead required to state which properties carry its meaning. That is a problem if two templates use the same properties: which template is to be used then?
>>
>> 23/ I do not limit Functions to a single argument. The information passed in is the list of the arguments, which can be split up in the SPARQL code.
>>
>> 25/ I am not using list positions to "encode logic". I am using list positions for syntax, so as to make the syntax more compact. Even most logics use list positions in their syntax. If compactness is not a desirable feature, then changing to an object-like syntax is simple.
>>
>> 26/ Conceptually, an expression like sh:minCount does not need to work on a focus node + path combination. All that it really needs to know is the fillers of the path, so that it can count them.
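Points 13 and 26 amount to saying that the counting logic itself is context-free: the component only counts whatever fillers it is handed, and the path or property is substituted in from the shape. A rough illustrative sketch (not taken from either proposal) of what a generated minCount-1 check might look like once ex:guru has been substituted in, assuming SHACL-style pre-binding of $this to each focus node and a hypothetical ex: namespace:

  PREFIX ex: <http://example.org/ns#>

  # Report each focus node that has no ex:guru filler at all,
  # i.e. violates minCount 1.
  SELECT $this
  WHERE {
    FILTER NOT EXISTS { $this ex:guru ?filler }
  }

Only the ex:guru triple pattern comes from the shape; the surrounding existence check (or, for a larger N, a GROUP BY with COUNT) is the same no matter which property produced the fillers.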
>>
>> On 03/10/2016 03:10 AM, Holger Knublauch wrote:
>>> I took a reasonably in-depth look at
>>>
>>> https://www.w3.org/2014/data-shapes/wiki/ISSUE-95:_Metamodel_simplifications#Proposal_4
>>>
>>> and below is my feedback.
>>>
>>> Summary: I don't regard anything in this proposal as an improvement over proposal 3. IMHO it presents a massive step backwards for both users of the core language and the advanced features. If there are ideas worth harvesting, then these should be raised and examined individually. I support re-opening ISSUE-41, as suggested by Simon, for the paths topic, and generalizing sh:and/or/not so that they can directly point at sh:Constraints instead of just shapes.
>>>
>>> HTH
>>> Holger
>>>
>>> General Problems
>>>
>>> 1) Proposal 4 is poorly motivated. As Peter stated himself, he started this effort to simplify the metamodel. He made changes to the end-user-visible syntax in order to "simplify" the metamodel. However, there was no problem with the end-user-visible syntax to begin with. There was no need to change it, and the new syntax is a step backwards. The metamodel is far less important than the user-facing syntax.
>>>
>>> 2) The syntax changes seem to reflect Peter's world view that SHACL should only be a constraint checking language, not something used to describe data or even serve as "a modeling language". The syntax changes have made the model less predictable, and harder to use by algorithms such as form builders, without adding expressivity for constraint checking.
>>>
>>> 3) There is no experience with this syntax. We need to redo all evaluation, repeat experiments, and even revisit every single already-closed ISSUE to check whether it is still valid under the new approach. External observers of SHACL will be upset that we made such changes so relatively late in the process. Such a drastic change will set us back by months. We'll likely need another face-to-face meeting. The arguments to justify all this are extremely weak. Meanwhile we will be losing a lot of time just debating something that I consider a non-starter. It would be much more productive to look at some key aspects where Peter believes we could do better and work on incremental improvements, i.e. harvest some ideas that we agree on, instead of creating a completely new language.
>>>
>>> On merging Shapes and Constraints
>>>
>>> 4) There is nothing conceptually difficult about the current metamodel, and there was no need to change it. Shapes are a collection of constraints and define a scope. Constraints restrict the focus node, possibly following properties. That's basically it. Shapes are similar to class definitions and are intuitive to understand for most people. Merging these concepts blurs the lines, for no convincing reason. I expect that future use cases of Shapes will involve rules via a property such as shr:rule. Shapes serve as an entity to group focus nodes, and this role is independent of constraints.
>>>
>>> 5) If Shapes are constraints, then we are just repeating the same mistake we made by turning sh:closed into an attribute of the shape: we lose the ability to specify severity and other things. Basically, it becomes impossible (or arcane) to specify different (node) constraints with different severities. For this, constraints need to be objects attached to the shape. Alternatively you'd need shapes pointing at sub-shapes, but then you end up with different syntaxes for the same thing.
>>>
>>> 6) If the main motivation for linking shapes and constraints was syntactic sugar, then we could make plenty of other incremental changes, such as allowing the values of sh:and/sh:or to be sh:NodeConstraints, not just Shapes, or generalizing sh:valueShape into sh:valueConstraint, pointing at constraints directly.
>>>
>>> On property/inverseProperty vs generalized paths
>>>
>>> 7) Paths can already be handled (in a very controlled form) using sh:valueShape and derived values.
>>>
>>> 8) The syntax for inverse properties becomes very ugly and inconsistent with how forward properties are represented:
>>>
>>>   ex:MyShape
>>>     sh:fillers ( [ sh:inverse ex:parent ] [ sh:minCount 1 ] ) ;
>>>     sh:fillers ( ex:parent [ sh:minCount 1 ] ) .
>>>
>>> 9) Path expressions cause a lot of new complexity: computationally, syntactically, for SPARQL generation, etc.
>>>
>>> 10) Path expressions make static analysis (for things like form generation and structural checking of a shapes model) almost impossible. If an arbitrary path can show up where we previously only had simple predicates, then a lot of extra checking and branching needs to happen to make sense of the situation.
>>>
>>> 11) It is incorrect to claim that all constraint types can be used in combination with every path. For example, sh:minInclusive does not apply to inverse properties. The current metamodel and proposal 3 can express this using standard techniques (classes such as sh:InversePropertyConstraint), but Proposal 4 throws everything together and this ability is lost. As a result, tools cannot provide guidance about which values can actually be entered when.
>>>
>>> 12) Some constraint types require different SPARQL queries (or JavaScript or whatever) depending on the direction of a property (or, even worse, for an arbitrary path). For example, sh:minCount needs to count subjects versus objects. Proposal 4 does not even talk about this, and no example of SPARQL generation is given. Not all constraint types fit the simple allValuesFrom pattern implemented by NodeValidationFunctions.
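To make the direction issue in 12) concrete, here is a rough sketch (hypothetical queries, not taken from either proposal) of the two counting patterns such a check would need, again assuming SHACL-style pre-binding of $this and an example namespace ex::

  PREFIX ex: <http://example.org/ns#>

  # Forward property: count the objects reached from the focus node.
  SELECT $this (COUNT(?value) AS ?count)
  WHERE { $this ex:parent ?value }
  GROUP BY $this

  # Inverse property: count the subjects that point at the focus node.
  SELECT $this (COUNT(?value) AS ?count)
  WHERE { ?value ex:parent $this }
  GROUP BY $this

An implementation has to pick one of these, or parameterize over the direction; that gap is what point 12 is pointing at.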
>>>
>>> 13) In cases like sh:fillers ( ex:property [ sh:minCount 1 ] ), the "shape" with the minCount is no longer working stand-alone; it requires knowledge about its context (e.g. the specific path that was used) to work correctly. This is unclear and adds unnecessary complexity. Having objects that change their meaning depending on their parent resource is an unnecessary construct.
>>>
>>> On the constraint types limited to a single property only
>>>
>>> 14) This is a particularly poorly motivated change that goes backwards: in order to accommodate a "simplification" of the metamodel, the syntax was changed, and an unfounded claim that "multiple parameters are a poor syntax" is used to justify it. The example in ISSUE-133 is skewed to give the impression that a real problem exists:
>>>
>>>   [ a sh:PropertyConstraint ;
>>>     sh:pattern "http:*" ;
>>>     sh:predicate ex:httpURL ;
>>>     sh:datatype xs:string ;
>>>     sh:minCount 1 ;
>>>     sh:maxCount 1 ;
>>>     sh:flags "i" ]
>>>
>>> If your concern is readability of the source code, why would anybody put sh:pattern and sh:flags so far apart? This is ridiculous. Just write
>>>
>>>   [ a sh:PropertyConstraint ;
>>>     sh:pattern "http:*" ; sh:flags "i" ;
>>>     sh:predicate ex:httpURL ;
>>>     sh:datatype xs:string ;
>>>     sh:minCount 1 ;
>>>     sh:maxCount 1 ]
>>>
>>> and the problem is solved. If you are not editing the Turtle, then of course it is a matter of tool support, and any reasonable tool will of course group those parameters visually together. We even have sh:group and sh:order attributes for those purposes, and the ConstraintTypes bundle their parameters together in Proposal 3. The same information can (and will) be used by editing tools that write Turtle files.
>>>
>>> 15) With single-parameter constraint types, and the need to use reified objects or list parameters whenever you need to pass in multiple values instead, the sh:labelTemplate and sh:message templates become useless, as there is no general mechanism to access the nested parameter values. They just become random objects and lists.
>>>
>>> 16) If multiple parameters are needed, the problem of defining and using them is just shifted by one level. For example, proposal 3 has a uniform and integrated syntax to define parameters. If you just point at an object, then you need to talk (elsewhere) about the constraints on those objects. This is inconsistent, verbose, unmaintainable and not user-friendly at all.
>>>
>>> 17) There is no uniform syntax for parameters anymore. Some are just plain values, others are lists, others are objects. Consider the case of sh:pattern. In Proposal 4, the value of sh:pattern is either a string or a list where the first member is a string and the second another string with a different meaning. Imagine having to write code, editors or even a SPARQL query for that. You'll end up with complicated UNIONs and ORs everywhere just to handle the variations due to the metamodel "simplifications".
>>>
>>> 18) If you need parameter objects to pass in multiple logical parameters, then you basically *always* need access to the $shapesGraph. Peter was strongly against this for ages, and made a lot of noise about that. Now he has completely reversed his position, just to accommodate his "simplification", and even to make it possible at all.
>>>
>>> 19) If you need parameter objects to pass in multiple values, every SPARQL implementation of such a constraint type will first need to start with a block that retrieves all the real parameters nested in the object or list. Compare:
>>>
>>>   WHERE {
>>>     GRAPH $shapesGraph {
>>>       $myParam ex:value1 ?value1 .
>>>       OPTIONAL {
>>>         $myParam ex:value2 ?value2 .
>>>       }
>>>     }
>>>     $this $predicate ?object .
>>>     FILTER (doSomething(?object, ?value1) || (bound(?value2) && doSomethingElse(?object, ?value2)))
>>>   }
>>>
>>> versus the current syntax:
>>>
>>>   WHERE {
>>>     $this $predicate ?object .
>>>     FILTER (doSomething(?object, $value1) || (bound($value2) && doSomethingElse(?object, $value2)))
>>>   }
>>>
>>> 20) Related to point 19) above, you will have a combinatorial explosion of parameters if you have multiple OPTIONAL blocks. This will sometimes require nested SELECT DISTINCTs etc.
>>>
>>> 21) Proposal 4 separates the "shape" of a constraint type from its actual definition. This is verbose and harder to maintain. Proposal 3 handles this much more elegantly: the constraint type itself doubles as a shape, and sh:parameter is basically a property constraint (pending the choice of various options). No need for separate shapes.
>>>
>>> 22) sh:ComponentTemplate in Proposal 4 mixes rdf:Property and sh:Shape. One of the main points of criticism from Arthur (and others, I believe) was that my proposal used metaclasses. Here something very similar happens again.
>>>
>>> 23) Show stopper: Proposal 4 also limits Functions to just a single parameter, and claims that parameter objects can be passed into the function instead. This does not work, because it is not practically possible to manipulate the shapes graph prior to every function invocation. For example, ex:myFunction(2, 3) would become ex:myFunction(ex:args) where ex:args sh:arg1 2 ; sh:arg2 3 . This cannot work for cases such as ex:myFunction(2, ?value). Fixing this would cause an inconsistency in the way that functions and other parameterizables are defined. Proposal 3 handles all these consistently.
>>>
>>> Miscellaneous
>>>
>>> 24) The new syntax is not more user-friendly at all, e.g. the proximity of sh:fillers vs sh:filter. What is a "filler" anyway? The existing syntax from Proposal 3 is very similar to Resource Shapes and OWL (restrictions); both come with user experience, and there was no need to switch to something like sh:fillers.
>>>
>>> 25) Show stopper: Using list positions to encode logic is a very bad anti-pattern. The syntax
>>>
>>>   sh:fillers ( ex:myProperty [ sh:minCount 1 ] )
>>>
>>> may superficially look more compact, but it violates every established design pattern in both RDF and object-orientation. If something is a "path", then call it "path" in the data model. If something is a shape, then call it such, even if the Turtle becomes a bit longer:
>>>
>>>   sh:fillers [ sh:path ex:myProperty ; sh:shape [ sh:minCount 1 ] ] .
>>>
>>> Just for the sake of it, following this "design pattern" someone could model a Person record as an rdf:List:
>>>
>>>   ( "John"
>>>     "Doe"
>>>     "1971-07-07"^^xsd:date
>>>     ex:USA )
>>>
>>> Following your approach, if someone has multiple first names, you make a nested list:
>>>
>>>   ( ( "John" "Edward" )
>>>     "Doe"
>>>     "1971-07-07"^^xsd:date
>>>     ex:USA )
>>>
>>> The "beauty" of your syntax fades quickly if you ever use this in other formats such as JSON-LD:
>>>
>>>   [ [ "John", "Edward" ],
>>>     "Doe",
>>>     { "@value" : "1971-07-07", "@type" : "http://www.w3.org/2001/XMLSchema#date" },
>>>     { "@id" : "ex:USA" } ]
>>>
>>> The problem here is that lists don't allow you to create @contexts. A better JSON-LD syntax, using normal named properties instead of lists, would be:
>>>
>>>   { "firstNames": [ "John", "Edward" ],
>>>     "lastName" : "Doe",
>>>     "dob" : "1971-07-07",
>>>     "country" : "ex:USA" }
>>>
>>> So, creating an RDF vocabulary just so that it looks good in Turtle is a very bad idea. While the Person example above is for illustration purposes, the same issue arises for every sh:fillers scenario and will arise with custom extensions too.
>>>
>>> Needless to say, such rdf:Lists are almost impossible to use in SPARQL or any query-based approach.
>>>
>>> 26) The claim that a simple sh:sparqlTemplate per componentTemplate is sufficient is incorrect, because some templates need to operate on the results of path expressions (e.g. sh:class) while others need to look at the full focus node + path combination. There is no vocabulary to encode these differences that could be used by an implementation. It would require a novel text-insertion mechanism for things like "insert path here".
>>>
>>> 27) The SPARQL behind these templates cannot be reused in other SPARQL queries, unlike sh:NodeValidationFunctions.
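On the rdf:List remark under 25): even just reading positional members of a list-valued sh:fillers back out of a graph takes noticeable SPARQL machinery. A small illustrative sketch (hypothetical, not from either proposal) that extracts the first two members of each such list:

  PREFIX sh:  <http://www.w3.org/ns/shacl#>
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

  # First and second members of each list-valued sh:fillers.
  SELECT ?shape ?first ?second
  WHERE {
    ?shape sh:fillers ?list .
    ?list  rdf:first  ?first ;
           rdf:rest   ?rest .
    ?rest  rdf:first  ?second .
  }

Lists of varying length would instead need rdf:rest*/rdf:first property paths, at which point the positional information that the list syntax relies on is no longer directly available.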
Received on Friday, 11 March 2016 17:41:35 UTC