- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Wed, 19 Nov 2014 11:05:19 +0200
- To: "Eric Prud'hommeaux" <eric@w3.org>
- Cc: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg <public-data-shapes-wg@w3.org>
- Message-ID: <CA+u4+a2x2TThRv0bxM=WYaUMHVGi2ui3buc+eiyo8Oh5Qkdp3w@mail.gmail.com>
On Wed, Nov 19, 2014 at 1:47 AM, Eric Prud'hommeaux <eric@w3.org> wrote:

> * Holger Knublauch <holger@topquadrant.com> [2014-11-06 09:38+1000]
> > I think it's encouraging to read suggestions on how we could merge
> > ideas from the various proposals, e.g. extend SPIN to make the
> > scenario below easier to represent. This is always a possibility.
> >
> > Thanks for providing a specific example, which makes our discussion
> > more focused. I do believe that the example below can be expressed
> > with the existing SPIN spec via something like
> >
> > :Issue
> >     spin:constraint [
> >         a sp:Ask ;
> >         sp:text """
> >             # The assignee must have an mbox
> >             ASK {
> >                 ?this :assignedTo ?assignee .
> >                 FILTER NOT EXISTS { ?assignee foaf:mbox ?any }
>
> multiplied out for cardinality over :submittedBy:{given,family},
> status=unassigned | (status=assigned &&
> assignedTo/{givenName,familyName,mbox}), etc. gave me the 107 lines of
> SPARQL at the bottom of this message.
>

I'd argue against that. Since we define multiple Shapes, why do we have to generate a single huge SPARQL query? In RDFUnit we take a different approach: instead of a single SPARQL query, we decompose the RS constraints into multiple smaller ones. We call each of them a TestCase. Unlike in SPIN, a TestCase definition is not itself a valid SPARQL query, but it is translated into one on execution. See below for details.

> >             }
> >             """
> >     ]
> >
> > Basically this associates the constraint to the starting point and
> > uses a path to walk into the Person.
> >
> > The question then becomes how acceptable is that solution compared
> > to having to introduce a special mechanism and change the whole
> > execution engine, introduce the notion of starting nodes etc. I
> > would say the case you describe is quite rare and therefore people
> > should be able to live with the little inconvenience.
>
> Any time you see a restriction in OWL you have an example of a
> contextual constraint. OWL literature pretty much indoctrinates for
> constraining general predicates for use in particular classes, e.g.
> the pizza tutorial's :hasTopping. I've used many nested property
> restrictions in the projects that I've worked on.
>
> >
> > The benefit of
> > the SPIN solution above is consistency, and users just need to
> > understand the simple principle of "object-oriented attachment" vs
> > context-sensitive execution and regular expressions.
> >
> > Furthermore, if the above is a recurring pattern, then it could be
> > generalized into a SPIN template, producing a definition such as
> >
> > :Issue
> >     spin:constraint [
> >         a ex:RequiredPropertyInContext ;
> >         arg:contextProperty :assignedTo ;
> >         arg:requiredProperty foaf:mbox
> >     ]
>
> Note that Resource Shapes does that but keeps the nested constraints
> in a separate shape:
>
> [[
> <http://ex.example/x#NewIssueShape> a rs:ResourceShape ;
>     ...
>     rs:property [
>         rs:name "submittedBy" ;
>         rs:propertyDefinition :submittedBy ;
>         rs:valueShape x:SubmitterShape ;
>         rs:occurs rs:Exactly-one ;
>     ] ;
>     ...
>
> <http://ex.example/x#SubmitterShape> a rs:ResourceShape ;
>     rs:property [
>         rs:name "givenName" ;
>         rs:propertyDefinition foaf:givenName ;
>         rs:valueType shex:Literal ;
>         rs:occurs rs:Exactly-one ;
>     ] ;
>     rs:property [
>         rs:name "familyName" ;
>         rs:propertyDefinition foaf:familyName ;
>         rs:valueType shex:Literal ;
>         rs:occurs rs:Zero-or-one ;
>     ] ;
> .
> ]]
>

At the moment RDFUnit supports only Shapes that define oslc:describes (typed Shapes), but in that case our approach can follow both oslc:range and oslc:valueType across different Shape definitions.
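
A minimal sketch of this kind of decomposition, written in Python purely for illustration and not taken from RDFUnit. The PropertyConstraint class, the function names, the query templates and the class names x:NewIssue and x:Submitter are all assumptions. The point it shows is that each constraint becomes its own small SELECT query returning the violating resources, instead of one term in a single large ASK query.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PropertyConstraint:
    """Hypothetical, simplified stand-in for one rs:property block."""
    predicate: str                     # prefixed name, e.g. "foaf:givenName"
    occurs: str                        # "exactly-one" or "zero-or-one"
    literal: bool = False              # True if values must be literals
    range_class: Optional[str] = None  # values must be instances of this class


def cardinality_query(cls: str, c: PropertyConstraint) -> str:
    # OPTIONAL keeps instances with no value at all, so COUNT(?o) can be 0.
    bad = "COUNT(?o) != 1" if c.occurs == "exactly-one" else "COUNT(?o) > 1"
    return (f"SELECT ?this WHERE {{ ?this a {cls} . "
            f"OPTIONAL {{ ?this {c.predicate} ?o }} }} "
            f"GROUP BY ?this HAVING ({bad})")


def literal_query(cls: str, c: PropertyConstraint) -> str:
    # Returns instances that have at least one non-literal value.
    return (f"SELECT ?this WHERE {{ ?this a {cls} ; {c.predicate} ?o . "
            f"FILTER (!isLiteral(?o)) }}")


def range_query(cls: str, c: PropertyConstraint) -> str:
    # Returns instances with a value that is not typed as the expected class.
    return (f"SELECT ?this WHERE {{ ?this a {cls} ; {c.predicate} ?o . "
            f"FILTER NOT EXISTS {{ ?o a {c.range_class} }} }}")


def decompose(cls: str, constraints: List[PropertyConstraint]) -> List[Tuple[str, str]]:
    """Turn the constraints of one typed shape into (description, query) pairs."""
    tests = []
    for c in constraints:
        tests.append((f"{c.predicate} cardinality ({c.occurs}) in {cls}",
                      cardinality_query(cls, c)))
        if c.literal:
            tests.append((f"{c.predicate} must be a literal",
                          literal_query(cls, c)))
        if c.range_class:
            tests.append((f"{c.predicate} range must be {c.range_class}",
                          range_query(cls, c)))
    return tests


# The two shapes quoted above, reduced to the hypothetical model.
new_issue = [PropertyConstraint(":submittedBy", "exactly-one",
                                range_class="x:Submitter")]
submitter = [
    PropertyConstraint("foaf:givenName", "exactly-one", literal=True),
    PropertyConstraint("foaf:familyName", "zero-or-one", literal=True),
]

all_tests = decompose("x:NewIssue", new_issue) + decompose("x:Submitter", submitter)
for description, query in all_tests:
    print(description)
    print("  " + query)
```

Each query is short and anchored to a class, so a failing test points at one property of one shape rather than at the whole shape graph.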
Assuming

<http://ex.example/x#NewIssueShape> oslc:describes x:NewIssue
<http://ex.example/x#SubmitterShape> oslc:describes x:Submitter

we will generate the following tests:

- :submittedBy must occur once in x:NewIssue
- :submittedBy range must be x:Submitter
- :givenName must occur once in x:Submitter
- :givenName must be a literal
- :familyName must occur at most once in x:Submitter
- :familyName must be a literal

This has the advantage of providing more granular results to the end user. At the moment, this approach does not work with ShEx's OR ( '|' ) semantics, but it could if we nest the tests into a logical graph.

We all agree that Shapes must translate to SPARQL, but that alone is not enough. Even a few Shapes could easily produce several hundred lines of SPARQL. Such a query can easily run on a small in-memory graph but would probably never return on an endpoint with a few million triples.

Anchoring Shapes to classes is a means to easily decompose the constraints; I am not convinced that untyped Shapes can achieve that or can validate a SPARQL endpoint directly. This is not to say that I am against untyped Shapes, but in the end we should make clear the limitations of each choice.

Best,
Dimitris

>
> > Holger
> >
> >
> > On 11/6/2014 4:25, Eric Prud'hommeaux wrote:
> > >* Holger Knublauch <holger@topquadrant.com> [2014-11-05 15:35+1000]
> > >>On 11/5/2014 15:26, Irene Polikoff wrote:
> > >>>>From: Holger Knublauch [mailto:holger@topquadrant.com]
> > >>>>Sent: Wednesday, November 05, 2014 12:16 AM
> > >>>>To: public-data-shapes-wg@w3.org
> > >>>>Subject: Can Shapes always be Classes?
> > >>>>
> > >>>>I believe there is a fundamental difference in how the various proposals
> > >>>>treat the relationship between resources and their shapes:
> > >>>>
> > >>>>- In OWL and SPIN, constraints are attached to classes. rdf:type triples are
> > >>>>used to determine which constraints need to be evaluated for a given
> > >>>>instance.
> > >>>>
> > >>>>- In the original Resource Shapes and ShEx, Shapes are stand-alone entities
> > >>>>that may or may not be associated with a class. Other mechanisms than
> > >>>>rdf:type are used to point from instances to their shapes.
> > >>>>
> > >>>>I believe the main motivation for the latter design are the User Stories
> > >>>>S7 and S8: different shapes at different times, and properties can change as
> > >>>>they pass through the workflow. I would like to learn more about this and
> > >>>>have specific examples that we can evaluate.
> > >>>>
> > >>>>My current assumption is that these scenarios can be expressed via named
> > >>>>graphs, so that different class definitions are used in different contexts.
> > >>>>Which graph to use would be specified in some kind of header metadata or via
> > >>>>a special property (e.g. owl:imports). Alternatively, different classes
> > >>>>could be used, just like different shapes are used depending on the context.
> > >>>>I argue that using rdf:type and RDFS classes is a well-established mechanism
> > >>>>that we should try to build upon. What problems do the proponents of the
> > >>>>decoupling see with those ideas?
> > >I think the fundamental issue is whether these are effectively
> > >context-sensitive grammars. As currently proposed, SPIN depends on
> > >type annotations attached to the data. It would be possible to add a
> > >step which creates a premise when validating some node. I believe this
> > >would get around all of the issues stemming from requiring fully
> > >discriminating types.
> > >
> > >Use Case: context-sensitive-rooted-issue-interface
> > >
> > >An LDP service accepts new Issues. A posted issue is expected to have
> > >a :name, :status and a :submittedBy. If the status is :assigned, it
> > >must have an :assignedTo . It may also have references to other Issues
> > >which may or may not be in the system so they are referenced by :name.
> > >
> > >Sample Data:
> > >  _:IssueA a :Issue ;
> > >    :name "funny smell and no lights" ;
> > >    :status :assigned ;
> > >    :submittedBy _:Bob ;
> > >    :assignedTo _:Bob ;
> > >    :related [ a :Issue ; :name "smoke coming from unit" ],
> > >             [ a :Issue ; :name "110V capacitor in French unit" ] .
> > >
> > >  _:Bob a foaf:Person ;
> > >    foaf:givenName "Bob" ;
> > >    foaf:familyName "Smith" ;
> > >    foaf:mbox <mailto:bob@example.com> .
> > >
> > >There are multiple nodes of type :Issue so the client can specify the
> > >start node as _:IssueA (e.g. in a header). This makes the posted data
> > >a "pointed graph".
> > >
> > >If the requirements for the :submittedBy and the :assignedTo are
> > >different, we have need context-sensitivity.
> > >
> > >  x:NewIssueShape {
> > >    :name LITERAL,
> > >    :submittedBy @x:SubmitterShape,
> > >    (:status (:unassigned :unknown)
> > >     | :status (:assigned),
> > >       :assignedTo @x:AssigneeShape),
> > >    :related { :name LITERAL }*
> > >  }
> > >
> > >  x:SubmitterShape {
> > >    foaf:givenName LITERAL,
> > >    foaf:familyName LITERAL?
> > >  }
> > >
> > >  x:AssigneeShape {
> > >    foaf:givenName LITERAL,
> > >    foaf:familyName LITERAL,
> > >    foaf:mbox IRI
> > >  }
> > >
> > >If we have some OWL like (eliding cardinalities):
> > >
> > >  Class: x:NewIssueShape
> > >    SubClassOf:
> > >      :name some rdfs:Literal,
> > >      :submittedBy some x:SubmitterShape,
> > >      (:status value :unassigned
> > >       or
> > >       (:status value :assigned and :assignedTo some x:AssigneeShape))
> > >      :related (:name rdfs:Literal)
> > >
> > >  Class: x:SubmitterShape
> > >    SubClassOf:
> > >      foaf:givenName some rdfs:Literal,
> > >      foaf:familyName some rdfs:Literal
> > >
> > >  Class: x:AssigneeShape
> > >    SubClassOf:
> > >      foaf:givenName some rdfs:Literal,
> > >      foaf:familyName some rdfs:Literal,
> > >      foaf:mboxName some rdf:Resource
> > >
> > >, we can validate the data with the premise _:IssueA a x:NewIssueShape.
> > >The validation will recursively test the nested constraints. This of
> > >course hinges on being able to verify a premise.
> > >
> > >It seems reasonable to extend SPIN to test premises. It could use the
> > >same idea where instead of an rdfs:range to specify an expected object
> > >type, one could use Resource Shapes' rs:valueShape. This would assert
> > >the premise that e.g. _:Bob a x:SubmitterShape and then another
> > >_:Bob a x:AssigneeShape .
> > >
> > >
> > >>>>I think this is a major design decision that we need to clarify early.
> > >>>>Instead of excluding those scenarios, I would like to accommodate them
> > >>>>without having to introduce completely new mechanisms.
> > >>>>
> > >>>Holger,
> > >>>
> > >>>I believe you are saying that there could be two (or more) named graphs each
> > >>>containing different sets of constraints for a particular classes (or
> > >>>classes). For example:
> > >>>
> > >>>Graph A: contains rdf:type statements for a set of classes and properties.
> > >>>Can also contain other RDFS or OWL axioms
> > >>>
> > >>>Graph B: contains some constraints for classes declared in Graph A
> > >>>
> > >>>Graph C: contains a different set of constraints for classes declared in
> > >>>Graph A
> > >>>
> > >>>And so on
> > >>>
> > >>>A given application can then chose what set of constraints it will be using
> > >>>- Graph B or Graph C.
> > >>>
> > >>>Is this correct?
> > >>Yes sorry I was brief. Let's take an extreme use case, where the
> > >>same ex:Instance must fulfill different constraints in scenario A
> > >>and B.
> > >>
> > >>    ex:Instance
> > >>        a ex:Person ;
> > >>        foaf:firstName "John" .
> > >>
> > >>Scenario A: Each Person can have any number of first names.
> > >>
> > >>Scenario B: Each Person must have exactly one first name.
> > >>
> > >>In scenario A, it would have
> > >>
> > >>    <instance graph> owl:imports <schema A>
> > >>
> > >>where <schema A> is simply the unconstrained class definition.
> > >>
> > >>    ex:Person a rdfs:Class .
> > >>
> > >>In scenario B, it would owl:import <schema B> which is
> > >>
> > >>    ex:Person a rdfs:Class ;
> > >>        constraint foaf:firstName exactly 1 .
> > >>
> > >>I hope this explains the named graph work-around.
> > >>
> > >>Holger
> >
> >
>
> [[
> PREFIX :<http://ex.example/#>
> PREFIX foaf:<http://foaf.example/#>
> PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX x:<http://ex.example/x#>
> PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
> ASK {
>   { SELECT ?http://ex.example/x#NewIssueShape {
>     ?http://ex.example/x#NewIssueShape :name ?o .
>   } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>   { SELECT ?http://ex.example/x#NewIssueShape {
>     ?http://ex.example/x#NewIssueShape :name ?o . FILTER (isLiteral(?o))
>   } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>   { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c0) {
>     ?http://ex.example/x#NewIssueShape :submittedBy ?o .
>   } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>   { SELECT ?http://ex.example/x#NewIssueShape {
>     ?http://ex.example/x#NewIssueShape :submittedBy ?o . FILTER ((isIRI(?o) || isBlank(?o)))
>   } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>   { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c1) {
>     { SELECT ?http://ex.example/x#NewIssueShape ?http://ex.example/x#SubmitterShape {
>       ?http://ex.example/x#NewIssueShape :submittedBy ?http://ex.example/x#SubmitterShape . FILTER (true && (isIRI(?http://ex.example/x#SubmitterShape) || isBlank(?http://ex.example/x#SubmitterShape)))
>     } }
>     { SELECT ?http://ex.example/x#SubmitterShape {
>       ?http://ex.example/x#SubmitterShape foaf:givenName ?o .
>     } GROUP BY ?http://ex.example/x#SubmitterShape HAVING (COUNT(*)=1)}
>     { SELECT ?http://ex.example/x#SubmitterShape {
>       ?http://ex.example/x#SubmitterShape foaf:givenName ?o . FILTER (isLiteral(?o))
>     } GROUP BY ?http://ex.example/x#SubmitterShape HAVING (COUNT(*)=1)}
>     { SELECT ?http://ex.example/x#SubmitterShape (COUNT(*) AS ?http://ex.example/x#SubmitterShape_c0) {
>       ?http://ex.example/x#SubmitterShape foaf:familyName ?o .
>     } GROUP BY ?http://ex.example/x#SubmitterShape HAVING (COUNT(*)<=1)}
>     { SELECT ?http://ex.example/x#SubmitterShape (COUNT(*) AS ?http://ex.example/x#SubmitterShape_c1) {
>       ?http://ex.example/x#SubmitterShape foaf:familyName ?o . FILTER (isLiteral(?o))
>     } GROUP BY ?http://ex.example/x#SubmitterShape HAVING (COUNT(*)<=1)}
>     FILTER (?http://ex.example/x#SubmitterShape_c0 = ?http://ex.example/x#SubmitterShape_c1)
>   } GROUP BY ?http://ex.example/x#NewIssueShape }
>   FILTER (?http://ex.example/x#NewIssueShape_c0 = ?http://ex.example/x#NewIssueShape_c1)
>   OPTIONAL { ?http://ex.example/x#NewIssueShape :submittedBy ?http://ex.example/x#NewIssueShape_http://ex.example/x#SubmitterShape_ref0 . FILTER (true && (isIRI(?http://ex.example/x#NewIssueShape_http://ex.example/x#SubmitterShape_ref0) || isBlank(?http://ex.example/x#NewIssueShape_http://ex.example/x#SubmitterShape_ref0))) }
>   { SELECT ?http://ex.example/x#NewIssueShape WHERE {
>     {
>       { SELECT ?http://ex.example/x#NewIssueShape {
>         ?http://ex.example/x#NewIssueShape :status ?o .
>       } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>       { SELECT ?http://ex.example/x#NewIssueShape {
>         ?http://ex.example/x#NewIssueShape :status ?o . FILTER ((?o = :unassigned || ?o = :unknown))
>       } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>     } UNION {
>       { SELECT ?http://ex.example/x#NewIssueShape {
>         ?http://ex.example/x#NewIssueShape :status ?o .
>       } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>       { SELECT ?http://ex.example/x#NewIssueShape {
>         ?http://ex.example/x#NewIssueShape :status ?o . FILTER ((?o = :assigned))
>       } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>       { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c2) {
>         ?http://ex.example/x#NewIssueShape :assignedTo ?o .
>       } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>       { SELECT ?http://ex.example/x#NewIssueShape {
>         ?http://ex.example/x#NewIssueShape :assignedTo ?o . FILTER ((isIRI(?o) || isBlank(?o)))
>       } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*)=1)}
>       { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c3) {
>         { SELECT ?http://ex.example/x#NewIssueShape ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#NewIssueShape :assignedTo ?http://ex.example/x#AssigneeShape . FILTER (true && (isIRI(?http://ex.example/x#AssigneeShape) || isBlank(?http://ex.example/x#AssigneeShape)))
>         } }
>         { SELECT ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#AssigneeShape foaf:givenName ?o .
>         } GROUP BY ?http://ex.example/x#AssigneeShape HAVING (COUNT(*)=1)}
>         { SELECT ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#AssigneeShape foaf:givenName ?o . FILTER (isLiteral(?o))
>         } GROUP BY ?http://ex.example/x#AssigneeShape HAVING (COUNT(*)=1)}
>         { SELECT ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#AssigneeShape foaf:familyName ?o .
>         } GROUP BY ?http://ex.example/x#AssigneeShape HAVING (COUNT(*)=1)}
>         { SELECT ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#AssigneeShape foaf:familyName ?o . FILTER (isLiteral(?o))
>         } GROUP BY ?http://ex.example/x#AssigneeShape HAVING (COUNT(*)=1)}
>         { SELECT ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#AssigneeShape foaf:mbox ?o .
>         } GROUP BY ?http://ex.example/x#AssigneeShape HAVING (COUNT(*)=1)}
>         { SELECT ?http://ex.example/x#AssigneeShape {
>           ?http://ex.example/x#AssigneeShape foaf:mbox ?o . FILTER (isIRI(?o))
>         } GROUP BY ?http://ex.example/x#AssigneeShape HAVING (COUNT(*)=1)}
>       } GROUP BY ?http://ex.example/x#NewIssueShape }
>       FILTER (?http://ex.example/x#NewIssueShape_c2 = ?http://ex.example/x#NewIssueShape_c3)
>       OPTIONAL { ?http://ex.example/x#NewIssueShape :assignedTo ?http://ex.example/x#NewIssueShape_http://ex.example/x#AssigneeShape_ref0 . FILTER (true && (isIRI(?http://ex.example/x#NewIssueShape_http://ex.example/x#AssigneeShape_ref0) || isBlank(?http://ex.example/x#NewIssueShape_http://ex.example/x#AssigneeShape_ref0))) }
>     }
>   } GROUP BY ?http://ex.example/x#NewIssueShape HAVING (COUNT(*) = 1)}
>   { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c4) {
>     ?http://ex.example/x#NewIssueShape :related ?o .
>   } GROUP BY ?http://ex.example/x#NewIssueShape}
>   { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c5) {
>     ?http://ex.example/x#NewIssueShape :related ?o . FILTER ((isIRI(?o) || isBlank(?o)))
>   } GROUP BY ?http://ex.example/x#NewIssueShape}
>   FILTER (?http://ex.example/x#NewIssueShape_c4 = ?http://ex.example/x#NewIssueShape_c5)
>   { SELECT ?http://ex.example/x#NewIssueShape (COUNT(*) AS ?http://ex.example/x#NewIssueShape_c6) {
>     { SELECT ?http://ex.example/x#NewIssueShape ?4 {
>       ?http://ex.example/x#NewIssueShape :related ?4 . FILTER (true && (isIRI(?4) || isBlank(?4)))
>     } }
>     { SELECT ?4 {
>       ?4 :name ?o .
>     } GROUP BY ?4 HAVING (COUNT(*)=1)}
>     { SELECT ?4 {
>       ?4 :name ?o . FILTER (isLiteral(?o))
>     } GROUP BY ?4 HAVING (COUNT(*)=1)}
>   } GROUP BY ?http://ex.example/x#NewIssueShape }
>   FILTER (?http://ex.example/x#NewIssueShape_c4 = ?http://ex.example/x#NewIssueShape_c6)
>   OPTIONAL { ?http://ex.example/x#NewIssueShape :related ?http://ex.example/x#NewIssueShape_4_ref0 . FILTER (true && (isIRI(?http://ex.example/x#NewIssueShape_4_ref0) || isBlank(?http://ex.example/x#NewIssueShape_4_ref0))) }
> }
> ]]
>
> --
> -ericP
>
> office: +1.617.599.3509
> mobile: +33.6.80.80.35.59
>
> (eric@w3.org)
> Feel free to forward this message to any list for any purpose other than
> email address distribution.
>
> There are subtle nuances encoded in font variation and clever layout
> which can only be seen by printing this message on high-clay paper.
>
>

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
Received on Wednesday, 19 November 2014 09:06:18 UTC