- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Tue, 14 Mar 2017 05:49:14 -0700
- To: Holger Knublauch <holger@topquadrant.com>, "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>

I was too quick in agreeing that term equality is adequate for testing the
boolean values that show up in validation reports.

If a data graph does not validate against a shapes graph, the value for
sh:conforms can be any boolean value except "true"^^xsd:boolean. So it can
be "false"^^xsd:boolean or "0"^^xsd:boolean. This causes problems for graph
isomorphism. The value can also be "1"^^xsd:boolean if the data graph does
not conform to the shapes graph, which seems odd. The value can also be
"a"^^xsd:boolean, which also seems odd.

This appears to be a problem not with testing but with the definition of
validation reports.

Similar problems occur in other places. For example, a value of
"1"^^xsd:boolean for sh:uniqueLang or sh:qualifiedValueShapesDisjoint or
sh:closed does not enable the feature.

The SHACL document still needs a close critical examination to detect these
kinds of problems.

peter


On 03/14/2017 04:58 AM, Peter F. Patel-Schneider wrote:
> On 03/13/2017 10:39 PM, Holger Knublauch wrote:
>>
>>
>> On 13/03/2017 21:22, Peter F. Patel-Schneider wrote:
>>> On 03/12/2017 04:48 PM, Holger Knublauch wrote:
>>>> The test suite document is work in progress and I have basically just
>>>> started to take a deeper look. I welcome any help on this and really
>>>> don't want to "own" this document.
>>>>
>>>> On 12/03/2017 6:02, Peter F. Patel-Schneider wrote:
>>>>> It's going to be hard. It's not possible to just remove the parts of the
>>>>> validation report that can vary because some of these parts have
>>>>> conditions on them. For example, removing type and subclass triples will
>>>>> prevent checking the SHACL instance requirements.
>>>> Ok, the fact that reports allow for instances of subclasses of
>>>> sh:ValidationReport and sh:ValidationResult indeed requires an extra
>>>> pre-processing step. I have now added this step, normalizing these to their
>>>> direct rdf:type.
>>>>
>>>>> There is also the problem that there are
>>>>> different RDF literals with the same value.
>>>> Why is this a problem? I believe RDF isomorphism relies on term equality:
>>>>
>>>> https://www.w3.org/TR/rdf11-concepts/#graph-isomorphism
>>>> https://www.w3.org/TR/rdf11-concepts/#dfn-literal-term-equality
>>> Just following the second link here shows that RDF term equality looks at
>>> the syntactic form of RDF literals, not their value. There is even a very
>>> illustrative example provided.
>>>
>>> https://www.w3.org/TR/rdf11-concepts/#dfn-literal-term-equality
>>> *******************
>>> Literal term equality: Two literals are term-equal (the same RDF literal) if
>>> and only if the two lexical forms, the two datatype IRIs, and the two
>>> language tags (if any) compare equal, character by character. Thus, two
>>> literals can have the same value without being the same RDF term. For
>>> example:
>>>     "1"^^xs:integer
>>>     "01"^^xs:integer
>>> denote the same value, but are not the same literal RDF terms and are not
>>> term-equal because their lexical form differs.
>>> *******************
>>
>> That's understood, but I believe term equality is what we want, not value
>> equality. AFAICS all of the properties in the results vocabulary (e.g.
>> sh:focusNode, sh:resultPath, sh:sourceShape) can only have precisely matching
>> values. The only times where they can be literals such as "1" vs "01" is if
>> they point at values from the data graph via sh:value, and in those cases we
>> are doing term equality too. So I don't see the problem that you seem to see
>> right now.
>
> Yes, you are right. The SHACL document defined true in red as a particular
> RDF term and uses true in red throughout where it talks about validation
> results. My fault for not looking closely enough and assuming that true in
> SHACL validation reports could be any RDF term whose RDF value is true.
>
>>>>> Probably the biggest problem is
>>>>> that the number of values for sh:result can vary between SHACL Core
>>>>> implementations for the same validation.
>>>> This is not the intention of the spec. The spec states that each validator
>>>> must have a mode in which it always produces all results.
>>>>
>>>> SHACL-compliant processors /must/ be capable of returning a validation
>>>> report with all required validation results
>>>> <http://w3c.github.io/data-shapes/shacl/#dfn-validation-results> described
>>>> in this specification.
>>> Consider validating the data graph
>>>   ex:i ex:p ex:j ; ex:q ex:j .
>>>   ex:j ex:p ex:j .
>>> against the shapes graph
>>>   ex:s1 rdf:type sh:PropertyShape ;
>>>     sh:targetNode ex:i ;
>>>     sh:property [ sh:path ex:p ; sh:property ex:s2 ] ;
>>>     sh:property [ sh:path ex:q ; sh:property ex:s2 ] .
>>>   ex:s2 sh:path ex:p ; sh:class ex:C .
>>> It is reasonable and acceptable to have one top-level validation result
>>> here. It seems to me that there is an argument that it is also reasonable
>>> and acceptable to have two top-level validation results here.
>>
>> The intention, and what I believe the current spec states, is that two
>> results must be produced in this case - the intro to section 4 states that
>> it always has to produce new result nodes and these cannot be shared. Also
>> the validation is defined per-focus-node and not for a group of focus nodes
>> (which may indeed cause duplicate value nodes to be swallowed up). So if
>> ex:s2 for ex:j is reached by two property shapes, it will produce one result
>> for each original focus node.
>
> Not so. Even with the wording about producing new results a SPARQL
> implementation is free to optimize its performance. For example, the
> implementation may decide to cache results of validation. This can result
> in fewer validations being performed. The results of these validations can
> then be used multiple times and then show up in the validation report.
>
>> Holger
>
> peter
>
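
To make the sh:conforms point at the top of this message concrete, here is a minimal Python sketch contrasting RDF literal term equality with the xsd:boolean value of a literal. It is only an illustration, not part of the SHACL document or of any test harness. The `Literal` tuple and the helpers `term_equal` and `boolean_value` are hypothetical names introduced here, and language tags are left out for brevity.

```python
from typing import NamedTuple, Optional

XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"


class Literal(NamedTuple):
    """Hypothetical minimal RDF literal: lexical form plus datatype IRI."""
    lexical: str
    datatype: str


# Lexical-to-value mapping of xsd:boolean: four legal lexical forms.
_BOOLEAN_VALUES = {"true": True, "1": True, "false": False, "0": False}


def term_equal(a: Literal, b: Literal) -> bool:
    """Term equality: lexical form and datatype IRI compared as strings."""
    return a.lexical == b.lexical and a.datatype == b.datatype


def boolean_value(lit: Literal) -> Optional[bool]:
    """Return the xsd:boolean value, or None for an ill-typed literal."""
    if lit.datatype != XSD_BOOLEAN:
        return None
    return _BOOLEAN_VALUES.get(lit.lexical)


# The one RDF term that the message above treats as "true".
TRUE = Literal("true", XSD_BOOLEAN)

if __name__ == "__main__":
    for lexical in ["true", "1", "false", "0", "a"]:
        lit = Literal(lexical, XSD_BOOLEAN)
        # Columns: lexical form, term-equal to TRUE?, xsd:boolean value
        print(lexical, term_equal(lit, TRUE), boolean_value(lit))
```

Under this sketch, "1"^^xsd:boolean has the value true but is not term-equal to "true"^^xsd:boolean, and "a"^^xsd:boolean has no value at all, yet a check of the form "anything other than the term true" treats both the same way as "false"^^xsd:boolean.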
Received on Tuesday, 14 March 2017 12:49:49 UTC