- From: Holger Knublauch <holger@topquadrant.com>
- Date: Wed, 15 Mar 2017 13:25:29 +1000
- To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>

On 14/03/2017 22:49, Peter F. Patel-Schneider wrote:
> I was too quick in agreeing that term equality is adequate for testing the
> boolean values that show up in validation reports. If a data graph does not
> validate against a shapes graph the value for sh:conforms can be any boolean
> value except "true"^^xsd:boolean. So it can be "false"^^xsd:boolean or
> "0"^^xsd:boolean. This causes problems for graph isomorphism.
>
> The value can also be "1"^^xsd:boolean if the data graph does not conform to
> the shapes graph, which seems odd. The value can also be "a"^^xsd:boolean,
> which also seems odd. This appears to be a problem not with testing but with
> the definition of validation reports.

I have tightened the definition of sh:conforms to be always either true or
false. This also resolves this part of the graph comparison problem.

>
> Similar problems occur in other places. For example, a value of
> "1"^^xsd:boolean for sh:uniqueLang or sh:qualifiedValueShapesDisjoint or
> sh:closed does not enable the feature.

It is IMHO unfortunate that RDF even allows 0 or 1 for booleans. Luckily this
fact is barely known and hardly ever used in practice (although I confess I
did bump into it recently with a JavaScript library, probably the only time
ever in the last 10 years). Since I don't want to unnecessarily complicate the
language and add to implementation costs, I believe the definitions of
sh:uniqueLang and sh:qualifiedValueShapesDisjoint are OK as they are right
now. IMHO we shouldn't encourage the use of "1"^^xsd:boolean further. If
anyone has strong feelings otherwise, please file a ticket to bring it in
front of the WG.

Holger

>
> The SHACL document still needs a close critical examination to detect these
> kinds of problems.
>
> peter
>
>
> On 03/14/2017 04:58 AM, Peter F. Patel-Schneider wrote:
>> On 03/13/2017 10:39 PM, Holger Knublauch wrote:
>>>
>>> On 13/03/2017 21:22, Peter F. Patel-Schneider wrote:
>>>> On 03/12/2017 04:48 PM, Holger Knublauch wrote:
>>>>> The test suite document is work in progress and I have basically just started
>>>>> to take a deeper look. I welcome any help on this and really don't want to
>>>>> "own" this document.
>>>>>
>>>>> On 12/03/2017 6:02, Peter F. Patel-Schneider wrote:
>>>>>> It's going to be hard. It's not possible to just remove the parts of the
>>>>>> validation report that can vary because some of these parts have
>>>>>> conditions on them. For example, removing type and subclass triples will
>>>>>> prevent checking the SHACL instance requirements.
>>>>> Ok, the fact that reports allow for instances of subclasses of
>>>>> sh:ValidationReport and sh:ValidationResult indeed requires an extra
>>>>> pre-processing step. I have now added this step, normalizing these to their
>>>>> direct rdf:type.
>>>>>
>>>>>> There is also the problem that there are
>>>>>> different RDF literals with the same value.
>>>>> Why is this a problem? I believe RDF isomorphism relies on term equality:
>>>>>
>>>>> https://www.w3.org/TR/rdf11-concepts/#graph-isomorphism
>>>>> https://www.w3.org/TR/rdf11-concepts/#dfn-literal-term-equality
>>>> Just following the second link here shows that RDF term equality looks at
>>>> the syntactic form of RDF literals, not their value. There is even a very
>>>> illustrative example provided.
>>>>
>>>> https://www.w3.org/TR/rdf11-concepts/#dfn-literal-term-equality
>>>> *******************
>>>> Literal term equality: Two literals are term-equal (the same RDF literal) if
>>>> and only if the two lexical forms, the two datatype IRIs, and the two
>>>> language tags (if any) compare equal, character by character. Thus, two
>>>> literals can have the same value without being the same RDF term. For
>>>> example:
>>>>   "1"^^xs:integer
>>>>   "01"^^xs:integer
>>>> denote the same value, but are not the same literal RDF terms and are not
>>>> term-equal because their lexical form differs.
>>>> *******************
>>> That's understood, but I believe term equality is what we want, not value
>>> equality. AFAICS all of the properties in the results vocabulary (e.g.
>>> sh:focusNode, sh:resultPath, sh:sourceShape) can only have precisely matching
>>> values. The only times where they can be literals such as "1" vs "01" is if
>>> they point at values from the data graph via sh:value, and in those cases we
>>> are doing term equality too. So I don't see the problem that you seem to see
>>> right now.
>> Yes, you are right. The SHACL document defined true in red as a particular
>> RDF term and uses true in red throughout where it talks about validation
>> results. My fault for not looking closely enough and assuming that true in
>> SHACL validation reports could be any RDF term whose RDF value is true.
>>
>>>>>> Probably the biggest problem is
>>>>>> that the number of values for sh:result can vary between SHACL Core
>>>>>> implementations for the same validation.
>>>>> This is not the intention of the spec. The spec states that each validator
>>>>> must have a mode in which it always produces all results.
>>>>>
>>>>> SHACL-compliant processors /must/ be capable of returning a validation report
>>>>> with all required validation results
>>>>> <http://w3c.github.io/data-shapes/shacl/#dfn-validation-results> described in
>>>>> this specification.
>>>> Consider validating the data graph
>>>>   ex:i ex:p ex:j ; ex:q ex:j .
>>>>   ex:j ex:p ex:j .
>>>> against the shapes graph
>>>>   ex:s1 rdf:type sh:PropertyShape ;
>>>>     sh:targetNode ex:i ;
>>>>     sh:property [ sh:path ex:p ; sh:property ex:s2 ] ;
>>>>     sh:property [ sh:path ex:q ; sh:property ex:s2 ] .
>>>>   ex:s2 sh:path ex:p ; sh:class ex:C .
>>>> It is reasonable and acceptable to have one top-level validation result
>>>> here. It seems to me that there is an argument that it is also reasonable
>>>> and acceptable to have two top-level validation results here.
>>> The intention, and what I believe the current spec states, is that two results
>>> must be produced in this case - the intro to section 4 states that it always
>>> has to produce new result nodes and these cannot be shared. Also the
>>> validation is defined per-focus-node and not for a group of focus nodes (which
>>> may indeed cause duplicate value nodes to be swallowed up). So if ex:s2 for
>>> ex:j is reached by two property shapes, it will produce one result for each
>>> original focus node.
>> Not so. Even with the wording about producing new results a SPARQL
>> implementation is free to optimize its performance. For example, the
>> implementation may decide to cache results of validation. This can result
>> in fewer validations being performed. The results of these validations can
>> then be used multiple times and then show up in the validation report.
>>
>>> Holger
>> peter
>>

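
To make the distinction between term equality and value equality in this thread concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Literal` tuple and the helpers `term_equal`, `value_of` and `value_equal` are hypothetical names invented for this example, not part of SHACL, RDF 1.1 Concepts, or any RDF library, and only xsd:boolean and xsd:integer are handled.

```python
import re
from typing import NamedTuple, Optional

XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal (not a library type)."""
    lexical: str
    datatype: str
    language: Optional[str] = None


def term_equal(a: Literal, b: Literal) -> bool:
    # Term equality as quoted above: lexical form, datatype IRI and
    # language tag must all compare equal, character by character.
    return a == b


def value_of(lit: Literal):
    """Map a literal to a Python value, or None if it is ill-typed.

    Assumption: only xsd:boolean and xsd:integer are covered here.
    """
    if lit.datatype == XSD_BOOLEAN:
        if lit.lexical in ("true", "1"):
            return True
        if lit.lexical in ("false", "0"):
            return False
        return None  # e.g. "a"^^xsd:boolean has no value
    if lit.datatype == XSD_INTEGER:
        if re.fullmatch(r"[+-]?[0-9]+", lit.lexical):
            return int(lit.lexical)
        return None
    return None


def value_equal(a: Literal, b: Literal) -> bool:
    # Compare within one datatype only, so that Python's True == 1
    # does not make a boolean equal to an integer.
    if a.datatype != b.datatype:
        return False
    va, vb = value_of(a), value_of(b)
    return va is not None and va == vb


true_lit = Literal("true", XSD_BOOLEAN)
one_lit = Literal("1", XSD_BOOLEAN)

# Same value, different RDF terms: a graph comparison based on
# term equality treats these two sh:conforms values as different.
print(term_equal(true_lit, one_lit), value_equal(true_lit, one_lit))
```

With these definitions, "true"^^xsd:boolean and "1"^^xsd:boolean are value-equal but not term-equal, which is why fixing sh:conforms to exactly true or false lets a report comparison based on term equality work.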
Received on Wednesday, 15 March 2017 03:26:04 UTC