- From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
- Date: Mon, 7 Feb 2022 14:07:27 +0100
- To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Cc: "public-rdf-star@w3.org" <public-rdf-star@w3.org>
- Message-ID: <6a3e0163-00d3-25ce-c6be-39425d513d17@ercim.eu>
Hi Antoine,

Yes, this blog post is meant to provide guidance on how to use (or not
use) RDF-star.

No, it is not meant to exhaust the topic of modelling provenance in
RDF-star.

More about this below, but granted, the title of the blog post was
misleading as to its intent. I changed it to "RDF-star patterns for
provenance", hopefully making it clearer.

The main point was to illustrate that annotating edges directly,
although attractive, does not always lead to a correct/satisfactory
modeling. Provenance is mostly used here as a setting for illustrating
this point. Actually, the initial plan was to illustrate these patterns
with *different* use-cases, but the post ended up big enough once the
provenance examples were described. So we decided to scale it down to
this particular use-case, and describe others in future posts.

Also for the sake of brevity and focus, we decided to leave aside other
(nonetheless interesting) questions, such as

- whether the properties in the examples are "transparency-enabling" or
  not (the distinction between "Lex said 'Superman can fly'" vs. "Lex
  said that Superman can fly" -- see
  https://www.w3.org/2021/12/rdf-star.html#selective-ref-transparency)
- what existing vocabulary can be used instead of the toy vocabulary
  used in the examples (and yes, your proposal based on PROV is
  definitely a path worth exploring)

This also will hopefully be described in future posts.

  best

On 04/02/2022 11:22, Antoine Zimmermann wrote:
> Pierre-Antoine,
>
>
> On 27/01/2022 at 17:43, Pierre-Antoine Champin wrote:
>> Hi Antoine,
>>
>> jump to the very end of your message for my reply.
>
> Go down to have my answer to your reply.
>
>>
>> On 27/01/2022 10:30, Antoine Zimmermann wrote:
>>> Pierre-Antoine,
>>>
>>>
>>> I think the description of the intended meaning of the RDF-star
>>> graphs given in this post is not aligned with the formal meaning
>>> given in the spec.
>>> Or, at least, that the presentation is misleading the reader into
>>> misusing quoted triples for provenance (or for anything, for that
>>> matter).
>>>
>>> Bear with me for a moment, as I have to place my arguments one at a
>>> time before concluding.
>>>
>>> You give this example:
>>>
>>> """
>>> PREFIX : <http://www.example.org/>
>>>
>>> :employee38 :familyName "Smith" .
>>> << :employee38 :jobTitle "Assistant Designer" >> :accordingTo
>>>     :employee22 .
>>> """
>>>
>>> and say: "The intended meaning of this small RDF-star graph is:
>>> “employee #38 is named Smith, and employee #22 claims that employee
>>> #38 is an assistant designer”."
>>>
>>> The problem here is that a reader may conclude that, if they want to
>>> say “employee #38 is named Smith, and employee #22 claims that
>>> employee #38 is an assistant designer”, among other things, they can
>>> just take your example and integrate it in their data set. This may
>>> not be sensible, depending on what they want to say about the claim,
>>> and most importantly, what they *don't* want to say about it.
>>>
>>> The issue is that, by saying "The intended meaning of this RDF-star
>>> graph is [explanation]", you actually want to say "As part of the
>>> intended meaning of this RDF-star graph, we have that
>>> [explanation]". But this is not the full meaning of the RDF-star
>>> graph. Indeed, due to the RDF-star semantics, there is additional
>>> meaning imposed by the spec itself.
>>>
>>> The spec says that this RDF-star graph also carries the meaning that
>>> the claim is related to the URIs ":employee38" and ":jobTitle" in a
>>> specific way, and related to the string literal
>>> """ "Assistant Designer"^^xsd:string """. If one merely wants to say
>>> that "employee 22 claims that employee 38 is an assistant designer",
>>> one perhaps *does not* want to relate this claim to the URI
>>> ":jobTitle".
>>>
>>> When you define the intended meaning, you can say whatever you like
>>> about what the URIs denote, as long as they are not among the
>>> standard URIs of the spec. So you can say, for instance, that
>>> ":accordingTo" denotes the relation that exists between a claim and
>>> the people who make the claim. But you cannot define the intended
>>> meaning of a structure of the language, like quoted triples, which
>>> is defined by the spec.
>>>
>>> As an analogous example, consider standard RDF and the following
>>> RDF graph:
>>>
>>> """
>>> :claim1 :accordingTo "Pierre-Antoine".
>>> """
>>>
>>> You can say that ":accordingTo" is intended to mean the relation
>>> between a claim and a person, but you cannot say that the intended
>>> meaning of this triple is that ":claim1" is claimed by a person
>>> named "Pierre-Antoine". Given the intention that ":accordingTo"
>>> relates a claim to a person, this graph is implying that the
>>> character string "Pierre-Antoine" is a person, which is absurd.[*]
>>>
>>> With such examples and explanations in your post, you are suggesting
>>> to the audience that they can use your RDF-star examples as
>>> templates for the intended meanings you present. So you are telling
>>> the audience that they can use RDF-star graphs in ways that clash
>>> with the formal semantics. In other words, you are openly showing
>>> that the RDF-star semantics can be safely ignored.
>>>
>>> As a consequence, I do not see how there could be, and why there
>>> should be, any support for the current formal semantics of the spec.
>>> Either throw it in the bin (allowing anyone to form their own
>>> interpretations of what quoted triples entail) or revise it such
>>> that it matches the intended meanings suggested by its authors.
>>>
>>>
>>>
>>> [*] of course, one could interpret ":accordingTo" as: "the relation
>>> between a claim and the first name of a person that makes the claim".
>>
>> Yes, that's exactly what I was about to argue.
>> I would even go further, and argue that many (all?) properties can
>> be seen, from some perspective, as the kind of "shortcut" that you
>> describe above. Consider foaf:givenName:
>>
>> :az foaf:givenName "Antoine".
>>
>> While it is convenient to conflate your given name with the sequence
>> of characters used to write it, this design prevents me from
>> expressing some things, like for example the fact that the given
>> name `Antoine` is derived from the Latin name `Antonius`.
>
> The difference here is that most people, I believe, would accept that
> a name can be a character string (and vice versa). If I consider the
> character string 's', 't', 'a', 'r', I'm happy to say that it is a
> word in English. Likewise, I'm happy to say that 'A', 'n', 't', 'o',
> 'i', 'n', 'e', is a name of Latin origin. We identify names and
> character strings all the time, and it is fine. If you are working in
> the field of lexicography and philology, you may want to identify
> words, word representations, word senses, etc. with individual URIs,
> but I'd say it is beside the point.
>
>>
>> The same goes for properties that apply to quoted triples, in my
>> opinion.
>>
>>> Similarly, one could interpret ":accordingTo" as "the relation
>>> between a claim that's attached to certain terms in subject,
>>> predicate, and object positions, and a person who makes a claim with
>>> these terms".
>>> But presenting the blog post in this way would ruin the
>>> attractiveness of RDF-star very much.
>>
>> Could you develop why?
>
> There are two things I'd like to develop: the first one is the way the
> meaning of the quoted triples is presented in the blog post; the
> second is the way provenance is supposedly modelled in this blog post.
>
> Concerning the meaning of RDF-star triples like:
>
> << :emp38 :jobTitle "Assistant Designer" >> :accordingTo :emp22 .
>
> we can make an analogy. Suppose we have the following sentence:
>
> """
> << Clark Kent is the same person as Superman >> said Lex.
> """
>
> Describing the meaning of this sentence would go like this:
>
> "This sentence means that Lex used the words in between the quotes, in
> this order."
>
> It would be misleading to describe it like:
>
> "This sentence means that Lex claims that Clark Kent and Superman are
> just one person."
>
> Of course, the sentence *implies* such a claim, but it is not the full
> meaning of it. Someone who's not familiar with quotes may understand
> that this is equivalent to:
>
> """
> << Superman is the same person as Clark Kent >> said Lex.
> """
>
> because this equally implies that Lex claims that Clark Kent and
> Superman are just one person.
>
> The distinction may be subtle, but in the case of this blog post, you
> are not merely explaining what some data out there is about. You are
> telling people how to use RDF-star for provenance, with your RDF-star
> spec editor hat on. I regard your post as advocating good (best?)
> practices.
>
> RDF-star quoted triples are a lot like quotes in sentences, they refer
> to specific RDF terms, not to mere "claims".
>
>
> The second point is about provenance. Provenance is an important topic
> in computer science, data management, and even before the existence of
> computers, provenance was a thing for historical documents and pieces
> of art. It's important enough to have its own field of study, its
> models and theories, its tools, its practices.
> There is a standard for provenance specifically made to be used in RDF
> data. You could easily reuse the PROV model and the PROV-O ontology,
> which would make your examples not only more recommendable, but in
> fact literally *recommended* by the W3C.
>
> The way I would write the examples you describe is the following:
> Employee 22 claims that Employee 38 is an assistant designer. This is
> the fact we want to model. So let us have the URIs :emp22, :emp38, and
> :claim1 denote, respectively, Employee 22, Employee 38 and the claim
> made by Employee 22.
> This claim can be encoded as an RDF triple in this way:
>
> :emp38 :jobTitle "Assistant Designer" .
>
> where :jobTitle denotes the relation between a person and a
> human-readable name of the job, to be encoded as a character string.
> Then I can say, using the provenance model:
>
> << :emp38 :jobTitle "Assistant Designer" >> prov:wasDerivedFrom :claim1 .
> :claim1 prov:wasAttributedTo :emp22 .
>
> This not only fits the definitions of the terms prov:wasDerivedFrom,
> prov:wasAttributedTo, and :jobTitle, it also strictly fits with the
> formal semantics in the RDF-star spec, and finally uses a
> well-established model of provenance.
>
> Now, again, you could have a property that is equivalent to the
> composition of prov:wasDerivedFrom and prov:wasAttributedTo, and call
> it ":accordingTo". But you would have to describe it as such. In your
> post, nothing says that :accordingTo is intended to mean a triple
> derived from a claim attributed to a person. Also, the name
> :accordingTo is misleading, as it does not suggest that it is about a
> triple of RDF terms.
>
> If, instead, you had described :accordingTo in a precise way that
> agrees with the semantics, it would have led to a more complex and
> confusing explanation. If, as I believe, one of the aims of the blog
> post is to point the audience of RDF-star to the simplicity of the
> model, having a complex explanation is detrimental to the
> attractiveness of RDF-star.
>
> In fact, it is not even very clear to me what ought to be the intended
> meaning of ":accordingTo" if it was to be compliant with the formal
> semantics. Is it, as I suggest with the example above, that :emp22
> made a claim, and the claim is encoded as the triple
> << :emp38 :jobTitle "Assistant Designer" >>? Or is it that :emp22
> used/generated the triple somehow, regardless of whether they actually
> believe or claim the underlying statement?
>
>
> If I had to express provenance of RDF data using RDF-star, I would use
> the PROV model as I did above. However, if I merely wanted to say
> "Employee 22 claims that Employee 38's job title is 'Assistant
> Designer'", I would rather use something like:
>
> @prefix s: <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> .
> @prefix p: <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> .
> @prefix o: <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> .
> @prefix : <http://example.com/> .
> [s: :emp38; p: :jobTitle; o: "Assistant Designer"] :accordingTo :emp22 .
>
> which makes the claim unrelated to the URIs used, or any syntax for
> that matter.
>
>
> --AZ
>
>
> PS: "quoted triples" in RDF-star are very much like triples that are
> quoted, but not exactly. You cannot properly quote triples with blank
> nodes. This has strange consequences.
>
>>
>>>
>>> Best,
>>> --AZ
>>>
>>>
>>> On 26/01/2022 at 21:34, Pierre-Antoine Champin wrote:
>>>> Dear all,
>>>>
>>>> following a discussion during our two last calls, I published a
>>>> post about "Provenance in RDF-star":
>>>>
>>>> https://www.w3.org/community/rdf-dev/2022/01/26/provenance-in-rdf-star/
>>>>
>>>> quoting the intro:
>>>>
>>>> > In this post, we present some lessons learned by the group
>>>> > through discussions and exchanges. This is meant to give some
>>>> > insight about the rationale behind RDF-star, and some guidelines
>>>> > about how to best use it for modeling provenance data.
>>>>
>>>> Many thanks to all the participants of the RDF-star group for their
>>>> reviews and feedback on this post.
>>>>
>>>> pa
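
For readers who prefer triples to sentences, here is a minimal
Turtle-star sketch of the "<< ... >> said Lex" analogy from the quoted
discussion. It is only an illustration under stated assumptions: the
names :ClarkKent, :Superman, :Lex and :sameAs are hypothetical and do
not come from the blog post; only :accordingTo is borrowed from its toy
vocabulary. As far as the thread describes the semantics, the two
quoted triples below are distinct terms, so the first statement alone
does not give you the second.

"""
PREFIX : <http://www.example.org/>

# Lex is recorded as the source of this exact quoted triple.
<< :ClarkKent :sameAs :Superman >> :accordingTo :Lex .

# Same terms with subject and object swapped: a different quoted
# triple, hence a different term. It is not entailed by the statement
# above, even if :sameAs is intended to be symmetric, and would have
# to be asserted separately:
# << :Superman :sameAs :ClarkKent >> :accordingTo :Lex .
"""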
Received on Monday, 7 February 2022 13:07:31 UTC