- From: Fabio Vitali <fabio.vitali@unibo.it>
- Date: Fri, 14 Jan 2022 07:39:14 +0000
- To: Pete Rivett <pete.rivett@agnos.ai>
- CC: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>, Anthony Moretti <anthony.moretti@gmail.com>, "public-rdf-star@w3.org" <public-rdf-star@w3.org>
Dear Pete,

> I really think it's premature for rdf-star to embody anything like this. I think we should start with a best practice note as suggested (even that will I think be hard enough to reach consensus on), then after sufficient demonstrated success with applying it for real, we could consider standardizing a specific set of predicates in a separate schema.

You are totally right, and I was not clear enough on this point. I do NOT want to suggest that the Note should standardize a specific temporal/geographical model and a specific set of predicates.

I simply want to suggest that the Note describe a PATTERN, alternative to n-ary relationships, for representing statements that are not absolutely true but are variously constrained, e.g. by temporal or geographical conditions.

This would work as follows:

Suppose someone comes and says: «I have a huge collection of plain triples, and I now recognize that some of them are temporally limited, and I would like to update my dataset and applications to introduce temporal awareness.»

We can answer: «You have two choices in the matter:

EITHER you replace each of your temporally limited triples with an n-ary relationship, and attach temporal annotations to the relationship. This implies creating new entities, one for each triple that is temporally limited, replacing your triple with a new set of at least three new triples with the new entity as subject, losing all your original predicates, and attaching the temporal annotation to this new entity. By the way, you'll find that some new entities are better designed as intervals, and others as pairs of boundary events. There are no guidelines on when to use one or the other, and you'll have to rely on your own judgment. In addition, you'll have to completely rewrite from the ground up all your queries and your presentation software to make use of a completely different representation of facts.

OR you place each of your temporally limited triples inside an rdf-star quoted triple and attach the temporal annotation to it. In addition, you'll have to update all your queries and presentation software to use the new sparql-star language and to recognize that some of the triples are plain triples as before, and some are quoted. No other intervention is required. »

--

The same could be said for jurisdictions and certainty:

Suppose someone comes and says: «I am merging my dataset about my golf club into the national register of golf clubs. Many triples end up being pretty much similar (e.g., :boardOfDirector :member :JohnSmith), and I cannot distinguish any more which of them belong to my club or to someone else's club.»

We can answer: «You have two choices in the matter:

EITHER you replace each of your own triples with an n-ary relationship, and attach an association annotation to the relationship. This implies creating new entities, one for each triple that is specific to your club, replacing your triple with a new set of at least three new triples with the new entity as subject, losing all your original predicates, and attaching the association annotation to this new entity. In addition, you'll have to completely rewrite from the ground up all your queries and your presentation software to make use of a completely different representation of facts.

OR you place each of your triples inside an rdf-star quoted triple and attach the association annotation to it. For instance:

    << :boardOfDirector :member :JohnSmith >> :for :PodunkGolfClub .

In addition, you'll have to update all your queries and presentation software to use the new sparql-star language and to recognize that some of the triples are plain triples as before, and some are quoted. No other intervention is required. »

I do not think we should mandate the actual predicates, just the pattern:

    << s p o >> :constrainedBy :constraint .
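
To make the second choice concrete, here is a minimal sketch in Python. It is only a toy in-memory model, not rdf-star or sparql-star tooling, and the names `Triple`, `Dataset`, `add`, `constrain` and `match` are hypothetical. It assumes that a quoted triple is no longer asserted once it has been wrapped, and it reuses predicates from the examples in this thread purely for illustration.

```python
from typing import Dict, Iterator, List, NamedTuple, Optional, Set, Tuple


class Triple(NamedTuple):
    s: str
    p: str
    o: str


Annotation = Tuple[str, str]  # (predicate, value) attached to a quoted triple


class Dataset:
    """Toy store: plain asserted triples plus annotations on quoted triples."""

    def __init__(self) -> None:
        self.asserted: Set[Triple] = set()
        self.annotations: Dict[Triple, List[Annotation]] = {}

    def add(self, triple: Triple) -> None:
        """Assert a plain triple, exactly as in the original dataset."""
        self.asserted.add(triple)

    def constrain(self, triple: Triple, predicate: str, value: str) -> None:
        """Quote the triple (assumption: it stops being asserted) and annotate it."""
        self.asserted.discard(triple)
        self.annotations.setdefault(triple, []).append((predicate, value))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Tuple[Triple, List[Annotation]]]:
        """Yield (triple, constraints); constraints is empty for plain triples."""
        for t in sorted(self.asserted | set(self.annotations)):
            if all(q is None or q == v for q, v in zip((s, p, o), t)):
                yield t, list(self.annotations.get(t, []))


ds = Dataset()
ds.add(Triple(":theWebConf", "dc:subject", ":webTechnologies"))
ds.add(Triple(":monaLisa", ":location", ":Florence"))

# Later we realise the second triple is temporally limited: wrap and annotate it.
mona = Triple(":monaLisa", ":location", ":Florence")
ds.constrain(mona, ":after", "1506")
ds.constrain(mona, ":before", "1517")

for triple, constraints in ds.match():
    print(triple, constraints or "unconstrained")
```

The original subject, predicate and object survive unchanged, so an existing query pattern still matches; the only new work for the application is to look at the constraints that come back with each match.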

I believe the rdf-star approach is simple and fully appropriate in many different contexts.

Fabio

> Which also invites the question "what would a schema for rdf-star annotation properties look like, and how could you specify the (required/permitted) use of specific annotation properties with specific regular properties?".
>
> BTW nary relationships need not need be as complex as your examples. Simpler alternatives:
> _:item1 a :temporaryLocation;
>     :affects :MonaLisa;
>     :location :Florence;
>     :hasPeriod [
>         :start "1506"^^xsd:Year;
>         :end "1517"^^xsd:Year;
>     ] .
>
> _item1 a :USPresidency [
>     :holder :RichardNixon;
>     :hasPeriod [
>         :start "1969-01-20"^^xsd:dateTime ;
>         :end "1974-08-09"^^xsd:dateTime.
>     ]
>
> Cheers,
> Pete
>
> On Thu, 13 Jan 2022 at 09:48, Fabio Vitali <fabio.vitali@unibo.it> wrote:
> Hi!
>
> > On 13 Jan 2022, at 17:04, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:
> >
> > Hi Anthony,
> >
> > you wrote
> >
> > > that the temporal validity of any statement is implicitly lower-bounded by the existence of the things that it talks about, so technically the birth-date example is only valid after the birth date of the person
> >
> > Ok, let's take another example:
> >
> > :theWebConf2022 a s:Event ;
> >     s:startDate "2022-04-25"^^xsd:date ;
> >     s:endDate "2022-04-29"^^xsd:date .
> >
> > would you consider that those triple will only be valid on 2022-04-25? Or would you argue that this event already exists, even though it has not occurred yet?
> >
> > Without starting to count angels on pinpoints (wondering if a yet-to-be-born person exists or not), let's be pragmatic: does it make your knowledge base inconsistent in any way to consider that such triples about future events are already valid? I don't think so.
>
> You are adding two more pairs of terms, "valid / non valid" and "exist / not exist", to an already complex issue. The pairs already in play are:
>
> 1) true / false (or not-true?)
> 2) asserted / not-asserted.
>
> True / not true attain to the relationship between statements and reality (or at least some notion of reality endorsed by logicians and a few mathematicians). Asserted / not asserted attain to whether we know that the current dataset contains the statement or not.
>
> [ Valid / not valid attain to correctness in expressing statements (e.g., according to an ontology), and exist /not exist attain to physical or philosophical understanding of reality which makes my mind quiver (Does Mickey Mouse exist?). ]
>
> I understand that here is a traditional, albeit vague, connection in this community between asserted and true, which I respect and uphold. But whatever is the contrary of true, I do not think there should be a similar connection between non-asserted and false (or not true).
>
> Non-asserted triples can be absolutely true (<< :theWebConf dc:subject :webTechnologies >>), absolutely false (<< :theWebConf :frontFor :mossadRecruitment >> ), and conditionally true (<< :theWebConf :rating :FiveStars >> ), depending on a lot of factors (time, location, provenance, confidence, etc.), and since rdf-star allows us to represent triples without asserting them, we can use it to express facts about non-asserted triples without worrying about their actual truth:
>
> << :theWebConf dc:subject :webTechnologies >> :accordingTo :wikipedia.
> << :theWebConf :frontFor :mossadRecruitment >> :accordingTo :someMadman.
> << :theWebConf :rating :FiveStars >> :accordingTo :FabioVitali.
>
> These triples are all asserted (and true (and valid!)) regardless of the truth value of their quoted triples. This is exactly what makes rdf-star very interesting to me.
>
> Now, using :theWebConf as in your example is somewhat misleading: you are using an Event, which is an abstract concept of something whose main characteristic is being temporally and geographically constrained, and then you ask if there are other temporal constraints associated to it. No, no, probably not.
> But you put yourself in an easy situation.
>
> Let's try with entities which are not events: say, a physical object, a role, a relationship:
>
> << :monaLisa :location :Florence >>
> << :USA :president :RichardNixon >>
> << :MickeyMouse inLoveWith :MinnieMouse >>
>
> All these triples are NOT absolutely true, and at the same time they are NOT absolutely false, either.
>
> Using rdf-star, we can create absolutely-true statements out of these non-absolutely-true triples:
>
> << :monaLisa :location :Florence >> :after "1506"^^xsd:Year; :before "1517"^^xsd:Year .
> << :USA :president :RichardNixon >> :start "1969"^^xsd:Year; :end "1974"^^xsd:Year .
> << :MickeyMouse inLoveWith :MinnieMouse >> :accordingTo :WaltDisney .
>
> These are trivial rdf-star representations of (simple) anthony-statements (syntax aside). I fail to see a downside to this.
>
> The opposite, to adopt "dead-simple statements" seems much worse to me: adopting n-ary relationships and events and states and opinions seems SO MUCH MORE COMPLICATED:
>
> _:item1 a :temporaryLocation;
>     :affects :monaLisa;
>     :location :Florence;
>     :start [
>         a :uncertainDate ;
>         :after "1506"^^xsd:Year;
>     ] ;
>     :end [
>         a :uncertainDate ;
>         :before "1517"^^xsd:Year;
>     ] .
>
> _:item2 a :temporaryState;
>     :role :presidency;
>     :organization :USA;
>     :holder :RichardNixon;
>     :startingEvent [
>         a :election;
>         :date "1969-01-20"^^xsd:dateTime.
>     ];
>     :endingEvent [
>         a :resignation;
>         :date "1974-08-09"^^xsd:dateTime.
>     ].
>
> _:item3 a :fictitiousCouple;
>     :member :MickeyMouse;
>     :member :MinnieMouse;
>     :type :Love;
>     :inventedBy :WaltDisney.
>
> You may feel safer with n-ary relationships, i.e. with the objectification of relationships into abstract entities, but another way to express this concept is as "reification of triples into blank nodes" which seems to me exactly what rdf-star is about.
>
> We have rdf-star. Let's use it.
> > Ciao > > Fabio > > > > > So I am still not convinced that triples are the right level of granularity for systematically attaching contextual metadata. Following Pat, I prefer to keep rdf-statements dead-simple (1), and model more complex things (like anthony-statements) with a bunch of triples. > > > > pa > > > > (1) even if, arguably, RDF-star makes them a little more complex that they originally were. > > > > On 13/01/2022 03:51, Anthony Moretti wrote: > >> Earlier I wrote: > >> the temporal validity of any statement is implicitly lower-bounded by the existence of the things that it talks about > >> > >> I wouldn't mind some feedback on this, but I think the temporal validity of every statement has an implicit upper bound too: > >> > >> Implicit lower bound: Existence of the things being described. > >> Implicit upper bound: Stated time of assertion, otherwise the present. > >> > >> If that's correct, I can use it to demonstrate optional time and space positions: > >> > >> It's 2010, and Pierre-Antoine sends me a graph. He puts a timestamp on his graph by upper-bounding the temporal validity: > >> > >> { > >> :BarackObama :presidentOf :UnitedStates 2009 _, > >> :JoeBiden :vicePresidentOf :UnitedStates 2009 _, > >> :HillaryClinton :secretaryOfStateOf :UnitedStates 2009 _, > >> } > >> _ 2010 > >> > >> It's now 2022, and I'm working on my own graph: > >> > >> { > >> :JoeBiden :presidentOf :UnitedStates 2021 _, > >> :KamalaHarris :vicePresidentOf :UnitedStates 2021 _, > >> :AntonyBlinken :secretaryOfStateOf :UnitedStates 2021 _, > >> } > >> > >> I trust Pierre-Antoine and remember that he sent me a graph a long time ago. I do the laziest thing possible and import it unmodified as a compound statement. 
The information is incomplete but the OWA means everything is ok, and the graph is still valid: > >> > >> { > >> { > >> :BarackObama :presidentOf :UnitedStates 2009 _, > >> :JoeBiden :vicePresidentOf :UnitedStates 2009 _, > >> :HillaryClinton :secretaryOfStateOf :UnitedStates 2009 _, > >> } > >> _ 2010, > >> :JoeBiden :presidentOf :UnitedStates 2021 _, > >> :KamalaHarris :vicePresidentOf :UnitedStates 2021 _, > >> :AntonyBlinken :secretaryOfStateOf :UnitedStates 2021 _, > >> } > >> > >> I do automated flattening of the graph. The information is incomplete, but the graph is still valid: > >> > >> { > >> :BarackObama :presidentOf :UnitedStates 2009 2010, > >> :JoeBiden :vicePresidentOf :UnitedStates 2009 2010, > >> :HillaryClinton :secretaryOfStateOf :UnitedStates 2009 2010, > >> :JoeBiden :presidentOf :UnitedStates 2021 _, > >> :KamalaHarris :vicePresidentOf :UnitedStates 2021 _, > >> :AntonyBlinken :secretaryOfStateOf :UnitedStates 2021 _, > >> } > >> > >> I finally get the motivation and I update Pierre Antoine's statements. The information is now up to date and the graph is still valid: > >> > >> { > >> :BarackObama :presidentOf :UnitedStates 2009 2017, > >> :JoeBiden :vicePresidentOf :UnitedStates 2009 2017, > >> :HillaryClinton :secretaryOfStateOf :UnitedStates 2009 2013, > >> :JoeBiden :presidentOf :UnitedStates 2021 _, > >> :KamalaHarris :vicePresidentOf :UnitedStates 2021 _, > >> :AntonyBlinken :secretaryOfStateOf :UnitedStates 2021 _, > >> } > >> > >> I decide to send it back to Pierre-Antoine, and I put a timestamp on my graph: > >> > >> { > >> :BarackObama :presidentOf :UnitedStates 2009 2017, > >> :JoeBiden :vicePresidentOf :UnitedStates 2009 2017, > >> :HillaryClinton :secretaryOfStateOf :UnitedStates 2009 2013, > >> :JoeBiden :presidentOf :UnitedStates 2021 _, > >> :KamalaHarris :vicePresidentOf :UnitedStates 2021 _, > >> :AntonyBlinken :secretaryOfStateOf :UnitedStates 2021 _, > >> } > >> _ 2022 > >> > >> And so it could continue. 
Spatial validity would be handled similarly. > >> > >> It's very easy to reason about temporal/spatial validity when the approach to statements is unified and optional time and space positions can be used everywhere. > >> > >> Regards > >> Anthony > >> > >> On Wed, Jan 12, 2022 at 12:48 PM Anthony Moretti <anthony.moretti@gmail.com> wrote: > >> Correction, I was a bit sloppy: > >> > >> In both cases I would leave the time and space positions blank anyway, so RDF-as-usual. > >> > >> In the second example the space position would be blank, but not the time positions. I was just trying to agree that yes the second example isn't place-dependent. > >> > >> Regards > >> Anthony > >> > >> On Wed, Jan 12, 2022 at 12:44 PM Anthony Moretti <anthony.moretti@gmail.com> wrote: > >> Hi Pierre-Antoine > >> What is not entirely clear to me is how you see the ideas below interact with RDF-star —or RDF, for that matter... > >> > >> 1) Do you want to modify the core of RDF / RDF-star, replacing their notion of statement by the one you propose here (time+place annotated, complex and/or compound)? > >> > >> 2) Or do you want to explore how your proposed notion of statement could be expressed *on top* of RDF / RDF-star, with no or minimal modification to them? > >> > >> If the answer is 2 (my favorite option, by the way), then the idea is to model anthony-statements using a set of rdf-statements (possibly extended with RDF-star). > >> > >> > >> Ideally: > >> RDF: Time and space positions. > >> RDF-Star: Simple, compound, and complex statements. > >> > >> It would be ideal to put the time and space positions at the RDF level because, as Pat and Fabio seem to agree, some triples are time/space dependent and make no sense without that information. They're not edge cases either, it might seem like that because so far there hasn't been a way to express them, but there are infinitely many just as there are infinitely many that aren't time/space constrained. 
Also, the order of assertion is important for time/space dependent triples, if anything is to be said about them, additional data or metadata, then the time/space constraints need to be asserted first, and time and space positions ensure that order of assertion. > >> > >> I think it would help the discussion a lot to a) acknowledge that the word "statement" in this discussion is ambiguous, and b) to be as explicit as possible about which kind we are talking about. > >> > >> I'm using the word "statement" as a direct replacement for "sentence", so maybe "sentence" is a better term: > >> > >> sentence: > >> a set of words that is complete in itself, typically containing a subject and predicate, conveying a statement, question, exclamation, or command, and consisting of a main clause and sometimes one or more subordinate clauses. > >> > >> I am uncomfortable with "hard-coding" these 4 dimensions, and only them, in every possible statement. I think that the relevant dimensions depend on the relation itself (e.g., the birth-date of a person is neither time nor place dependent; the president of a country is not place dependent...). And I don't think that any list of contextual dimension can be exhaustive. > >> > >> Especially regarding certainty, there are many ways to model uncertainty (not all of them modelling it with a single value between 0 and 1, by the way). > >> > >> > >> On the first example you gave, my thoughts are that the temporal validity of any statement is implicitly lower-bounded by the existence of the things that it talks about, so technically the birth-date example is only valid after the birth date of the person, the birth date happens to be the object of the statement in this case but the idea would apply to any statement. On the second example, yes I agree its spatial validity is unbound. In both cases I would leave the time and space positions blank anyway, so RDF-as-usual. > >> > >> I'm happy to drop "certainty" for the reasons you stated. 
I've included it so far because it's another example of where order of assertion becomes important, for it to make sense it needs to be asserted after time and space but before metadata. But yes, let's drop it for now. > >> > >> And yes for sure, no list of contextual dimensions can be exhaustive, but if time and space positions are allowed it ensures those assertions are made first and the whole framework becomes scalable and easier to reason about. > >> > >> Do you have any clear definition, or at least guidelines, to decide whether a piece of information is additional data or metadata? > >> > >> My quick take would be: additional data continues the description, whereas metadata is description of the description. > >> > >> No widespread need, but logically it could continue, descriptions of descriptions of descriptions and so on: > >> > >> Simple statement > >> { Additional data } > >> {| First-order metadata |} > >> {| Second-order metadata |} > >> ... > >> > >> Fabio has a good idea with the note containing examples of good modeling. > >> > >> Regards > >> Anthony > >> > >> On Wed, Jan 12, 2022 at 8:02 AM Fabio Vitali <fabio.vitali@unibo.it> wrote: > >> Dear Pierre-Antoine, > >> > >> > 1) Do you want to modify the core of RDF / RDF-star, replacing their notion of statement by the one you propose here (time+place annotated, complex and/or compound)? > >> > >> > >> I think with you that RDFstar already provides a lot of what has been discussed so far. > >> > >> Yet Anthony explicitly mentions (and I agree with him) that RDFstar has the right approach for single triples, but is lacking in supporting the needs for complex and compound statements. Working towards some suggestions to integrate these needs would enrich and complete the RDFstar proposal. > >> > >> My preference would go towards exploiting named graphs, explicitly introducing unasserted named graphs that can then be used in RDFstar in the same way of unasserted triples. 
> >> > >> > 2) Or do you want to explore how your proposed notion of statement could be expressed *on top* of RDF / RDF-star, with no or minimal modification to them? > >> > >> I do not know Anthony's point of view on this, but I believe that it would be useful to think of a resource providing some thoughtful and general guidelines on how RDFstar's quoted and annotated triples (as well as, hopefully, the RDFstar's quoted and annotated named graphs that I envision) could help in expressing conditional, time-dependent, location-dependent, uncertain, opinionated and competing statements. > >> > >> What I am thinking is something like, say, a W3C note, on the lines of https://www.w3.org/TR/swbp-n-aryRelations/ : a document introducing no new features, but explaining and making examples on how to use the existing features in a possibly unexpected and innovative way. > >> > >> What do you think? > >> > >> Fabio > >> > >> -- > >> > >> > On 11 Jan 2022, at 15:43, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote: > >> > > >> > Hi Anthony, > >> > > >> > thanks for the summary. It's hard to catch up for those of us who went offline during the break :-) > >> > > >> > On 08/01/2022 10:40, Anthony Moretti wrote: > >> >> Hi > >> >> > >> >> I thought I'd put the ideas I shared during the longer discussion in one place to make it easier for people to read and give feedback. I love what's been achieved so far, I just want whatever is released to be the best possible thing that could be released. > >> > What is not entirely clear to me is how you see the ideas below interact with RDF-star —or RDF, for that matter... > >> > > >> > 1) Do you want to modify the core of RDF / RDF-star, replacing their notion of statement by the one you propose here (time+place annotated, complex and/or compound)? > >> > > >> > 2) Or do you want to explore how your proposed notion of statement could be expressed *on top* of RDF / RDF-star, with no or minimal modification to them? 
> >> > > >> > If the answer is 2 (my favorite option, by the way), then the idea is to model anthony-statements using a set of rdf-statements (possibly extended with RDF-star). I think it would help the discussion a lot to a) acknowledge that the word "statement" in this discussion is ambiguous, and b) to be as explicit as possible about which kind we are talking about. > >> > > >> > I also have a few comments on the two first ideas: > >> > > >> >> (...) > >> >> > >> >> Summary: > >> >> 1. Optional time, space, and certainty positions. > >> > I am uncomfortable with "hard-coding" these 4 dimensions, and only them, in every possible statement. I think that the relevant dimensions depend on the relation itself (e.g., the birth-date of a person is neither time nor place dependent; the president of a country is not place dependent...). And I don't think that any list of contextual dimension can be exhaustive. > >> > > >> > Especially regarding certainty, there are many ways to model uncertainty (not all of them modelling it with a single value between 0 and 1, by the way). On that particular topic, you might be interested in this paper: https://hal.inria.fr/hal-02167174/file/Publishing_Uncertainty_on_the_Semantic_Web__Bursting_the_LOD_bubbles__Final_Version_.pdf > >> > > >> >> 2. Separating additional data from metadata. > >> > Do you have any clear definition, or at least guidelines, to decide whether a piece of information is additional data or metadata? > >> > > >> > best > >> > > >> >> 3. Simple, compound, and complex statements. > >> >> - - - > >> >> > >> >> 1. Optional time, space, and certainty positions > >> >> > >> >> We exist in time and space, and this type of modeling could possibly be easier. 
A statement would have four optional positions, leaving the time and space positions blank would mean "unbounded", and leaving the last position blank would mean 1.0: > >> >> > >> >> Subject Relation Object T1 T2 SpatialBound Certainty > >> >> > >> >> Examples: > >> >> > >> >> :RichardB :marriedTo :LizT 1964 1974 > >> >> :RichardB :marriedTo :LizT 1975 1976 > >> >> > >> >> :BigMac :price-USD 7.30 T1 T2 :Switzerland > >> >> :BigMac :price-USD 1.62 T1 T2 :India > >> >> > >> >> If anybody has worked with temporal databases they might see an analogy with "valid times". By extension, the spatial bound could be thought of as a "valid place". > >> >> > >> >> 2. Separating additional data from metadata > >> >> > >> >> This would remove a lot of ambiguity and creates a clear order of assertion. It also seems to match the Wikidata data model. > >> >> > >> >> Example: > >> >> > >> >> :LizT :starredIn :JaneEyre > >> >> { > >> >> :role :HelenBurns, > >> >> :pay-USD 10000, > >> >> } > >> >> {| > >> >> :statedBy :Bob, > >> >> :statedIn :Wikipedia, > >> >> |} > >> >> > >> >> 3. Simple, compound, and complex statements > >> >> > >> >> Taking inspiration from linguistics, there could be four different types of statements: > >> >> > >> >> 1. Simple statement > >> >> 2. Compound statement > >> >> 3. Complex statement > >> >> 4. 
Compound-complex statement > >> >> > >> >> Simple statement (binary relationship): > >> >> S R O T1 T2 SB C > >> >> > >> >> Compound statement (graph): > >> >> { > >> >> S R O T1 T2 SB C, > >> >> S R O T1 T2 SB C, > >> >> S R O T1 T2 SB C, > >> >> } > >> >> T1 T2 SB C > >> >> > >> >> Complex statement (n-ary relationship): > >> >> S R O T1 T2 SB C > >> >> { > >> >> R O T1 T2 SB C, > >> >> R O T1 T2 SB C, > >> >> } > >> >> > >> >> Compound-complex statement (n-ary relationship): > >> >> { > >> >> S R O T1 T2 SB C, > >> >> S R O T1 T2 SB C, > >> >> S R O T1 T2 SB C, > >> >> } > >> >> T1 T2 SB C > >> >> { > >> >> R O T1 T2 SB C, > >> >> R O T1 T2 SB C, > >> >> } > >> >> > >> >> This creates consistency, and makes it easy to reason about the temporal/spatial validity of any graph. > >> >> > >> >> The existing RDF-Star "<<" and ">>" delimiters could be applied to statements of any type to say that a statement was "neutrally asserted", as I think Pat has described it before. Maybe for completeness, and based on something Pat published, other delimiters could be created that would mean "negatively asserted", something like "<!" and "!>" for example. > >> >> > >> >> The existing RDF-Star "{|" and "|}" delimiters could be applied to statements of any type to add metadata. The example in Section 2 of this email is an example of a complex statement with metadata. > >> >> > >> >> And I'm not sure, but it seems that nesting statements could be a general solution to contexts, the deepest nested statements would be in the most specific contexts. I haven't examined it properly though. > >> >> > >> >> If you've made it here thanks for reading! If you need more examples please ask and I'll do my best. I love everything done so far, I just want to bounce around these additional ideas with the hope that they're constructive. Please reply with any feedback at all, good and bad, it's all welcome! 
> >> >>
> >> >> Regards
> >> >> Anthony
Received on Friday, 14 January 2022 07:40:24 UTC