Re: RDF based messaging, negotiating, and dataset semantics from Pat Hayes on 2017-07-10 (semantic-web@w3.org from July 2017)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 10 Jul 2017 08:41:36 -0700
To: Florian Kleedorfer <florian.kleedorfer@austria.fm>
Cc: Simon.Cox@csiro.au, Graham Klyne <gk@ninebynine.org>, semantic-web@w3.org
Message-Id: <E1D01467-3BF3-453A-BB83-CCEB326F8CDF@ihmc.us>
> On Jul 10, 2017, at 1:24 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
> 
> I am sorry, I don't think I follow. First, I have to interpret the issue of mutable RDF graphs in the context of our application:
> 
> As much as I would like immutability of URI-addressed datasets in our application, we don't have it now and we may never get there.
> 
> I find the RDF dataset very useful for grouping and addressing triples. My plan was to provide a feature for partial immutability by using cryptographic hashing and signing of graphs that are defined to be immutable by the author, such that clients can always check that nothing changed. However, some graphs need to be updated to reflect dynamic state (e.g., a dataset describing a taxi offering might include its current location, updated every 30 seconds). Such graphs should not be marked as immutable.

I understand your point and your problem, but I take issue with your terminology. The term “RDF graph” is *defined* in the normative RDF specifications to be a mathematical set of RDF triples. Mathematical sets are not mutable. So when you say “mutable RDF graph”, you are uttering an oxymoron. Please call them RDF datasets or use some other terminology. 

> 
> Now from the logical point of view, those datasets are always static because logic is inherently atemporal.

No, really, it has nothing to do with points of view, and logic is not “inherently atemporal” (there is an entire field of tense logic, for example). It is RDF graphs which are inherently atemporal.

> The moment I dereference the dataset's URI, I get an RDF dataset, add it to whatever data I already have, and that's it - I get a static model with RDF graphs that are just named sets of triples. If I re-crawl the dataset and reconstruct my model, some graph may have changed and I may get a different model,

You may get a different graph. Your terminology of ‘model’ has no meaning in RDF. But you need to be very clear what exactly your URI is naming here. If it is naming an RDF graph, then its meaning is *changing* when that graph changes to a different graph. If it is naming a dynamic data structure which encodes a graph at every instant - mathematically, a function from times or states to RDF graphs - then it can be the same dynamic entity when a change occurs. I expect that the latter is what you want, following the whole cool-URI philosophy. But this dynamic entity is not an RDF graph.
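
To make the distinction concrete, here is a minimal sketch of the second reading, in which the URI names a function from times to RDF graphs. It is only an illustration under stated assumptions: the class name GraphResource, its methods, the integer timestamps and the ex: identifiers are all hypothetical, and triples are modeled as plain tuples of strings.

```python
from bisect import bisect_right


class GraphResource:
    """Hypothetical dynamic entity: a function from times to RDF graphs."""

    def __init__(self):
        self._times = []   # sorted timestamps at which the state changed
        self._graphs = []  # graph in force from the matching timestamp onward

    def set_state(self, time, triples):
        # Assumption: states are recorded in strictly increasing time order.
        if self._times and time <= self._times[-1]:
            raise ValueError("states must be recorded in increasing time order")
        self._times.append(time)
        # Each state is an immutable set of triples, i.e. an RDF graph.
        self._graphs.append(frozenset(triples))

    def graph_at(self, time):
        # Return the graph in force at `time`; the empty graph before the first state.
        i = bisect_right(self._times, time)
        return self._graphs[i - 1] if i else frozenset()


# One resource, two different graphs at two different times.
taxi = GraphResource()
taxi.set_state(0, {("ex:taxi1", "ex:location", "ex:stationA")})
taxi.set_state(30, {("ex:taxi1", "ex:location", "ex:stationB")})

g0 = taxi.graph_at(10)
g1 = taxi.graph_at(45)
```

Here `taxi` stays the same entity throughout, while `g0` and `g1` are two distinct, unchanging sets of triples.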

> but I don't see a problem from the logical point of view.
> 
> If it was a problem, it seems to me, RDF databases were wrong to support SPARQL Update, because it allows changes to RDF graphs - but I never read anywhere that that is problematic.

Well, it is probably best if I refrain from comment on the design of SPARQL :-)

Pat

> 
> Is there anything I am missing?
> 
> Best
> 
> Florian
> 
> Am 09.07.2017 um 22:34 schrieb Pat Hayes:
>> 
>>> On Jul 9, 2017, at 1:24 AM, Simon.Cox@csiro.au wrote:
>>> 
>>> >> URIs which denote RDF graphs
>>> 
>>> Surely the issue is whether the URI denotes a *static* rdf graph.
>> 
>> There is no such thing as a dynamic RDF graph. An RDF graph is defined to be a (mathematical) *set* of RDF triples, so is ‘static’ by definition.
>>> 
>>> A URI denotes a _resource_.
>> 
>> True, but meaningless, since the Web usage of ‘resource’ has it meaning nothing more definite than ‘thing’.
>> 
>>> The question about whether that resource is static or not is a separate contract. In general I would expect that a well-behaved system is likely to have a distinct URI for static and changeable content, but this issue is not inherent in either URIs or RDF graphs.
>> 
>> Not in URIs, but it is in RDF graphs. Of course, a URI may denote some dynamic, changeable data structure containing RDF triples, but such a thing is not an RDF graph. (It might be modeled as a function from times to RDF graphs, for example.)
>> 
>> Best
>> 
>> Pat Hayes
>> 
>>> 
>>> ------------------------------------------------------------------------
>>> *From:* Graham Klyne <gk@ninebynine.org>
>>> *Sent:* Sunday, 9 July 2017 7:51:55 AM
>>> *To:* semantic-web@w3.org
>>> *Subject:*Re: RDF based messaging, negotiating, and dataset semantics
>>> Concerning RDF representation of updates to data resources (e.g. RDF graph
>>> containers and more)...
>>> 
>>> I did some exploratory RDF modelling a couple of years ago using a combination
>>> of PROV [1] and (non-standard) duri: [2] URIs to capture something like this [3]
>>> (the actual modelling is very rough, just intended to explore an idea).
>>> 
>>> The key ideas here were separate URIs for static versions and dynamic resource
>>> instances (like W3C specs) and use of PROV terms (including
>>> prov:specialization_of) to tie them all together.
>>> 
>>> #g
>>> --
>>> 
>>> [1]http://www.w3.org/TR/prov-o/
>>> 
>>> [2]https://tools.ietf.org/html/draft-masinter-dated-uri
>>> 
>>> [3]
>>> http://demo.annalist.net/annalist/c/artivity_example/d/Entity/20150602T143220/-
>>> the "data" button shows underlying JSON-LD
>>> 
>>> 
>>> 
>>> On 09/07/2017 05:54, Pat Hayes wrote:
>>> >
>>> >> On Jul 7, 2017, at 3:11 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
>>> >>
>>> >> I'm very concerned about your warning of problems and confusions caused by treating delete/undelete actions as RDF properties. Which problems are we inviting?
>>> >>
>>> >> As I see it, after every message added to the dataset there is an unambiguous way to compute the set of triples that are asserted by the actors (i.e., the triples contained in all messages' payload graphs except for the ones that were later referred to as 'deleted').
>>> >
>>> > And that set is an RDF graph. OK.
>>> >
>>> >> We can always record the URI of the last message together with any result we derive from that set of triples (e.g. results of SPARQL queries or SHACL rule evaluations on the triples), and in this way the secondary results are versioned. As long as we are careful to recompute secondary results when a new message is sent, we should be fine, no?
>>> >
>>> > I guess so, if I follow you. The potential muddle that I was concerned about was one that cropped up repeatedly in the RDF 1.1 WG discussions, a confusion between an RDF graph, defined as a set of triples (on the one hand) and a dynamic entity which retained its identity while triples were added to or deleted from it (on the other hand), what one might call an RDF web resource (other names have been suggested). As long as you are clear, as you seem indeed to be, that URIs which denote RDF graphs (the first case) cannot change their referents dynamically, then you should be OK, yes.
>>> >
>>> >> Am 06.07.2017 um 20:36 schrieb Pat Hayes:
>>> >>> There is a conceptual bug in this whole discussion: delete (and undelete) are not RDF properties. Treating them as though they were is going to cause a host of issues and confusions.
>>> >> Do you mean confusions in this email thread because of imprecise language (which I will admit to), or do you mean that, if implemented in the way described, this system will cause bugs or other unintended effects?
>>> >
>>> > I guess I mean the former, but I worry about the latter :-)
>>> >
>>> >>> In general, operations on RDF graphs are not RDF properties. Properties simply record facts, they do not ‘do’ anything.
>>> >> Definitely. Also, there is no notion of sequence or time in RDF. We are talking about a way to interpret an RDF dataset according to special rules.
>>> >
>>> > OK, fair enough. I agree that such special interpretations may (perhaps should) go beyond the bare semantic requirements of the RDF spec.
>>> >
>>> >>> Think of the properties as simply being a record of what changes were made, and then there is no ambiguity: restoring a message that was previously deleted, and deleting the record of its first deletion, are different and quite distinct changes (indeed, changes to different graphs, in one design) and should each be recorded separately and unambiguously.
>>> >> Agreed - almost: an undelete and a delete of a delete express the same intention of the user, only using different tools offered by the system.
>>> >
>>> > I'm not sure I agree. Even conceptually, it seems to me that a deletion of a deletion expresses the idea that the deletion itself was somehow an error or something to be erased, whereas an undelete – which could be called a restoration – simply re-asserts the formerly deleted object without denying that it was once deleted. I would expect that a historical trace would give different results in those two cases. I would also expect that anyone concerned with security or legal issues might want to distinguish them.
>>> >
>>> >> If users can only delete a delete to restore a message, that's what they will do; if they can choose between that and restoring the message directly, I get two different and distinct ways to change the data, which should be recorded unambiguously (that's the part where I agree). The question is: should we allow users to choose?
>>> >
>>> > Good question, but not itself relevant to RDF :-) My only point here is that whatever you decide, you should bear in mind that the RDF description of the time-sequence of events should not itself be time-dependent. (I hope that makes sense :-)
>>> >
>>> >> I am actually inclined to change the design so it does not support restoring messages and see if that is sufficient for our purposes. Makes many things easier.
>>> >
>>> > You might (?) allow a notation to the effect that an added item is a copy or repetition of an earlier one, so that (undelete foo) becomes a conjunction of simply (re)asserting foo with the added notation that foo is a copy of the earlier version. But perhaps this would not be helpful, it was just a quick thought.
>>> >
>>> > Best wishes
>>> >
>>> > Pat Hayes
>>> >
>>> >>
>>> >>>
>>> >>> Pat Hayes
>>> >>>
>>> >>>
>>> >>>> On Jul 6, 2017, at 3:11 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
>>> >>>>
>>> >>>> Kevin Singer pointed out to me that there is a downside to the 'ex:undeletes' property: it introduces unnecessary complexity and ambiguity. To undelete a deleted message m1, one could delete the message m2 that deleted m1, or one could explicitly undelete m1. This ambiguity may lead to more complex implementations.
>>> >>>>
>>> >>>> The advantage of the 'ex:undeletes' property is that one can easily determine which URI to use for the object: just the URI of the message to be undeleted. In order to undelete a deleted message m1 without the 'ex:undeletes' property, one has to find the last 'ex:deletes' statement in a possibly long chain of 'ex:deletes' statements, the first of which deletes m1.
>>> >>>>
>>> >>>> So I currently see three options:
>>> >>>> 1. Leave the design as it is
>>> >>>> 2. Remove the 'ex:undeletes' property
>>> >>>> 3. Fix the ambiguity problem of the suggested design by disallowing the deletion of delete statements.
>>> >>>>
>>> >>>> Any thoughts on which to prefer?
>>> >>>>
>>> >>>> Am 04.07.2017 um 20:47 schrieb Florian Kleedorfer:
>>> >>>>> Thanks for all your contributions! From what I can gather there does not seem to be an existing approach for what I need, so here's an informal attempt:
>>> >>>>>
>>> >>>>> For editing of the message history, I think we only need two properties, one for deleting a previously added named graph from the dataset (e.g., ex:msg1 ex:deletes ex:msg2), and one for undeleting a named graph (e.g., ex:msg3 ex:undeletes ex:msg2).
>>> >>>>>
>>> >>>>> For determining the meaning of the dataset, one would iterate over the named graphs in reverse chronological order and build a set 'del' of named graph URIs that are to be interpreted as deleted. For each message, it is only processed if its URI is not in del. Whenever an 'ex:deletes' triple is encountered, the URI in the object of the triple is added to del. Whenever an 'ex:undeletes' triple is encountered, the URI in the object is removed from del. Both operations are only executed when the sender of the deletes/undeletes message is also the sender of the message to be deleted/undeleted. Each processed message (named graph) is added to the result dataset.
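
The reverse-chronological filtering described in the paragraph above could be sketched roughly as follows. This is an illustrative sketch only, not the actual implementation: the Message class, the function name effective_messages and the sample senders are hypothetical, and triples are plain tuples of strings. One assumption goes beyond the text: because newer messages are visited first, the sketch lets the newest processed statement about a target decide its status, so that a later ex:undeletes is not overridden by the older ex:deletes it is meant to reverse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    uri: str
    sender: str
    triples: tuple = ()  # (subject, predicate, object) triples of the payload


def effective_messages(history):
    """Return URIs of messages that remain in force.

    `history` lists messages in chronological order (oldest first).
    """
    sender_of = {m.uri: m.sender for m in history}
    deleted = set()  # the 'del' set from the description
    decided = set()  # assumption: targets whose status a newer message already fixed
    kept = []
    for msg in reversed(history):  # newest message first
        if msg.uri in deleted:
            continue  # a deleted message is not processed at all
        for _subject, predicate, target in msg.triples:
            if predicate not in ("ex:deletes", "ex:undeletes"):
                continue
            # Only the sender of the target message may delete or undelete it,
            # and an older statement cannot override a newer one.
            if sender_of.get(target) != msg.sender or target in decided:
                continue
            decided.add(target)
            if predicate == "ex:deletes":
                deleted.add(target)
            # An ex:undeletes just marks the target as decided, which keeps it.
        kept.append(msg.uri)
    kept.reverse()  # report the result in chronological order
    return kept


history = [
    Message("ex:m1", "alice", (("ex:m1", "ex:price", "10"),)),
    Message("ex:m2", "alice", (("ex:m2", "ex:deletes", "ex:m1"),)),
    Message("ex:m3", "alice", (("ex:m3", "ex:deletes", "ex:m2"),)),
    Message("ex:m4", "bob", (("ex:m4", "ex:deletes", "ex:m1"),)),
]
result = effective_messages(history)
```

In this sample, m3 deletes the delete m2, so m1 is in force again, and bob's attempt to delete alice's m1 is ignored because the senders differ.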
>>> >>>>>
>>> >>>>> The case of negotiation, I think, requires two additional properties, 'ex:proposes' (range: Message) and 'ex:agreesWith' (range: Message). 'ex:proposes' indicates that the 'proposed' message is not just any statement, but one that the sender wants the recipient's agreement on. 'ex:agreesWith' indicates that the sender agrees with the content of another message, the default interpretation being that nobody agrees with anything.
>>> >>>>>
>>> >>>>> Such messages can also be deleted as described above, with a later 'deletes' message - allowing for dynamically proposing, accepting and un-accepting graphs (which may be suggestions for clauses in a contract, for example).
>>> >>>>>
>>> >>>>> So, when the full conversation dataset has been filtered based on deletes/undeletes information as explained above, one can decide whether the agents agree. When all messages that have been 'proposed' by one agent are 'agreed' to by the other, the participants can be said to agree. If there is at least one proposed message that is not agreed to, the participants disagree. Otherwise (if no messages are proposed), there is no agreement status.
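
The three-way agreement status described above might be computed along these lines. Again this is a hedged sketch: the function name agreement_status, the input shape (sender, predicate, target message URI) and the return values "agree", "disagree" and None are assumptions made for illustration, and the input is taken to be the conversation after the deletes/undeletes filtering has been applied.

```python
def agreement_status(statements):
    """Classify a filtered conversation as 'agree', 'disagree' or None.

    `statements` is an iterable of (sender, predicate, target_uri) tuples.
    """
    proposed = {}  # target message -> agents who proposed it
    agreed = {}    # target message -> agents who agreed with it
    for sender, predicate, target in statements:
        if predicate == "ex:proposes":
            proposed.setdefault(target, set()).add(sender)
        elif predicate == "ex:agreesWith":
            agreed.setdefault(target, set()).add(sender)

    if not proposed:
        return None  # nothing proposed: no agreement status

    for target, proposers in proposed.items():
        for proposer in proposers:
            # A proposal counts as agreed only if some *other* agent agrees with it.
            if not (agreed.get(target, set()) - {proposer}):
                return "disagree"
    return "agree"


statements = [
    ("alice", "ex:proposes", "ex:m1"),
    ("bob", "ex:agreesWith", "ex:m1"),
    ("bob", "ex:proposes", "ex:m2"),
]
status_open = agreement_status(statements)
status_done = agreement_status(statements + [("alice", "ex:agreesWith", "ex:m2")])
```

With bob's proposal ex:m2 still unanswered the status is "disagree"; once alice agrees with it, every proposal has been accepted by the other party.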
>>> >>>>>
>>> >>>>> In case of agreement, the dataset can be filtered easily to select only the graphs that are part of the agreement. In case of disagreement, it should be easily possible to determine agreed-upon graphs and graphs that are proposed by each agent but not agreed to by the other.
>>> >>>>>
>>> >>>>> Again, comments welcome!
>>> >>>>>
>>> >>>>> Cheers,
>>> >>>>> Florian
>>> >>>>>
>>> >>>>> Am 03.07.2017 um 16:17 schrieb Florian Kleedorfer:
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> Consider a communication channel between two agents who exchange messages in the form of named RDF Graphs. The channel allows for adding new messages but not for removing any data. The history of the channel is unambiguous and always accessible to both agents. This construct can be seen as an RDF dataset that both agents have read/write but no replace or delete access to. Its use is that of a negotiation device that allows for setting up terms of a contract.
>>> >>>>>>
>>> >>>>>> The way the system is built, the messages consist of any number of 'content' RDF graphs (the message's payload), 'envelope' graphs with address information (sender, recipient, etc.), and graphs containing cryptographic signatures.
>>> >>>>>>
>>> >>>>>> What's needed is an approach that allows these agents to make assertions about earlier messages (their content graphs) in the conversation dataset so as to modify the meaning of the dataset.
>>> >>>>>>
>>> >>>>>> The simplest example I can think of is that one agent might realize they made a typing error in an earlier message and want to correct the information by sending a message stating that the earlier graph should be disregarded and another message containing the corrected information.
>>> >>>>>>
>>> >>>>>> Similar situations occur when negotiating aspects of the agreement, e.g. price.
>>> >>>>>>
>>> >>>>>> For both agents, at any point in the conversation, the meaning of the conversation dataset must always be unambiguous and equal, and it must be clear to both agents if they agree (both hold the same graphs true) or if there is a conflict.
>>> >>>>>>
>>> >>>>>> I am contemplating defining a vocabulary that allows for making such statements and defining dataset semantics that take these statements into account, unless I find a suitable existing approach. I found the SWP (Semantic Web Publishing) vocabulary, which is intended to do something similar, but does not seem to have a negative property for rejecting a graph, so I'm not convinced. Any ideas, pointers, or follow-up discussions are greatly appreciated!
>>> >>>>>>
>>> >>>>>> Thanks,
>>> >>>>>> Florian
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >
>>> >
>>> >
>>> >
>> 
> 
>
Received on Monday, 10 July 2017 15:42:20 UTC