Re: RDF based messaging, negotiating, and dataset semantics from Graham Klyne on 2017-07-09 (semantic-web@w3.org from July 2017)

From: Graham Klyne <gk@ninebynine.org>
Date: Sun, 09 Jul 2017 08:51:55 +0100
To: semantic-web@w3.org
Message-ID: <5961E09B.1000705@ninebynine.org>
Concerning RDF representation of updates to data resources (e.g. RDF graph 
containers and more)...

I did some exploratory RDF modelling a couple of years ago using a combination 
of PROV [1] and (non-standard) duri: [2] URIs to capture something like this [3] 
(the actual modelling is very rough, just intended to explore an idea).

The key ideas here were separate URIs for static versions and dynamic resource 
instances (like W3C specs) and use of PROV terms (including 
prov:specializationOf) to tie them all together.

#g
--

[1] http://www.w3.org/TR/prov-o/

[2] https://tools.ietf.org/html/draft-masinter-dated-uri

[3] 
http://demo.annalist.net/annalist/c/artivity_example/d/Entity/20150602T143220/ - 
the "data" button shows underlying JSON-LD



On 09/07/2017 05:54, Pat Hayes wrote:
>
>> On Jul 7, 2017, at 3:11 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
>>
>> I'm very concerned about your warning of problems and confusions caused by treating delete/undelete actions as RDF properties. Which problems are we inviting?
>>
>> As I see it, after every message added to the dataset there is an unambiguous way to compute the set of triples that are asserted by the actors (i.e., the triples contained in all messages' payload graphs except for the ones that were later referred to as 'deleted').
>
> And that set is an RDF graph. OK.
>
>> We can always record the URI of the last message together with any result we derive from that set of triples (e.g. results of SPARQL queries or SHACL rule evaluations on the triples ) and in this way, the secondary results are versioned. As long as we are careful to recompute secondary results when a new message is sent, we should be fine, no?
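A minimal sketch of the versioning idea in the paragraph above, in Python: a derived result is stored together with the URI of the last message it was computed from and is recomputed only when that URI changes. The names VersionedResult, compute and history are hypothetical, and the sketch assumes an append-only channel, so that the URI of the last message identifies the whole history.

```python
_UNSET = object()  # marks "nothing computed yet"


class VersionedResult:
    """Cache a secondary result together with the URI of the last message."""

    def __init__(self, compute):
        self._compute = compute   # function: history -> derived result
        self.version = _UNSET     # URI of the last message seen so far
        self.value = None

    def get(self, history):
        """history: list of (message_uri, triples), oldest first."""
        latest = history[-1][0] if history else None
        if latest != self.version:
            # A new message arrived: recompute and record the new version.
            self.value = self._compute(history)
            self.version = latest
        return self.version, self.value
```

Calling get() twice on an unchanged history returns the cached pair; appending a message triggers exactly one recomputation.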
>
> I guess so, if I follow you. The potential muddle that I was concerned about was one that cropped up repeatedly in the RDF 1.1 WG discussions, a confusion between an RDF graph, defined as a set of triples (on the one hand) and a dynamic entity which retained its identity while triples were added to or deleted from it (on the other hand), what one might call an RDF web resource (other names have been suggested). As long as you are clear, as you seem indeed to be, that URIs which denote RDF graphs (the first case) cannot change their referents dynamically, then you should be OK, yes.
>
>> Am 06.07.2017 um 20:36 schrieb Pat Hayes:
>>> There is a conceptual bug in this whole discussion: delete (and undelete) are not RDF properties. Treating them as though they were is going to cause a host of issues and confusions.
>> Do you mean confusions in this email thread because of imprecise language (which I will admit to), or do you mean that, if implemented in the way described, this system will cause bugs or other unintended effects?
>
> I guess I mean the former, but I worry about the latter :-)
>
>>> In general, operations on RDF graphs are not RDF properties. Properties simply record facts, they do not ‘do’ anything.
>> Definitely. Also, there is no notion of sequence or time in RDF. We are talking about a way to interpret an RDF dataset according to special rules.
>
> OK, fair enough. I agree that such special interpretations may (perhaps should) go beyond the bare semantic requirements of the RDF spec.
>
>>> Think of the properties as simply being a record of what changes were made, and then there is no ambiguity: restoring a message that was previously deleted, and deleting the record of its first deletion, are different and quite distinct changes (indeed, changes to different graphs, in one design) and should each be recorded separately and unambiguously.
>> Agreed - almost: an undelete and a delete of a delete express the same intention of the user, only using different tools offered by the system.
>
> I'm not sure I agree. Even conceptually, it seems to me that a deletion of a deletion expresses the idea that the deletion itself was somehow an error or something to be erased, whereas an undelete – which could be called a restoration – simply re-asserts the formerly deleted object without denying that it was once deleted. I would expect that a historical trace would give different results in those two cases. I would also expect that anyone concerned with security or legal issues might want to distinguish them.
>
>> If users can only delete a delete to restore a message, that's what they will do. If they can choose between that and restoring the message directly, I get the two different and distinct ways to change the data, which should be recorded unambiguously (that's the part where I agree). The question is: should we allow users to choose?
>
> Good question, but not itself relevant to RDF :-) My only point here is that whatever you decide, you should bear in mind that the RDF description of the time-sequence of events should not itself be time-dependent. (I hope that makes sense :-)
>
>> I am actually inclined to change the design so it does not support restoring messages and see if that is sufficient for our purposes. Makes many things easier.
>
> You might (?) allow a notation to the effect that an added item is a copy or repetition of an earlier one, so that (undelete foo) becomes a conjunction of simply (re)asserting foo with the added notation that foo is a copy of the earlier version. But perhaps this would not be helpful, it was just a quick thought.
>
> Best wishes
>
> Pat Hayes
>
>>
>>>
>>> Pat Hayes
>>>
>>>
>>>> On Jul 6, 2017, at 3:11 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
>>>>
>>>> Kevin Singer pointed out to me that there is a downside to the 'ex:undeletes' property: it introduces unnecessary complexity and ambiguity. To undelete a deleted message m1, one could delete the message m2 that deleted m1, or one could explicitly undelete m1. This ambiguity may lead to more complex implementations.
>>>>
>>>> The advantage of the 'ex:undeletes' property is that one can easily determine which URI to use for the object: just the URI of the message to be undeleted. In order to undelete a deleted message m1 without the 'ex:undeletes' property, one has to find the last 'ex:deletes' statement in a possibly long chain of 'ex:deletes' statements, the first of which deletes m1.
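A minimal sketch of the chain lookup described above, in Python, for a design without 'ex:undeletes'. The function name delete_to_revoke and the deleted_by mapping are hypothetical; the sketch assumes each message is the object of at most one 'ex:deletes' statement.

```python
def delete_to_revoke(target, deleted_by):
    """Return the URI of the delete message that must itself be deleted
    to restore `target`, or None if `target` is not currently deleted.

    deleted_by maps a message URI to the URI of the message that deleted it.
    """
    chain = []
    seen = {target}
    current = target
    while current in deleted_by:
        current = deleted_by[current]
        if current in seen:
            raise ValueError("cycle in delete chain")
        seen.add(current)
        chain.append(current)
    # An odd number of deletes in the chain leaves the target deleted;
    # an even number (including zero) leaves it in effect.
    if len(chain) % 2 == 1:
        return chain[-1]
    return None
```

With m2 deleting m1, m3 deleting m2 and m4 deleting m3, m1 is currently deleted and the message to delete is m4, the end of the chain.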
>>>>
>>>> So I currently see three options:
>>>> 1. Leave the design as it is
>>>> 2. Remove the 'ex:undeletes' property
>>>> 3. Fix the ambiguity problem of the suggested design by disallowing the deletion of delete statements.
>>>>
>>>> Any thoughts on which to prefer?
>>>>
>>>> Am 04.07.2017 um 20:47 schrieb Florian Kleedorfer:
>>>>> Thanks for all your contributions! From what I can gather there does not seem to be an existing approach for what I need, so here's an informal attempt:
>>>>>
>>>>> For editing of the message history, I think we only need two properties, one for deleting a previously added named graph from the dataset (e.g., ex:msg1 ex:deletes ex:msg2), and one for undeleting a named graph (e.g., ex:msg3 ex:undeletes ex:msg2).
>>>>>
>>>>> To determine the meaning of the dataset, one would iterate over the named graphs in reverse chronological order and build a set 'del' of named graph URIs that are to be interpreted as deleted. A message is processed only if its URI is not in del. Whenever an 'ex:deletes' triple is encountered, the URI in the object of the triple is added to del. Whenever an 'ex:undeletes' triple is encountered, the URI in the object is removed from del. Both operations are only executed when the sender of the deletes/undeletes message is also the sender of the message to be deleted/undeleted. Each processed message (named graph) is added to the result dataset.
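A minimal sketch of the filtering pass described above, in Python. The Message class, its field names and filter_messages are illustrative only and not part of any vocabulary in this thread. One assumption goes beyond the text: because the pass runs newest-first, an 'ex:undeletes' is seen before the older 'ex:deletes' it is meant to cancel, so the sketch lets the newest verdict about a message win instead of literally removing the URI from del.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    uri: str
    sender: str
    deletes: list = field(default_factory=list)    # objects of ex:deletes
    undeletes: list = field(default_factory=list)  # objects of ex:undeletes


def filter_messages(history):
    """Return the URIs of the messages still in effect, oldest first.

    `history` is a list of Message objects in chronological order.
    """
    sender_of = {m.uri: m.sender for m in history}
    deleted = set()   # the 'del' set from the description
    decided = set()   # targets whose newest verdict was already seen (assumption)
    kept = []

    def own_target(msg, target):
        # Only the sender of the target message may delete or undelete it.
        return sender_of.get(target) == msg.sender

    for msg in reversed(history):      # newest first
        if msg.uri in deleted:
            continue                   # a deleted message has no effect
        for target in msg.undeletes:
            if own_target(msg, target) and target not in decided:
                decided.add(target)    # newest verdict: in effect
        for target in msg.deletes:
            if own_target(msg, target) and target not in decided:
                decided.add(target)
                deleted.add(target)    # newest verdict: deleted
        kept.append(msg.uri)

    kept.reverse()                     # back to chronological order
    return kept
```

Deleting a delete message falls out of the reverse traversal: the deleted delete is skipped before its own 'ex:deletes' triple is read, so its target stays in the result.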
>>>>>
>>>>> The case of negotiation, I think, requires two additional properties, 'ex:proposes' (range: Message) and 'ex:agreesWith' (range: Message). 'ex:proposes' indicates that the 'proposed' message is not just any statement, but one that the sender wants the recipient's agreement on. 'ex:agreesWith' indicates that the sender agrees with the content of another message, the default interpretation being that nobody agrees with anything.
>>>>>
>>>>> Such messages can also be deleted as described above, with a later 'deletes' message - allowing for dynamically proposing, accepting and un-accepting graphs (which may be suggestions for clauses in a contract, for example).
>>>>>
>>>>> So, when the full conversation dataset has been filtered based on deletes/undeletes information as explained above, one can decide whether the agents agree. When all messages that have been 'proposed' by one agent are 'agreed' to by the other, the participants can be said to agree. If there is at least one proposed message that is not agreed to, the participants disagree. Otherwise (if no messages are proposed), there is no agreement status.
>>>>>
>>>>> In case of agreement, the dataset can be filtered easily to select only the graphs that are part of the agreement. In case of disagreement, it should be easily possible to determine agreed-upon graphs and graphs that are proposed by each agent but not agreed to by the other.
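A minimal sketch of the agreement check described in the two paragraphs above, in Python. It takes the messages that remain after delete filtering. The Message class, its fields and the agreement function are illustrative names; the sketch assumes each proposed message has a single proposer (the first agent to propose it) and counts it as agreed once any other agent states 'ex:agreesWith' for it.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    uri: str
    sender: str
    proposes: list = field(default_factory=list)     # objects of ex:proposes
    agrees_with: list = field(default_factory=list)  # objects of ex:agreesWith


def agreement(messages):
    """Return (status, agreed, pending) for a filtered conversation.

    status  -- True (agree), False (disagree), None (nothing proposed)
    agreed  -- set of proposed message URIs the other agent agreed to
    pending -- dict: proposer -> set of proposals not yet agreed to
    """
    proposer = {}    # proposed message URI -> agent who proposed it first
    agreed_by = {}   # message URI -> set of agents who agreed with it
    for m in messages:
        for target in m.proposes:
            proposer.setdefault(target, m.sender)
        for target in m.agrees_with:
            agreed_by.setdefault(target, set()).add(m.sender)

    agreed, pending = set(), {}
    for target, who in proposer.items():
        # Agreement must come from someone other than the proposer.
        if agreed_by.get(target, set()) - {who}:
            agreed.add(target)
        else:
            pending.setdefault(who, set()).add(target)

    status = None if not proposer else not pending
    return status, agreed, pending
```

The agreed set selects the graphs that are part of the agreement, and pending lists, per agent, the proposals the other side has not agreed to.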
>>>>>
>>>>> Again, comments welcome!
>>>>>
>>>>> Cheers,
>>>>> Florian
>>>>>
>>>>> Am 03.07.2017 um 16:17 schrieb Florian Kleedorfer:
>>>>>> Hi,
>>>>>>
>>>>>> Consider a communication channel between two agents who exchange messages in the form of named RDF Graphs. The channel allows for adding new messages but not for removing any data. The history of the channel is unambiguous and always accessible to both agents. This construct can be seen as an RDF dataset that both agents have read/write but no replace or delete access to. Its use is that of a negotiation device that allows for setting up terms of a contract.
>>>>>>
>>>>>> The way the system is built, the messages consist of any number of 'content' RDF graphs (the message's payload), 'envelope' graphs with address information (sender, recipient, etc.), and graphs containing cryptographic signatures.
>>>>>>
>>>>>> What's needed is an approach that allows these agents to make assertions about earlier messages (their content graphs) in the conversation dataset so as to modify the meaning of the dataset.
>>>>>>
>>>>>> The simplest example I can think of is that one agent might realize they made a typing error in an earlier message and want to correct the information by sending a message stating that the earlier graph should be disregarded and another message containing the corrected information.
>>>>>>
>>>>>> Similar situations occur when negotiating aspects of the agreement, e.g. price.
>>>>>>
>>>>>> For both agents, at any point in the conversation, the meaning of the conversation dataset must always be unambiguous and equal, and it must be clear to both agents if they agree (both hold the same graphs true) or if there is a conflict.
>>>>>>
>>>>>> I am contemplating defining a vocabulary that allows for making such statements and defining dataset semantics that take these statements into account, unless I find a suitable existing approach. I found the SWP (Semantic Web Publishing) vocabulary, which is intended to do something similar, but it does not seem to have a negative property for rejecting a graph, so I'm not convinced. Any ideas, pointers, or follow-up discussions are greatly appreciated!
>>>>>>
>>>>>> Thanks,
>>>>>> Florian
Received on Sunday, 9 July 2017 07:52:32 UTC