Re: RDF based messaging, negotiating, and dataset semantics from Krzysztof Janowicz on 2017-07-10 (semantic-web@w3.org from July 2017)

From: Krzysztof Janowicz <janowicz@ucsb.edu>
Date: Mon, 10 Jul 2017 10:41:07 +0200
To: Florian Kleedorfer <florian.kleedorfer@austria.fm>, Pat Hayes <phayes@ihmc.us>, Simon.Cox@csiro.au
Cc: Graham Klyne <gk@ninebynine.org>, semantic-web@w3.org
Message-ID: <54799973-e0ee-2458-9464-ec61419aadec@ucsb.edu>
On 07/10/2017 10:24 AM, Florian Kleedorfer wrote:
> I am sorry, I don't think I follow. First, I have to interpret the 
> issue of mutable RDF graphs in the context of our application:
>
> As much as I would like immutability of URI-addressed datasets in our 
> application, we don't have it now and we may never get there.
>
> I find the RDF dataset very useful for grouping and addressing 
> triples. My plan was to provide a feature for partial immutability by 
> using cryptographic hashing and signing of graphs that are defined to 
> be immutable by the author, such that clients can always check that 
> nothing changed. However, some graphs need to be updated to reflect 
> dynamic state (e.g., a dataset describing a taxi offering might 
> include its current location, updated every 30 seconds). Such graphs 
> should not be marked as immutable.

In fact, this also happens at the intersection of RDF/Linked Data and 
RESTful applications such as (semantically-enabled) sensor observation 
services.


Best,
Krzysztof


>
> Now from the logical point of view, those datasets are always static 
> because logic is inherently atemporal. The moment I dereference the 
> dataset's URI, I get an RDF dataset, add it to whatever data I already 
> have, and that's it - I get a static model with RDF graphs that are 
> just named sets of triples. If I re-crawl the dataset and reconstruct 
> my model, some graph may have changed and I may get a different model, 
> but I don't see a problem from the logical point of view.
>
> If it was a problem, it seems to me, RDF databases were wrong to 
> support SPARQL Update, because it allows changes to RDF graphs - but I 
> never read anywhere that that is problematic.
>
> Is there anything I am missing?
>
> Best
>
> Florian
>
> On 09.07.2017 at 22:34, Pat Hayes wrote:
>>
>>> On Jul 9, 2017, at 1:24 AM, Simon.Cox@csiro.au wrote:
>>>
>>> >> URIs which denote RDF graphs
>>>
>>> Surely the issue is whether the URI denotes a *static* rdf graph.
>>
>> There is no such thing as a dynamic RDF graph. An RDF graph is 
>> defined to be a (mathematical) *set* of RDF triples, so is ‘static’ 
>> by definition.
>>>
>>> A URI denotes a _resource_.
>>
>> True, but meaningless, since the Web usage of ‘resource’ has it 
>> meaning nothing more definite than ‘thing’.
>>
>>> The question about whether that resource is static or not is a 
>>> separate contract. In general I would expect that a well behaved 
>>> system is likely to have a distinct URI for static and changeable 
>>> content, but this issue is not inherent in either URIs or RDF graphs.
>>
>> Not in URIs, but it is in RDF graphs. Of course, a URI may denote 
>> some dynamic, changeable data structure containing RDF triples, but 
>> such a thing is not an RDF graph. (It might be modeled as a function 
>> from times to RDF graphs, for example.)
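The modelling idea in the paragraph above (a changeable resource as a function from times to RDF graphs) can be sketched roughly as follows. This is only an illustration: the class name ChangingResource, its methods, and the plain-tuple triples are assumptions made for the example and do not come from any RDF library.

```python
from bisect import bisect_right


class ChangingResource:
    """Hypothetical model: a mapping from times to immutable sets of triples."""

    def __init__(self):
        self._times = []   # snapshot times, strictly increasing
        self._graphs = []  # one frozenset of triples per snapshot time

    def record(self, time, triples):
        # Assumption of this sketch: snapshots arrive in increasing time order.
        if self._times and time <= self._times[-1]:
            raise ValueError("snapshots must be recorded in increasing time order")
        self._times.append(time)
        # Each graph is a frozenset, i.e. a static set of triples.
        self._graphs.append(frozenset(triples))

    def graph_at(self, time):
        # Return the graph of the latest snapshot taken at or before `time`.
        i = bisect_right(self._times, time)
        if i == 0:
            raise LookupError("no snapshot at or before this time")
        return self._graphs[i - 1]


taxi = ChangingResource()
taxi.record(0, {("ex:taxi", "ex:location", "A")})
taxi.record(30, {("ex:taxi", "ex:location", "B")})
```

Here the resource changes over time, while every value returned by graph_at is itself an unchanging set of triples.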
>>
>> Best
>>
>> Pat Hayes
>>
>>>
>>> ------------------------------------------------------------------------ 
>>>
>>> *From:* Graham Klyne <gk@ninebynine.org>
>>> *Sent:* Sunday, 9 July 2017 7:51:55 AM
>>> *To:* semantic-web@w3.org
>>> *Subject:* Re: RDF based messaging, negotiating, and dataset semantics
>>>
>>> Concerning RDF representation of updates to data resources (e.g. RDF 
>>> graph
>>> containers and more)...
>>>
>>> I did some exploratory RDF modelling a couple of years ago using a 
>>> combination
>>> of PROV [1] and (non-standard) duri: [2] URIs to capture something 
>>> like this [3]
>>> (the actual modelling is very rough, just intended to explore an idea).
>>>
>>> The key ideas here were separate URIs for static versions and 
>>> dynamic resource
>>> instances (like W3C specs) and use of PROV terms (including
>>> prov:specialization_of) to tie them all together.
>>>
>>> #g
>>> -- 
>>>
>>> [1]http://www.w3.org/TR/prov-o/
>>>
>>> [2]https://tools.ietf.org/html/draft-masinter-dated-uri
>>>
>>> [3]
>>> http://demo.annalist.net/annalist/c/artivity_example/d/Entity/20150602T143220/- 
>>>
>>> the "data" button shows underlying JSON-LD
>>>
>>>
>>>
>>> On 09/07/2017 05:54, Pat Hayes wrote:
>>> >
>>> >> On Jul 7, 2017, at 3:11 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
>>> >>
>>> >> I'm very concerned about your warning of problems and confusions 
>>> caused by treating delete/undelete actions as RDF properties. Which 
>>> problems are we inviting?
>>> >>
>>> >> As I see it, after every message added to the dataset there is an 
>>> unambiguous way to compute the set of triples that are asserted by 
>>> the actors (i.e., the triples contained in all messages' payload 
>>> graphs except for the ones that were later referred to as 'deleted’).
>>> >
>>> > And that set is an RDF graph. OK.
>>> >
>>> >> We can always record the URI of the last message together with 
>>> any result we derive from that set of triples (e.g. results of 
>>> SPARQL queries or SHACL rule evaluations on the triples ) and in 
>>> this way, the secondary results are versioned. As long as we are 
>>> careful to recompute secondary results when a new message is sent, 
>>> we should be fine, no?
>>> >
>>> > I guess so, if I follow you. The potential muddle that I was 
>>> concerned about was one that cropped up repeatedly in the RDF 1.1 WG 
>>> discussions, a confusion between an RDF graph, defined as a set of 
>>> triples (on the one hand) and a dynamic entity which retained its 
>>> identity while triples were added to or deleted from it (on the 
>>> other hand), what one might call an RDF web resource (other names 
>>> have been suggested.) As long as you are clear, as you seem indeed 
>>> to be, that URIs which denote RDF graphs (the first case) cannot 
>>> change their referents dynamically, then you should be OK, yes.
>>> >
>>> >> On 06.07.2017 at 20:36, Pat Hayes wrote:
>>> >>> There is a conceptual bug in this whole discussion: delete (and 
>>> undelete) are not RDF properties. Treating them as though they were 
>>> is going to cause a host of issues and confusions.
>>> >> Do you mean, confusions in this email thread because of imprecise 
>>> language (which I will admit to) or do you mean, if implemented in 
>>> the way described, this system will cause bugs or other unintended 
>>> effects?
>>> >
>>> > I guess I mean the former, but I worry about the latter :-)
>>> >
>>> >>> In general, operations on RDF graphs are not RDF properties. 
>>> Properties simply record facts, they do not ‘do’ anything.
>>> >> Definitely. Also, there is no notion of sequence or time in RDF. 
>>> We are talking about a way to interpret an RDF dataset according to 
>>> special rules.
>>> >
>>> > OK, fair enough. I agree that such special interpretations may 
>>> (perhaps should) go beyond the bare semantic requirements of the RDF 
>>> spec.
>>> >
>>> >>> Think of the properties as simply being a record of what changes 
>>> were made, and then there is no ambiguity: restoring a message that 
>>> was previously deleted, and deleting the record of its first 
>>> deletion, are different and quite distinct changes (indeed, changes 
>>> to different graphs, in one design) and should each be recorded 
>>> separately and unambiguously.
>>> >> Agreed - almost: an undelete and a delete of a delete express the 
>>> same intention of the user, only using different tools offered by 
>>> the system.
>>> >
>>> > I'm not sure I agree. Even conceptually, it seems to me that a 
>>> deletion of a deletion expresses the idea that the deletion itself 
>>> was somehow an error or something to be erased, whereas an undelete 
>>> – which could be called a restoration – simply re-asserts the 
>>> formerly deleted object without denying that it was once deleted. I 
>>> would expect that a historical trace would give different results in 
>>> those two cases. I would also expect that anyone concerned with 
>>> security or legal issues might want to distinguish them.
>>> >
>>> >> If users can only delete a delete to restore a message, that's 
>>> what they will do; if they can choose between that and restoring the 
>>> message directly, I get the two different and distinct ways to 
>>> change the data which should be recorded unambiguously (that's the 
>>> part where I agree). The question is: should we allow users to choose?
>>> >
>>> > Good question, but not itself relevant to RDF :-) My only point 
>>> here is that whatever you decide, you should bear in mind that the 
>>> RDF description of the time-sequence of events should not itself be 
>>> time-dependent. (I hope that makes sense :-)
>>> >
>>> >> I am actually inclined to change the design so it does not 
>>> support restoring messages and see if that is sufficient for our 
>>> purposes. Makes many things easier.
>>> >
>>> > You might (?) allow a notation to the effect that an added item is 
>>> a copy or repetition of an earlier one, so that (undelete foo) 
>>> becomes a conjunction of simply (re)asserting foo with the added 
>>> notation that foo is a copy of the earlier version. But perhaps this 
>>> would not be helpful, it was just a quick thought.
>>> >
>>> > Best wishes
>>> >
>>> > Pat Hayes
>>> >
>>> >>
>>> >>>
>>> >>> Pat Hayes
>>> >>>
>>> >>>
>>> >>>> On Jul 6, 2017, at 3:11 AM, Florian Kleedorfer <florian.kleedorfer@austria.fm> wrote:
>>> >>>>
>>> >>>> Kevin Singer pointed out to me that there is a downside to the 
>>> 'ex:undeletes' property: it introduces unnecessary complexity and 
>>> ambiguity. To undelete a deleted message m1, one could delete the 
>>> message m2 that deleted m1, or one could explicitly undelete m1. 
>>> This ambiguity may lead to more complex implementations.
>>> >>>>
>>> >>>> The advantage of the 'ex:undeletes' property is that one can 
>>> easily determine which URI to use for the object: just the URI of 
>>> the message to be undeleted. In order to undelete a deleted message m1 
>>> without the 'ex:undeletes' property, one has to find the last 
>>> 'ex:deletes' statement in a possibly long chain of 'ex:deletes' 
>>> statements, the first of which deletes m1.
>>> >>>>
>>> >>>> So I currently see three options:
>>> >>>> 1. Leave the design as it is
>>> >>>> 2. Remove the 'ex:undeletes' property
>>> >>>> 3. Fix the ambiguity problem of the suggested design by 
>>> disallowing the deletion of delete statements.
>>> >>>>
>>> >>>> Any thoughts on which to prefer?
>>> >>>>
>>> >>>> On 04.07.2017 at 20:47, Florian Kleedorfer wrote:
>>> >>>>> Thanks for all your contributions! From what I can gather 
>>> there does not seem to be an existing approach for what I need, so 
>>> here's an informal attempt:
>>> >>>>>
>>> >>>>> For editing of the message history, I think we only need two 
>>> properties, one for deleting a previously added named graph from the 
>>> dataset (e.g., ex:msg1 ex:deletes ex:msg2), and one for undeleting a 
>>> named graph (e.g., ex:msg3 ex:undeletes ex:msg2).
>>> >>>>>
>>> >>>>> For determining the meaning of the dataset, one would iterate 
>>> over the named graphs in reverse chronological order and build a set 
>>> 'del' of named graph URIs that are to be interpreted as deleted. A 
>>> message is only processed if its URI is not in del. 
>>> Whenever an 'ex:deletes' triple is encountered, the URI in the 
>>> object of the triple is added to del. Whenever an 'ex:undeletes' 
>>> triple is encountered, the URI in the object is removed from del. 
>>> Both operations are only executed when the sender of the 
>>> deletes/undeletes message is also the sender of the message to be 
>>> deleted/undeleted. Each processed message (named graph) is added to 
>>> the result dataset.
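As a rough illustration of the interpretation procedure described in the paragraph above, here is a minimal Python sketch. The names Message and effective_messages are hypothetical, and each message is reduced to its URI, its sender, and the URIs it deletes or undeletes. One deliberate deviation is labelled in the code: instead of literally adding to and removing from 'del', the sketch lets the most recent non-deleted statement about a message decide its status, because in a reverse-chronological pass a later 'ex:undeletes' is encountered before the earlier 'ex:deletes' it is meant to revert.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Hypothetical stand-in for a named graph in the conversation dataset."""
    uri: str
    sender: str
    deletes: list = field(default_factory=list)    # objects of ex:deletes
    undeletes: list = field(default_factory=list)  # objects of ex:undeletes


def effective_messages(history):
    """Return the URIs of messages that count, in chronological order.

    `history` is a list of Message objects, oldest first.
    """
    by_uri = {m.uri: m for m in history}
    status = {}  # target URI -> "deleted" or "restored"
    kept = []
    for msg in reversed(history):  # newest first
        if status.get(msg.uri) == "deleted":
            continue  # a deleted message is skipped, so its statements are ignored
        kept.append(msg.uri)
        for verdict, targets in (("deleted", msg.deletes),
                                 ("restored", msg.undeletes)):
            for target in targets:
                owner = by_uri.get(target)
                # Only the sender of the target message may delete/undelete it.
                if owner is None or owner.sender != msg.sender:
                    continue
                # Assumption of this sketch: the most recent statement wins,
                # so an older statement never overrides a newer one.
                status.setdefault(target, verdict)
    kept.reverse()
    return kept


history = [
    Message("ex:m1", "alice"),
    Message("ex:m2", "alice", deletes=["ex:m1"]),
    Message("ex:m3", "alice", deletes=["ex:m2"]),
]
print(effective_messages(history))
```

In this example ex:m3 deletes the deletion ex:m2, so ex:m1 counts again and ex:m2 is dropped from the result.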
>>> >>>>>
>>> >>>>> The case of negotiation, I think, requires two additional 
>>> properties, 'ex:proposes' (range: Message) and 'ex:agreesWith' 
>>> (range: Message). 'ex:proposes' indicates that the 'proposed' 
>>> message is not just any statement, but one that the sender wants the 
>>> recipient's agreement on. 'ex:agreesWith' indicates that the sender 
>>> agrees with the content of another message, the default 
>>> interpretation being that nobody agrees with anything.
>>> >>>>>
>>> >>>>> Such messages can also be deleted as described above, with a 
>>> later 'deletes' message - allowing for dynamically proposing, 
>>> accepting and un-accepting graphs (which may be suggestions for 
>>> clauses in a contract, for example).
>>> >>>>>
>>> >>>>> So, when the full conversation dataset has been filtered based 
>>> on deletes/undeletes information as explained above, one can decide 
>>> whether the agents agree. When all messages that have been 
>>> 'proposed' by one agent are 'agreed' to by the other, the 
>>> participants can be said to agree. If there is at least one proposed 
>>> message that is not agreed to, the participants disagree. Otherwise 
>>> (if no messages are proposed), there is no agreement status.
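The three-way agreement rule in the paragraph above can be sketched as follows, assuming exactly two agents and assuming the dataset has already been filtered for deletions. The function name agreement_status and the (sender, property, target) tuples are hypothetical simplifications chosen for the example.

```python
def agreement_status(statements, agents):
    """Classify a filtered conversation as 'agree', 'disagree', or 'none'.

    `statements` is an iterable of (sender, property, target) tuples;
    `agents` is a pair holding the two participants.
    """
    a, b = agents
    proposed = {a: set(), b: set()}
    agreed = {a: set(), b: set()}
    for sender, prop, target in statements:
        if prop == "ex:proposes":
            proposed[sender].add(target)
        elif prop == "ex:agreesWith":
            agreed[sender].add(target)
    if not proposed[a] and not proposed[b]:
        return "none"  # nothing proposed, so there is no agreement status
    # Proposals by one agent that the other agent has not agreed with.
    pending = (proposed[a] - agreed[b]) | (proposed[b] - agreed[a])
    return "disagree" if pending else "agree"


statements = [
    ("alice", "ex:proposes", "ex:m1"),
    ("bob", "ex:agreesWith", "ex:m1"),
]
print(agreement_status(statements, ("alice", "bob")))
```

The set `pending` also gives the graphs that are proposed by one agent but not yet agreed to by the other, which is what the disagreement case needs.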
>>> >>>>>
>>> >>>>> In case of agreement, the dataset can be filtered easily to 
>>> select only the graphs that are part of the agreement. In case of 
>>> disagreement, it should be easily possible to determine agreed-upon 
>>> graphs and graphs that are proposed by each agent but not agreed to 
>>> by the other.
>>> >>>>>
>>> >>>>> Again, comments welcome!
>>> >>>>>
>>> >>>>> Cheers,
>>> >>>>> Florian
>>> >>>>>
>>> >>>>> On 03.07.2017 at 16:17, Florian Kleedorfer wrote:
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> Consider a communication channel between two agents who 
>>> exchange messages in the form of named RDF Graphs. The channel 
>>> allows for adding new messages but not for removing any data. The 
>>> history of the channel is unambiguous and always accessible to both 
>>> agents. This construct can be seen as an RDF dataset that both 
>>> agents have read/write but no replace or delete access to. Its use 
>>> is that of a negotiation device that allows for setting up terms of 
>>> a contract.
>>> >>>>>>
>>> >>>>>> The way the system is built, the messages consist of any 
>>> number of 'content' RDF graphs (the message's payload), 'envelope' 
>>> graphs with address information (sender, recipient, etc.), and graphs 
>>> containing cryptographic signatures.
>>> >>>>>>
>>> >>>>>> What's needed is an approach that allows these agents to make 
>>> assertions about earlier messages (their content graphs) in the 
>>> conversation dataset so as to modify the meaning of the dataset.
>>> >>>>>>
>>> >>>>>> The simplest example I can think of is that one agent might 
>>> realize they made a typing error in an earlier message and want to 
>>> correct the information by sending a message stating that the 
>>> earlier graph should be disregarded and another message containing 
>>> the corrected information.
>>> >>>>>>
>>> >>>>>> Similar situations occur when negotiating aspects of the 
>>> agreement, e.g. price.
>>> >>>>>>
>>> >>>>>> For both agents, at any point in the conversation, the 
>>> meaning of the conversation dataset must always be unambiguous and 
>>> equal, and it must be clear to both agents if they agree (both hold 
>>> the same graphs true) or if there is a conflict.
>>> >>>>>>
>>> >>>>>> I am contemplating defining a vocabulary that allows for 
>>> making such statements and defining dataset semantics that take 
>>> these statements into account, unless I find a suitable existing 
>>> approach. I found the SWP (Semantic Web Publishing) vocabulary, 
>>> which is intended to do something similar, but does not seem to have 
>>> a negative property for rejecting a graph, so I'm not convinced. Any 
>>> ideas, pointers, or followup discussions are greatly appreciated!
>>> >>>>>>
>>> >>>>>> Thanks,
>>> >>>>>> Florian
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >
>>> >
>>> >
>>> >
>>
>
>

-- 
Krzysztof Janowicz

Geography Department, University of California, Santa Barbara
4830 Ellison Hall, Santa Barbara, CA 93106-4060

Email: jano@geog.ucsb.edu
Webpage: http://geog.ucsb.edu/~jano/
Semantic Web Journal: http://www.semantic-web-journal.net
Received on Monday, 10 July 2017 08:41:45 UTC