Re: lossless paging & volunteering for the army from henry.story@bblfish.net on 2014-02-22 (public-ldp@w3.org from February 2014)

From: <henry.story@bblfish.net>
Date: Sat, 22 Feb 2014 16:43:26 +0100
To: Sandro Hawke <sandro@w3.org>
Cc: Mark Baker <distobj@acm.org>, Steve K Speicher <sspeiche@gmail.com>, public-ldp <public-ldp@w3.org>, "Linked Data Platform (LDP) Working Group" <public-ldp-wg@w3.org>
Message-Id: <1C55E360-E579-4372-8CB4-DEC807FD527C@bblfish.net>
On 21 Feb 2014, at 22:19, Sandro Hawke <sandro@w3.org> wrote:

> My sense is the group is agreed that lossy paging is bad.  The only question is whether we allow conformant servers to do lossy paging.  Given the practical simplicity of the HATEOAS solution, I hope we're clear the answer is no.    That is, if you want to do paging, make it lossless.
> 
> If you still think lossy paging is good, or like to hear me and Henry discussing these things, read on.
> 
> 
> On 02/21/2014 01:16 PM, henry.story@bblfish.net wrote:
>> On 21 Feb 2014, at 18:05, Sandro Hawke <sandro@w3.org> wrote:
>> 
>>> On 02/21/2014 09:55 AM, Mark Baker wrote:
>>>> On Fri, Feb 21, 2014 at 4:24 AM, henry.story@bblfish.net
>>>> <henry.story@bblfish.net> wrote:
>>>>> I was thinking that there is an additional reason to have lossless paging
>>>>> which is best explained with the "Volunteering for the army" counter
>>>>> use-case. [1]
>>>>> 
>>>>> Given the recent resolution to name ldp:Container what was previously known
>>>>> as the ldp:IndirectContainer [2] a client doing a GET on an LDPC will not
>>>>> know
>>>>> if it can POST to the container without the danger of that action
>>>>> volunteering
>>>>> its owner to the army -- unless it has read through all pages of the
>>>>> container, which
>>>>> may be close to impossible as argued by Sandro.
>>>> AIUI, the POST to a container modifies its membership. In order to
>>>> volunteer for the army, the client would need to POST directly to the
>>>> member that offers that service. Hopefully I haven't missed some
>>>> esoteric aspect of LDP containers during my review?
>>> The container is the resource that offers the service.  The container is an "army signup list", like a sign up list that might be posted on a wall.
>>> 
>>> Let me switch to a scenario that I find much more realistic, so I can fill in all the details.
>>> 
>>> Picture a room reservation service, where the container contains room reservation records.  If you have access to POST a room reservation to that container, and you do so, you've now reserved a room.    To cancel your reservation, you DELETE the resource that got created in the container.     People with access can GET the container to see the room reservations.  If they want, they might HEAD it instead and see if there's a FIRST link, so they can page through it more slowly.
>>> 
>>> If you reserve a lot of rooms that you're not supposed to, or you reserve rooms for inappropriate activities (that end up going viral), you might well lose your job.  And maybe that means you have no choice but to join the army.  :-)
>> Your example is more mundane for sure, and so it hides the dangers that I am trying to make crystal clear with the volunteering
>> for the army example. A compromise solution may be a LDPC form to buy a car or some big ticket purchase :-)
> 
> I thought mine was dramatic enough, since it involved both joining the army AND ruining an LDP F2F meeting.
> 
> More seriously, I found yours problematic, because we're a very long way from LDP actions being legally binding.

But neither should we dismiss that possibility. In the UK saying something can be legally binding. Of course my
example pushes things a bit far, but that is just to help concentrate our minds a bit.

> 
>>> Of course it won't be humans using curl to POST and GET and DELETE, it'll be some room reservation app, or some calendaring app.   I tend to use Google Calendar (on the web) and aCalendar (on android), myself.  It's not unreasonable that either or both of them would have access to some information about me, including links to where I work.    Maybe I link to http://www.csail.mit.edu/ from some profile of mine, saying I work there.  And that really does link (under "Book a Room") to https://calendar.csail.mit.edu/calendars/2/day which links to a Web form which submits to https://calendar.csail.mit.edu/meetings.  If I POST there with my (csail employee client cert), I can create room reservations. Right now, it's an HTML interface, but with LDP it could become an API, a Container of room reservations that Google Calendar and aCalendar could use to offer me room reservations when I make a new calendar event.   They could also use it to show room reservations I have made.
>>> 
>>> Beyond LDP, what's needed to make this work is (1) a vocabulary for describing room reservations, used by CSAIL and my calendaring apps, and (2) the predicates that connect my profile to https://calendar.csail.mit.edu/meetings and convey what it is.   I think that's just foaf and/or org stuff, plus one special new predicate, something like: <organization> roomReservations <roomReservationsContainer>.   In this case it would be something like:
>>> 
>>> <http://www.csail.mit.edu/#csail> eg:roomReservations <https://calendar.csail.mit.edu/meetings>.
>>> 
>>> I'd expect to see that triple if I GET https://calendar.csail.mit.edu/meetings, which would be its way of claiming that it's a room reservations container for CSAIL.   That's nice, but one should not trust it.
>>> 
>>> I'd also expect to see that triple if I GET http://www.csail.mit.edu/ .  That would be how my apps would find this container, and that's something I would trust as an official statement from CSAIL that this is a way to really reserve rooms.
>> Ok. I think your example is straying somewhat from the point I was trying to make by considering issues of epistemology, in particular the question as to how do we get to trust that the container is the type of container it says it is. My counter-use-case is focused on considering
>> the protocol aspects relating to POSTing to an ldp:Container ( which used to be known as an ldp:IndirectContainer, but a similar argument
>> could be made to work with the ldp:DirectContainer ). Both the ldp:IndirectContainer and the ldp:DirectContainers create new relations in addition to the ones in the body of the POST and in addition to the ldp:contains relation. That is, a client that POSTs to an ldp:SimpleContainer need only know that it is publishing the document it is POSTing. ( The same works for PUT). In the other containers the client needs to know that it is also creating an additional relation, and that this relation could be a statement about joining the army, buying a car, booking a room, etc...  Assuming the LDPC is describing itself correctly, a client needs to know when POSTing to such containers that it is doing something more than creating an information resource ( aka source in RDF1.1 ). It needs to understand the new relation it is going to create. And since a well programmed client will know what the effect of POSTing to such containers is, it will also be liable for the consequences of the statement it creates.
> 
> I think you missed my point.  My point was that it's logically impossible for a client to NOT know how the container is supposed to behave, before it even does a HEAD on the container, because that behavior spec is likely to be an intrinsic part of the mechanism the client had to use to get the container URL.

That sounds very much like an exaggeration:
 • A search engine could crawl the web and give us a list of LDP Containers  it found on the web. So it is 
   possible for a client not to know what type of container it is dealing with until it does a GET or HEAD on the container.
 • It should be the container itself that is the last and final point of reference as to what the container is. Why? Because
  when you POST you may very well want to do a conditional POST on the container using an ETag, so that your POST fails in case the container
  type itself changes.
 • The relations of other pages to the container, and the relation of that container back position the LDPC in a web of trust, which is very important of course, but not what I was trying to get at here.
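
To make the conditional POST in the second bullet concrete, here is a minimal sketch in Python. It is a toy in-memory model, not an LDP implementation: the Container class, its post method, and the choice to compute the ETag from the container's self-description only are all assumptions made for illustration. The only thing taken from HTTP is the idea that a request carrying a stale If-Match value is refused with 412 Precondition Failed.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import hashlib


@dataclass
class Container:
    """Toy stand-in for an LDPC (hypothetical, for illustration only)."""
    description: str                      # the container's self-description, e.g. Turtle text
    members: List[str] = field(default_factory=list)

    @property
    def etag(self) -> str:
        # Assumption: the ETag covers only the self-description, so it changes
        # exactly when the container's declared type or relations change.
        digest = hashlib.sha256(self.description.encode("utf-8")).hexdigest()
        return '"' + digest[:16] + '"'

    def post(self, body: str, if_match: Optional[str] = None) -> int:
        """Return an HTTP-like status code for a POST of `body`."""
        if if_match is not None and if_match != self.etag:
            return 412                    # Precondition Failed: description changed
        self.members.append(body)
        return 201                        # Created
```

A client would read the description, remember the ETag, and send it back with the POST. If the container has turned into something else in the meantime, the POST fails instead of silently creating a relation the client never agreed to.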

> 
>> 
>>> As I've sketched it out here, Henry's concern does not manifest. The essential triple that told me I might be getting myself into trouble was outside the container resource itself, and I couldn't have found the container without knowing that predicate.
>>> 
>>> As I think about other ways the information could be laid out, I have similar results.  Whatever information led the client to think this container was useful at all could also be telling it how dangerous it is.
>>> 
>>> One COULD design a vocabulary with eg:RealRoomReservationsAndJokeRoomReservations and in which case you might get yourself into a lot of trouble.
>>> 
>>> One could accidentally do it like this:
>>> 
>>> <http://www.csail.mit.edu/#csail> eg:maybeRoomReservations <https://calendar.csail.mit.edu/meetings>.
>>> <http://www.csail.mit.edu/#csail> eg:maybeRoomReservations <https://calendar.csail.mit.edu/test-meetings>.
>>> <https://calendar.csail.mit.edu/test-meetings> a eg:TestSystem.
>>> 
>>> Here, you could SAY in the definition of eg:maybeRoomReservations that clients must check for whether the system is a eg:TestSystem, and let users know.   And here Henry's problem might arise (in reverse).   Dropped triples would mean the user might be told it's real when it's really a test system.  And then, when my important colleagues from across the industry show up for their LDP F2F, it might turn out all the rooms are already allocated to other people, because I accidentally made my reservation on the TEST system. Ooops.
>>> 
>>> So, I don't like lossy paging, but I think this particular kind of problem can (and should) be prevented instead by more careful vocabulary design.  This is monotonicity.   In RDF, in general, we want it so that if you're missing triples, it just means you know *less*, not that you know something *false*.
>> I don't think my example fails on monotonicity requirements. The membership triples that are created on a POST
>> are created by the action of POSTing.  This puts those in the same realm as what John Austin and later Searle
>> called speech acts: when a priest says that two people are man and wife, he is not making a statement of fact,
>> he is making it true. When you sign a contract you are not stating a truth about a pre-existing condition you
>> are making it true.
>> 
>> http://www.amazon.com/Speech-Acts-Essay-Philosophy-Language/dp/052109626X/ref=sr_1_1?ie=UTF8&qid=1393006510&sr=8-1&keywords=john+searle+speech+acts
> 
> I don't think speech acts are relevant to the problem here.  I agree POSTing is nicely thought of as a speech act.

It is indeed helpful to think of POSTing as a speech act - in this case it would be a document act - because it
helps locate how a protocol is related to logic. It took around 50-60 years between the first works on logic by 
Frege and the first discovery of the importance of speech acts in J. L. Austin's "How to Do Things with Words"
	http://en.wikipedia.org/wiki/How_to_Do_Things_With_Words
And since we are exactly at the intersection here between logic (RDF) and action ( HTTP ), understanding that it
took time for the two to be acknowledged historically can help explain why some things may not be immediately obvious
in this space to people who may otherwise be extremely knowledgeable in either logic or pragmatics.

> 
> The issue with monotonicity is around the client making some kind of default assumption about what will result from POSTing to the container. With proper monotonic modeling, even if random triples are dropped on the way to the client, the worst the client will conclude is "I don't know".   The client should never be able to conclude things that will result in harm (like that this container is safe to post to) the absence of information.   (This is because with monotonicity, the client never concludes anything at all from the absence of information.)

I think there is clearly a misunderstanding of how monotonicity works and how it is meant to help one avoid falling in harm's way.
If I push your reasoning to its logical conclusion it would follow that were I to know nothing then nothing harmful
could happen to me. But any experiment in reducing information, say by blindfolding, can be used as a counterexample.
If I am blindfolded I will not walk around more assured, but a lot more carefully. Why? Because, as you say, I will quite
correctly assume that I know nothing about what could happen to me if I make a move.

In the LDPC case the same is true. If you do not know what the ldp:membershipResource,
the ldp:hasMemberRelation, and the ldp:insertedContentRelation are for a given ldp:Container,
then you know that you don't know. And what you don't know is that the following is not
true:

 <> ldp:membershipResource <http://army.us/#usf>;
    ldp:hasMemberRelation army:hasVolunteer;
    ldp:insertedContentRelation dc:author .

Therefore you should NOT POST to that container, until you can dismiss that possibility. :-)
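
As a rough sketch of that rule in Python: the function below refuses to POST unless the client has the complete container description and understands every membership relation it declares. The function name may_post, the description_complete flag, and the example URIs are hypothetical; triples are plain (subject, predicate, object) tuples rather than any particular RDF library's types.

```python
LDP = "http://www.w3.org/ns/ldp#"


def may_post(triples, container, accepted_relations, description_complete):
    """Decide whether a cautious client should POST to `container`.

    triples              -- iterable of (subject, predicate, object) URI strings
    container            -- URI of the container being considered
    accepted_relations   -- set of member relations the client agrees to create
    description_complete -- True only if no page or triple can be missing
    """
    # With an incomplete description the client knows that it does not know,
    # so it cannot rule out an unwanted membership relation.
    if not description_complete:
        return False
    declared = {
        obj for (subj, pred, obj) in triples
        if subj == container and pred == LDP + "hasMemberRelation"
    }
    # Every declared relation must be one the client has explicitly accepted.
    return declared <= accepted_relations
```

Under lossy paging description_complete can never honestly be set to True, which is the point of the argument: the blindfolded client has to stand still.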


> If POSTING could have consequences, that fact must be communicated to the client in a way the client can't miss, even if triples are dropped.

Well Arnaud has argued that it is specified in the spec that the objects of those relations membershipResource, hasMemberRelation and insertedContentRelation could be anything. Therefore you need to find out what those are before POSTing. I agree with Arnaud, but
your comments here are proof that this should be made a lot clearer.

This is not so surprising: signing a document has the following consequences:
 1) your scribblings will be on the document
 2) you may have come to bind yourself to a year of military service, a life of blissful marriage, a new car, a chocolate bar,....

1) will always happen - our equivalent is ldp:contains
2) depends on the document - our equivalent is the LDPC

So you MUST read the LDPC carefully before signing/POSTing

> 
> Fortunately, that fact can be communicated by whatever information led the client to the container in the first place.
> 
> For a particularly high-stakes POST, one could require the POSTed contents to include some information that makes it clear the application designer understands the stakes.

High stakes or low stakes is not really the issue here. What about high frequency trading? Each trade is very low stakes, but billions
of transactions of that order make huge differences.


> 
>     -- Sandro
> 
>>>      -- Sandro
>>> 
>> Social Web Architect
>> http://bblfish.net/

Social Web Architect
http://bblfish.net/
Received on Saturday, 22 February 2014 15:44:35 UTC