Re: Keeping the simple case simple (was Re: optimizing container pages serialization to enable streaming) from Henry Story on 2013-11-12 (public-ldp-wg@w3.org from November 2013)

From: Henry Story <henry.story@bblfish.net>
Date: Tue, 12 Nov 2013 15:49:31 +0100
To: Arnaud LeHors <lehors@us.ibm.com>
Cc: Linked Data Platform WG <public-ldp-wg@w3.org>
Message-Id: <9A0661A7-6ED3-4BE1-B22B-A6CA4061C4D5@bblfish.net>
On 12 Nov 2013, at 00:13, Arnaud Le Hors <lehors@us.ibm.com> wrote:

> What we need to focus on is the use of a blank node to group the different predicates associated with membership, and whether it allows us to reintroduce defaults without creating a non-monotonicity issue. 
> The fact that sending that info to the client early on rather than late would be advantageous for streaming is just a side note that I propose to leave aside for now. 
> 
> I'll give some background for those who need it. If you know what non-monotonicity is about and how we got where we are you may skip that part. 
> 
> == Background == 
> 
> The specification we started from only had membershipSubject and membershipPredicate and stated that you only needed to specify them if they were different from the container and rdfs:member respectively. This had the advantage of keeping the simple case simple: 
> 
> <http://example.org/container1>
>   a ldp:Container;
>   dcterms:title "A very simple container";
>   rdfs:member
>      <http://example.org/container1/member1>,
>      <http://example.org/container1/member2>,
>      <http://example.org/container1/member3>. 
> 
> Note that neither membershipSubject nor membershipPredicate is set explicitly. The default values are used. 
> 
> Two things changed: 
> 1. We added two more predicates: membershipPredicateInverse and membershipObject. 
> 2. Henry pointed out that the use of default values created a non-monotonicity issue 
> 
> The non-monotonicity issue is this: In an open world, you can't assume to know what you don't know. Therefore you can't assume that membershipPredicate is rdfs:member because you haven't seen/you don't know what membershipPredicate is. 
> 
> Practically speaking this problem exhibits itself particularly when you're streaming, which is why Eric talked about it, but more generally when you start processing data and then discover additional data that impacts your understanding of the "world". 
> 
> Imagine you start processing the above and assume membershipPredicate is rdfs:member. You would then process member1, member2, member3 as being members of the container. But on receiving new data you get: 
> 
> <http://example.org/container1> 
>         ldp:membershipPredicate o:asset. 
> 
> This contradicts your assumption that the membershipPredicate is rdfs:member and invalidates all your prior processing. With that additional info, based on what you now know, you can no longer say that member1, member2, member3 are members of the container. This is the non-monotonicity issue. 
> 
> To solve this issue we agreed not to use default values. Unfortunately the combination with #1 means the simple case is no longer so simple because one has to specify all predicates all the time. This gives us what we now have in the spec, where everything has to be specified: 
> 
> <>
>   a ldp:Container;
>   ldp:membershipSubject <> ;
>   ldp:membershipPredicate rdfs:member;
>   ldp:membershipObject ldp:MemberSubject;
>   dcterms:title "A very simple container";
>   rdfs:member <member1>, <member2>, <member3>. 

Thanks for this very good summary. I think there are some other issues too, but this does make a very clear 
case.
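
To make the non-monotonicity concrete, here is a minimal Python sketch of a
streaming client that applies the old default. It is only an illustration:
the function name and the encoding of triples as plain tuples are my own
assumptions, not anything from the spec.

```python
# Hypothetical names throughout; triples are (subject, predicate, object) tuples.
RDFS_MEMBER = "rdfs:member"
MEMBERSHIP_PREDICATE = "ldp:membershipPredicate"


def members_with_default(triples, container):
    """Return the members of `container` as seen after each triple, assuming
    the membership predicate defaults to rdfs:member until told otherwise."""
    seen = []
    snapshots = []
    for triple in triples:
        seen.append(triple)
        # Use the declared predicate if one has arrived, else the default.
        declared = [o for s, p, o in seen
                    if s == container and p == MEMBERSHIP_PREDICATE]
        predicate = declared[0] if declared else RDFS_MEMBER
        snapshots.append({o for s, p, o in seen
                          if s == container and p == predicate})
    return snapshots


stream = [
    ("c1", "rdfs:member", "c1/member1"),
    ("c1", "rdfs:member", "c1/member2"),
    ("c1", "ldp:membershipPredicate", "o:asset"),  # arrives late
]
snapshots = members_with_default(stream, "c1")

# Monotonic reasoning only ever grows the set of conclusions.
is_monotonic = all(a <= b for a, b in zip(snapshots, snapshots[1:]))
```

Here the last triple makes the client withdraw both members it had already
concluded, so is_monotonic ends up False.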


> == Question at hand == 
> 
> Henry proposed to use a blank node, described below by Eric, stating that we can then have default values without non-monotonicity. 
> 
> > To Henry's point, we could [...] enable defaults by
> > adding a ldp:membershipRule along with a change to 5.3.1 (and tweaks
> > to the pending mods to Membership triples):
> >   s[[
> > 5.3.1 The representation of a LDPC MUST contain a set of membership
> > triples following one of the consistent patterns from that
> > definition. The membership-constant-URI of the triples MAY be the
> > container itself or MAY be another resource (as in the example).  See
> > also 5.2.3.
> > ]][[
> > 5.3.1 The representation of a LDPC MUST contain exactly one
> > ldp:membershipRule statement with the subject of the
> > membership-constant-URI and an object a blank node. All membership
> > triples use this blank node as the subject. The
> > membership-constant-URI of the triples MAY be the container itself or
> > MAY be another resource (as in the example).  See also 5.2.3.
> > ]]
> 
> I believe this does indeed address our problem. And the difference with the previous situation we were in is that you don't know what the membershipPredicate and Co are until you receive ldp:membershipRule but once you receive that node you know everything there is to know. 
> 
> In the simple case you can then have: 
> 
> <http://example.org/container1>
>   a ldp:Container;
>   dcterms:title "A very simple container"; 
>    ldp:membershipRule [];
>   rdfs:member
>      <http://example.org/container1/member1>,
>      <http://example.org/container1/member2>,
>      <http://example.org/container1/member3>. 
> 
> An empty blank node means: all default values are in use. 
> Any membership related predicate can then be specified when needed, leaving the unspecified ones with their default values: 
> 
> <http://example.org/netWorth/nw1/assetContainer>
>   a ldp:Container;
>   ldp:membershipRule [ ldp:membershipSubject <http://example.org/netWorth/nw1>;
>                        ldp:membershipPredicate o:asset ].
> 
> <http://example.org/netWorth/nw1>
>    a o:NetWorth;
>   o:asset
>      <http://example.org/netWorth/nw1/assetContainer/a1>,
>      <http://example.org/netWorth/nw1/assetContainer/a2>.
> 
> This does achieve the goal of keeping the simple case simple. 
> 
> Again, the one "downside" with this is the use of blank node. Some WG members have said they don't have a problem with it but the WG has in the past decided against relying on the use of blank nodes so this would be a change in our position. 
> 
> Note that this is orthogonal to 1) ldp:created, 2) the names used for each predicate, 3) what the default values are. Please, don't get into any of these here. 

Thanks for this summary Arnaud. I now understand what you thought I had been intending.

Though my proposal may solve the problem, it does not do so for the reasons you 
give. Getting the reasons right is quite important, as it has consequences for how 
the work progresses from here.

So a quick reason why your argument won't work: blank nodes in RDF don't group triples.
Once the Turtle is parsed, the client can never see the difference between a [] blank node
and a _:b453 blank node. This means that the client would be in exactly the same position
as before: from its point of view, the data could just as well arrive like this:

<container> some:relation <member1>, <member2>;
    ldp:membershipRule _:b234 .
....
(pages and pages later)
....
_:b234 ldp:membershipSubject <#something>;
       ldp:membershipPredicate o:paper;
       ldp:membershipObject owl:sameAs .
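
The same point as a rough Python sketch. Again the function name and the
tuple encoding of triples are illustrative assumptions; the sketch only shows
what a streaming client knows about the rule node after each triple.

```python
def rule_views(triples, container):
    """After each triple, report the properties known so far for the
    container's ldp:membershipRule node (None if the node is still unseen)."""
    seen = []
    views = []
    for triple in triples:
        seen.append(triple)
        nodes = [o for s, p, o in seen
                 if s == container and p == "ldp:membershipRule"]
        if not nodes:
            views.append(None)
            continue
        # Collect whatever has been said about the blank node up to now.
        views.append({p: o for s, p, o in seen if s == nodes[0]})
    return views


stream = [
    ("c1", "some:relation", "member1"),
    ("c1", "ldp:membershipRule", "_:b234"),
    # ... pages and pages later ...
    ("_:b234", "ldp:membershipPredicate", "o:paper"),
]
views = rule_views(stream, "c1")
```

After the second triple the node looks exactly like an empty [], yet the
third triple still changes the picture. The client cannot treat "no
properties seen yet" as "all defaults apply".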

----------------
The introduction of the blank nodes was probably a mistake on my part.
I was trying to jump a bit too far in one go.
---------------

The issue is really with how one infers ldp:member from the membership predicates.
I go into detail with this here:
  http://www.w3.org/2012/ldp/wiki/MembershipInferencing

In my proposal "2.2 think in terms of causality" there is no
inferencing needed from the membership triples to the ldp:member 
relations. Because:

1. We publish all the ldp:member relations.
2. We change the ldp:membershipXXX relations to ldp:creationXXX
relations. 

The ldp:creationXXX relations specify a causal consequence of POSTing.
They never remove the causal consequence of a POST creating an ldp:member.
They just add new consequences. As a result there is a default behaviour
without monotonicity problems.

A client could read a whole file and find all the ldp:member relations. If at the
end it finds the ldp:membershipXXX relations, it will know what the consequence of 
POSTing to the container is. But it won't have an issue with any of the
information it found earlier.
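
A small Python sketch of that reading, under stated assumptions: the
function names are mine, triples are plain tuples, and "ldp:creationPredicate"
is only a placeholder for one of the ldp:creationXXX relations, whose names
are not settled.

```python
def explicit_members(triples, container):
    """Return the members of `container` as seen after each triple, reading
    only the explicitly published ldp:member relations."""
    members = set()
    snapshots = []
    for s, p, o in triples:
        if s == container and p == "ldp:member":
            members.add(o)
        snapshots.append(set(members))
    return snapshots


def triples_created_by_post(container, new_resource, creation_rules):
    """Triples that a POST to `container` brings about. The ldp:member triple
    is always created; each (subject, predicate) rule only adds one more."""
    created = [(container, "ldp:member", new_resource)]
    for subject, predicate in creation_rules:
        created.append((subject, predicate, new_resource))
    return created


stream = [
    ("c1", "ldp:member", "c1/member1"),
    ("c1", "ldp:member", "c1/member2"),
    ("c1", "ldp:creationPredicate", "o:asset"),  # placeholder name, arrives late
]
snapshots = explicit_members(stream, "c1")
is_monotonic = all(a <= b for a, b in zip(snapshots, snapshots[1:]))

created = triples_created_by_post("c1", "c1/member3", [("nw1", "o:asset")])
```

The late triple tells the client something new about POST, but nothing it
concluded about membership has to be taken back, so is_monotonic stays True.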

I hope this helps. What I am trying to do here is show that there is something
valuable in the membershipXXX relations. It's just that we need to think of them
not as inferential statements, but as stating causal consequences of creation.
That makes a lot of the rest much simpler.

Henry

> 
> Thanks. 
> --
> Arnaud  Le Hors - Software Standards Architect - IBM Software Group

Social Web Architect
http://bblfish.net/
Received on Tuesday, 12 November 2013 14:50:04 UTC