Re: Keeping the simple case simple (was Re: optimizing container pages serialization to enable streaming) from Eric Prud'hommeaux on 2013-11-13 (public-ldp-wg@w3.org from November 2013)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Tue, 12 Nov 2013 19:37:54 -0500
To: Henry Story <henry.story@bblfish.net>
Cc: Arnaud LeHors <lehors@us.ibm.com>, Linked Data Platform WG <public-ldp-wg@w3.org>
Message-ID: <20131113003751.GB25765@w3.org>
* Henry Story <henry.story@bblfish.net> [2013-11-12 15:49+0100]
> 
> On 12 Nov 2013, at 00:13, Arnaud Le Hors <lehors@us.ibm.com> wrote:
> 
> > What we need to focus on is the use of a blank node to group the different predicates associated with membership and whether it allows us to reintroduce defaults without creating a non-monotonicity issue. 
> > The fact that sending that info to the client early on rather than late would be advantageous for streaming is just a side note that I propose to leave aside for now. 
> > 
> > I'll give some background for those who need it. If you know what non-monotonicity is about and how we got where we are you may skip that part. 
> > 
> > == Background == 
> > 
> > The specification we started from only had membershipSubject and membershipPredicate and stated that you only needed to specify them if they were different from the container and rdfs:member respectively. This had the advantage of keeping the simple case simple: 
> > 
> > <http://example.org/container1>
> >   a ldp:Container;
> >   dcterms:title "A very simple container";
> >   rdfs:member
> >      <http://example.org/container1/member1>,
> >      <http://example.org/container1/member2>,
> >      <http://example.org/container1/member3>. 
> > 
> > Note that neither membershipSubject nor membershipPredicate is set explicitly. The default values are used. 
> > 
> > Two things changed: 
> > 1. We added two more predicates: membershipPredicateInverse and membershipObject. 
> > 2. Henry pointed out that the use of default values created a non-monotonicity issue 
> > 
> > The non-monotonicity issue is this: In an open world, you can't assume to know what you don't know. Therefore you can't assume that membershipPredicate is rdfs:member because you haven't seen/you don't know what membershipPredicate is. 
> > 
> > Practically speaking this problem exhibits itself particularly when you're streaming, which is why Eric talked about it, but more generally when you start processing data and then discover additional data that impacts your understanding of the "world". 
> 
> > Imagine you start processing the above and assume membershipPredicate is rdfs:member. You would then process member1, member2, member3 as being members of the container. But on receiving new data you get: 
> > 
> > <http://example.org/container1> 
> >         ldp:membershipPredicate o:asset. 
> > 
> > This contradicts your assumption that the membershipPredicate is rdfs:member and invalidates all your prior processing. With that additional info, based on what you now know, you can no longer say that member1, member2, member3 are members of the container. This is the non-monotonicity issue. 

Other than for streaming, monotonicity doesn't impact LDP. Even for streaming, it's not really monotonicity, it's just a cardinality constraint. If I know I'll see a predicate zero or one time, I can't guess a value until I see that predicate or get to the end of the document. A client can stream as soon as it sees certain predicates so it makes sense to utter them early on. Henry's cute lexical trick of using anonymous bnodes allowed us to know after seeing a particular triple that we were going to see no more container triples.
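
A minimal sketch of that cardinality constraint, in Python, assuming triples arrive as plain (subject, predicate, object) string tuples and that the membership subject is the container itself. The function name stream_members and the tuple representation are illustrative only and are not part of any LDP draft. The consumer has to hold back container triples until it sees ldp:membershipPredicate or reaches the end of the document; if the predicate is uttered early, nothing needs to be buffered.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

MEMBERSHIP_PREDICATE = "ldp:membershipPredicate"
DEFAULT_PREDICATE = "rdfs:member"  # default from the original spec text


def stream_members(container: str, triples: Iterable[Triple]) -> Iterator[str]:
    """Yield members of `container`, buffering only while the predicate is unknown."""
    predicate: Optional[str] = None
    pending: List[Triple] = []  # container triples seen before the predicate
    for s, p, o in triples:
        if s != container:
            continue
        if p == MEMBERSHIP_PREDICATE:
            # Seen at most once: from here on the client can stream.
            predicate = o
            for _, bp, bo in pending:
                if bp == predicate:
                    yield bo
            pending.clear()
        elif predicate is None:
            pending.append((s, p, o))  # cannot guess yet
        elif p == predicate:
            yield o
    if predicate is None:
        # End of document with no explicit predicate: the default is now safe.
        for _, bp, bo in pending:
            if bp == DEFAULT_PREDICATE:
                yield bo
```

With the predicate first, members come out as they are read. With the predicate last, or absent, everything waits in `pending` until the end, which is the cost that uttering it early avoids.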


> > To solve this issue we agreed not to use default values. Unfortunately the combination with #1 means the simple case is no longer so simple because one has to specify all predicates all the time. This gives us what we now have in the spec, where everything has to be specified: 
> > 
> > <>
> >   a ldp:Container;
> >   ldp:membershipSubject <> ;
> >   ldp:membershipPredicate rdfs:member;
> >   ldp:membershipObject ldp:MemberSubject;
> >   dcterms:title "A very simple container";
> >   rdfs:member <member1>, <member2>, <member3>. 
> 
> Thanks for this very good summary. I think there are some other issues too, but this does make a very clear 
> case.
> 
> 
> > == Question at hand == 
> > 
> > Henry proposed to use a blank node, described below by Eric, stating that we can then have default values without non-monotonicity. 
> > 
> > > To Henry's point, we could [...] enable defaults by
> > > adding a ldp:membershipRule along with a change to 5.3.1 (and tweaks
> > > to the pending mods to Membership triples):
> > >   s[[
> > > 5.3.1 The representation of a LDPC MUST contain a set of membership
> > > triples following one of the consistent patterns from that
> > > definition. The membership-constant-URI of the triples MAY be the
> > > container itself or MAY be another resource (as in the example).  See
> > > also 5.2.3.
> > > ]][[
> > > 5.3.1 The representation of a LDPC MUST contain exactly one
> > > ldp:membershipRule statement with the subject of the
> > > membership-constant-URI and an object a blank node. All membership
> > > triples use this blank node as the subject. The
> > > membership-constant-URI of the triples MAY be the container itself or
> > > MAY be another resource (as in the example).  See also 5.2.3.
> > > ]]
> > 
> > I believe this does indeed address our problem. And the difference with the previous situation we were in is that you don't know what the membershipPredicate and Co are until you receive ldp:membershipRule but once you receive that node you know everything there is to know. 
> > 
> > In the simple case you can then have: 
> > 
> > <http://example.org/container1>
> >   a ldp:Container;
> >   dcterms:title "A very simple container"; 
> >    ldp:membershipRule [];
> >   rdfs:member
> >      <http://example.org/container1/member1>,
> >      <http://example.org/container1/member2>,
> >      <http://example.org/container1/member3>. 
> > 
> > An empty blank node means: all default values are in use. 
> > Any membership related predicate can then be specified when needed, leaving the unspecified ones with their default values: 
> > 
> > <http://example.org/netWorth/nw1/assetContainer>
> >   a ldp:Container;
> >   ldp:membershipRule [ ldp:membershipSubject <http://example.org/netWorth/nw1>;
> >                        ldp:membershipPredicate o:asset ].
> > 
> > <http://example.org/netWorth/nw1>
> >    a o:NetWorth;
> >   o:asset
> >      <http://example.org/netWorth/nw1/assetContainer/a1>,
> >      <http://example.org/netWorth/nw1/assetContainer/a2>.
> > 
> > This does achieve the goal of keeping the simple case simple. 
> > 
> > Again, the one "downside" with this is the use of a blank node. Some WG members have said they don't have a problem with it but the WG has in the past decided against relying on the use of blank nodes so this would be a change in our position. 
> > 
> > Note that this is orthogonal to 1) ldp:created, 2) the names used for each predicate, 3) what the default values are. Please, don't get into any of these here. 
> 
> Thanks for this summary Arnaud. I now understand what you thought I had been intending.
> 
> Though my proposal may solve the problem, it does not solve it for the reasons you point 
> out. Getting the reasons right is quite important here as it has consequences on the 
> progression of the work here.
> 
> So a quick reason why your argument won't work: the blank nodes in RDF don't group triples.
> Once the turtle is parsed the client can never see the difference between a [] blank node
> and a _:b453 blank node. This means that the client would be in exactly the same position,
> namely, from the point of view of a client, the graph could just as well have arrived as:
> 
> <container> some:relation member1, member2;
>     ldp:membershipRule _:b234 .
> ....
> (pages and pages later)
> ....
> _:b234 ldp:membershipSubject <#something> ;
>        ldp:membershipPredicate o:paper;
>        ldp:membershipObject owl:sameAs .
> 
> ----------------
> The introduction of the blank nodes was probably a mistake on my part.
> I was trying to jump a bit too far in one go.
> ---------------

Let's clarify that to say "anonymous blank nodes". This approach will
work in Turtle, TriG and RDF/XML. It will not work in N-Triples or
N-Quads, which have no anonymous bnode syntax. Do we want to mildly
tweak the schema for something which only works for the higher-level
RDF serializations? I'd say yes; the cost is low and the possible benefit high.
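
To illustrate the lexical point, here is a rough Python sketch that works on the Turtle text rather than on the parsed graph. It assumes the rule node is written with the anonymous [ ... ] syntax, that values contain no ";" or "]", and that the defaults are the two from the original spec text (the container and rdfs:member). The function name membership_rules and the regular expression are hypothetical; this is not a Turtle parser. Once the closing bracket is read, the node has no label by which later triples could refer to it, so the defaults can be filled in right there.

```python
import re
from typing import Dict

# Matches only the anonymous form "ldp:membershipRule [ ... ]";
# a labelled node such as _:b234 deliberately does not match.
RULE_PATTERN = re.compile(r"ldp:membershipRule\s*\[(.*?)\]", re.DOTALL)


def membership_rules(turtle: str, container: str) -> Dict[str, str]:
    """Return the membership settings, applying defaults for anything unstated."""
    match = RULE_PATTERN.search(turtle)
    if match is None:
        raise ValueError("no anonymous ldp:membershipRule node found")
    rules = {
        "ldp:membershipSubject": container,         # assumed default
        "ldp:membershipPredicate": "rdfs:member",   # assumed default
    }
    for pair in match.group(1).split(";"):
        parts = pair.split()
        if len(parts) == 2:  # naive "predicate object" split
            predicate, obj = parts
            rules[predicate] = obj
    return rules
```

An empty `[]` yields both defaults, and a labelled node raises ValueError, which mirrors why the trick is unavailable in serializations without anonymous bnode syntax.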


> The issue is really with how one infers ldp:member from the membership predicates.
> I go into detail with this here:
>   http://www.w3.org/2012/ldp/wiki/MembershipInferencing
> 
> In my proposal "2.2 think in terms of causality" there is no
> inferencing needed from the membership triples to the ldp:member 
> relations. Because:
> 
> 1. We publish all the ldp:member relations
> 2. we change the ldp:membershipXXX relations to ldp:creationXXX
> relations. 
> 
> The ldp:creationXXX relations specify a causal consequence of posting.
> They never remove the causal consequence of POST creating an ldp:member.
> They just add new consequences.  As a result there is a default behaviour
> without monotonicity problems.
> 
> A client could read a whole file and find all the ldp:members . If at the end it
> finds the ldp:membershipXXX relations it will know what the consequence of 
> POSTing to  the container is. But it won't have an issue with any of the
> information it found.
> 
> I hope this helps. What I am trying to do here is show how there is something
> valuable in the membershipXXX relations. It's just we need to think of them
> not as inferential statements, but as stating causal consequences of creation.
> That makes a lot of the rest much simpler.
> 
> Henry
> 
> > 
> > Thanks. 
> > --
> > Arnaud  Le Hors - Software Standards Architect - IBM Software Group
> 
> Social Web Architect
> http://bblfish.net/
> 
> 

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.
Received on Wednesday, 13 November 2013 00:38:26 UTC