Re: Issue-89, proposal 3: Duplication of triples & inferencing from Steve Speicher on 2013-12-13 (public-ldp-wg@w3.org from December 2013)

From: Steve Speicher <sspeiche@gmail.com>
Date: Fri, 13 Dec 2013 16:04:17 -0500
To: Henry Story <henry.story@bblfish.net>
Cc: Arnaud LeHors <lehors@us.ibm.com>, Linked Data Platform WG <public-ldp-wg@w3.org>
Message-ID: <CAOUJ7JpQfiq9GcAb8D3f0DdCVO0Ph119U4_KYAhkUBRxiCzY6w@mail.gmail.com>

Henry,

On Fri, Dec 13, 2013 at 3:24 PM, Henry Story <henry.story@bblfish.net>
wrote:
>
>
> On 13 Dec 2013, at 20:11, Arnaud Le Hors <lehors@us.ibm.com> wrote:
>
> > I agree the primer should be updated to take advantage of the new
> > containers but I disagree that what you're doing is just that.
> > Roger very clearly pointed out why what you consider "a much more
> > elegant fashion" misses the mark.
> > The use case at hand is all about allowing people to define a container
> > by leveraging their existing data structure. Telling them they need to
> > restructure their data *is* arguing against the use case.
> >
> > So, no, there is no point in continuing this discussion. We can't
> > afford to keep rehashing the same points over and over. This is just a
> > waste of time.
>
> Arnaud,
>
>    I understand the difficult position you are in, in trying to get this
> group through the final stage of this 2 year marathon. Many other groups
> have failed to keep to their deadline, often through trying to deal with
> more complexity than was necessary. Keeping things simple is the golden
> rule to success in this field.
>
>   The complexity in this group has clearly arisen through the
> introduction of the ldp:memberXXX relations, which I am still trying to
> get to the bottom of. Sometimes it happens that ideas that seem good at
> first are not as good as they seemed after using them.  What I am doing
> here is to try to get to the underlying need for them.
>
>   After looking at Roger's example I showed that it could be reduced
> elegantly to an ldp:SimpleContainer (which does not need the
> ldp:memberXXX relation) without the need for reasoning, and without loss
> of information. I brought this up because you have often worried about
> simplicity of this type.
>
> It seems that some people need to
>
>    "define a container by leveraging their existing
>     data structure".
>
> but I have no idea what this means. Could someone explain
> this to me?
>
>
> Is it that the use case tied to 2.1 of the primer requires the data to be
> written out in the exact way it is written out in the primer? Clearly that
> cannot be, since the example in the primer is out of sync with the spec.
> What is the use case database that needs to be leveraged? Why does it
> require data to be written out in exactly the way explained in the Primer,
> when in fact there is a simpler way to do that using the
> ldp:SimpleContainer as I showed?
>
> I understand that you wish to make the group progress to the final call,
> but I don't see how cutting short on discussions like this is going to
> make things better. If we found out that ldp:SimpleContainer was all that
> is needed then one would have a simpler spec, and it would be easier to go
> to last call.
>
> Mind you I did not start out arguing for just ldp:SimpleContainer. It is
> just that in the course of this conversation it seemed that in example 2.1
> of the spec nothing more is needed. Perhaps all other examples can be
> reduced this way? If not it would be good to know exactly why not.
>
>  I hope this helps put the previous discussion into perspective, as one
> that is in line with your aims.

I was trying not to start a discussion about what is a good or bad example
in the primer, only to leverage the basic idea.

To be very clear, I believe the only thing we are discussing here is
whether a server that provides an ldp:DirectContainer representation MUST
include in that representation all the triples that have ldp:xyz in them,
which would double the number of "membership triples" as the spec defines
them.

We have deployed a number of products that follow a common pattern in how
they expose resources from a container/collection thing.  I think we've
established that.  Boiled down to its simplest form, it really looks like:

 <bugs/> a bt:BugCollection;
      bt:hasBug <1>, ...., <300000>.

This is it.  My tools know how to operate on this, they know how to index
it, and we have various ways to query on it.

It works much like the DirectContainer.  For my server to comply with
ldp:DirectContainer it would only have to add these triples to the
representation:

<bugs/> a bt:BugCollection, ldp:DirectContainer;
    ldp:containerResource <bugs/>;
    ldp:containsRelation bt:hasBug;
    bt:hasBug <1>, ...., <300000>.

That adds only 3 triples.  If I were forced to emit ldp:xyz as well, I
would need to add 300,003 triples [1], which is unnecessary in this case
since you can determine ldp:xyz from the membership triples.  I have not
heard a convincing reason why we should force this in the specification.
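
A minimal sketch of that inference, in plain Python rather than any
particular RDF toolkit.  Triples are modelled as tuples of strings; the
function name infer_xyz and the three-bug sample are assumptions made for
illustration only, and "rdf:type" stands in for Turtle's "a".

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# Hypothetical helper: not part of the LDP spec or of any product.

def infer_xyz(triples):
    """Derive (container, ldp:xyz, member) triples from a DirectContainer
    representation that carries only the domain membership triples."""
    triples = set(triples)
    containers = {s for (s, p, o) in triples
                  if p == "rdf:type" and o == "ldp:DirectContainer"}
    inferred = set()
    for c in containers:
        # The resource and predicate the container declares for membership.
        resources = {o for (s, p, o) in triples
                     if s == c and p == "ldp:containerResource"}
        relations = {o for (s, p, o) in triples
                     if s == c and p == "ldp:containsRelation"}
        # Every membership triple yields one ldp:xyz triple.
        for (s, p, o) in triples:
            if s in resources and p in relations:
                inferred.add((c, "ldp:xyz", o))
    return inferred


# Sample data: three bugs instead of 300,000.
representation = {
    ("<bugs/>", "rdf:type", "bt:BugCollection"),
    ("<bugs/>", "rdf:type", "ldp:DirectContainer"),
    ("<bugs/>", "ldp:containerResource", "<bugs/>"),
    ("<bugs/>", "ldp:containsRelation", "bt:hasBug"),
    ("<bugs/>", "bt:hasBug", "<1>"),
    ("<bugs/>", "bt:hasBug", "<2>"),
    ("<bugs/>", "bt:hasBug", "<3>"),
}
extra = infer_xyz(representation)
```

A client that wants the ldp:xyz view computes one extra triple per member
on its side; the server does not have to ship them.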

From what I see:
We all agree on ldp:SimpleContainer -- done
We generally agree on the ldp:DirectContainer approach, though we differ on
which way triples should be inferred (if I can say inferencing).  I don't
even have a problem saying it can work in either direction: if you omit
ldp:xyz, it can be inferred from the membership triples; if you omit the
triples that use the domain-supplied membership predicate, they can be
inferred from ldp:xyz.
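
To illustrate the "either direction" reading, here is a rough Python
sketch that fills in whichever side a representation leaves out.  It
assumes triples are plain string tuples; the function name complete is
hypothetical, and the m:manages data mirrors the example from earlier in
this thread.

```python
# Hypothetical illustration only: fills in the missing side of a
# DirectContainer representation, whichever side that is.

LDP_XYZ = "ldp:xyz"

def complete(triples):
    """Return the triples plus any ldp:xyz or membership triples that can
    be derived from the container's own description."""
    triples = set(triples)
    result = set(triples)
    containers = {s for (s, p, o) in triples
                  if p == "rdf:type" and o == "ldp:DirectContainer"}
    for c in containers:
        resources = {o for (s, p, o) in triples
                     if s == c and p == "ldp:containerResource"}
        relations = {o for (s, p, o) in triples
                     if s == c and p == "ldp:containsRelation"}
        for r in resources:
            for rel in relations:
                for (s, p, m) in triples:
                    if s == r and p == rel:
                        # membership triple present: derive ldp:xyz
                        result.add((c, LDP_XYZ, m))
                    if s == c and p == LDP_XYZ:
                        # ldp:xyz present: derive the membership triple
                        result.add((r, rel, m))
    return result


header = {
    ("<>", "rdf:type", "ldp:DirectContainer"),
    ("<>", "ldp:containerResource", "<>"),
    ("<>", "ldp:containsRelation", "m:manages"),
}
only_membership = header | {("<>", "m:manages", "<doc1>"),
                            ("<>", "m:manages", "<doc2>")}
only_xyz = header | {("<>", LDP_XYZ, "<doc1>"),
                     ("<>", LDP_XYZ, "<doc2>")}
```

Under these assumptions both representations complete to the same set of
triples, which is the sense in which neither side needs to be mandatory.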

- Steve

[1] -
https://jazz.net/products/rational-team-concert/whats-happening#activity
(Information about hosted product with over 300,000 bug entries)


>
>
>
> Henry
>
> > --
> > Arnaud  Le Hors - Software Standards Architect - IBM Software Group
> >
>
> [1]
https://dvcs.w3.org/hg/ldpwg/raw-file/tip/ldp-primer/ldp-primer.html#navandret
> [2]
https://dvcs.w3.org/hg/ldpwg/raw-file/default/ldp-ucr.html#system-and-software-development-tool-integration
>
> >
> > Henry Story <henry.story@bblfish.net> wrote on 12/13/2013 10:58:14 AM:
> >
> > > From: Henry Story <henry.story@bblfish.net>
> > > To: Arnaud Le Hors/Cupertino/IBM@IBMUS,
> > > Cc: Linked Data Platform WG <public-ldp-wg@w3.org>, sysreq@w3.org
> > > Date: 12/13/2013 10:59 AM
> > > Subject: Re: Issue-89, proposal 3: Duplication of triples & inferencing
> > >
> > >
> > > On 13 Dec 2013, at 17:41, Arnaud Le Hors <lehors@us.ibm.com> wrote:
> > >
> > > > Thanks Roger for expressing yourself.
> > > >
> > > > I have to say that I'm surprised by Henry's take on this. First,
> > > the example being discussed isn't new at all. We've been talking
> > > about this from the very beginning. Again, it is very disruptive to
> > > start questioning these so late in the game.
> > >
> > > Things have changed in the spec. We recently introduced the
> > > ldp:SimpleContainer, ldp:DirectContainer, and
> > > the ldp:IndirectContainer. The examples in the primer need to be
> > > adjusted for this new reality. That's all
> > > I am doing.
> > >
> > > > Second, I don't think there is much future in arguing that other
> > > people's use cases aren't valid on the basis that their data isn't
> > > structured properly or any other reasons.
> > >
> > > You misunderstand me. I am not arguing about the validity of the use
> > > case at all.
> > > In fact I showed that the same use case can be achieved with the newly
> > > voted ldp:SimpleContainer in a much more elegant fashion.
> > > Is this not what a Primer should do?
> > >
> > > > People have different needs and the only way forward is to accept
> > > that and try to address them equally rather than disqualify them.
> > > >
> > > > For these reasons I don't see any value in entertaining such a
> > > > discussion.
> > >
> > > Good so given that I was not trying to disqualify a use case, but
> > > show how one could use
> > > the tools we have to improve the implementation of it, can we
> > > continue the discussion?
> > >
> > >
> > > Henry
> > >
> > > > --
> > > > Arnaud  Le Hors - Software Standards Architect - IBM Software Group
> > > >
> > > >
> > > >
> > > >
> > > > From:        Roger Menday <roger.menday@uk.fujitsu.com>
> > > > To:        Henry Story <henry.story@bblfish.net>,
> > > > Cc:        Steve K Speicher <sspeiche@gmail.com>, Arnaud Le Hors/
> > > Cupertino/IBM@IBMUS, Linked Data Platform WG <public-ldp-wg@w3.org>
> > > > Date:        12/13/2013 07:59 AM
> > > > Subject:        Re: Issue-89, proposal 3: Duplication of triples &
> > > inferencing
> > > >
> > > >
> > > >
> > > >
> > > > Henry,
> > > >
> > > > Whilst the path between a product and a bug is probably something like:
> > > >
> > > >    product --hasbug--> bug
> > > >
> > > > ... you are saying:
> > > >
> > > >    product  --bugreportcollection-->  bugcollection  --ldpcontains-->  bug
> > > >
> > > > I have a number of problems with this.
> > > > For a start, this becomes more difficult to query.
> > > >
> > > > thanks,
> > > > Roger
> > > >
> > > >
> > > >
> > > > On 12 Dec 2013, at 22:28, Henry Story wrote:
> > > >
> > > > On 12 Dec 2013, at 21:26, Steve Speicher <sspeiche@gmail.com> wrote:
> > > >
> > > > Hi Henry,
> > > >
> > > > Let me try to reiterate the use case we've discussed.
> > > >
> > > > On Thu, Dec 12, 2013 at 2:01 PM, Henry Story
> > > <henry.story@bblfish.net> wrote:
> > > > >
> > > > >
> > > > > > On 12 Dec 2013, at 19:20, Arnaud Le Hors <lehors@us.ibm.com> wrote:
> > > > >
> > > > > > While true, it's been pointed out before, several times, that
> > > this would fall short of addressing the use case at hand: allowing
> > > one to define a container over existing data by leveraging a domain
> > > specific vocabulary.
> > > > >
> > > > > > I am not sure I understand. The use case is I suppose that one
> > > > > > should be able to publish existing data using LDP. It can't be a
> > > > > > requirement to publish the data in an LDPC in particular.
> > > > > >
> > > > > > It seems obvious that one can publish any data in an LDPR ( that
> > > > > > is not an LDPC of course ). So the use case is satisfied anyway.
> > > > >
> > > > > Can anyone explain in particular why the data MUST be placed in an
> > > > > LDPC?
> > > >
> > > > Because that is how my model was structured, as part of a
> > > > container-like structure, prior to the LDP spec, and I want to apply
> > > > LDP to it.  I shouldn't need to set up an LDPC to the side of my model
> > > > but apply LDP to the model itself.  Take example 6 from the primer[1],
> > > > it is an example of this.
> > > >
> > > > Example 6 from the primer is very badly modelled. It is confusing
> > > the LDPContainer with a product.
> > > > Unless the product is the container itself, in which case it is a
> > > very odd container that has so many bugs.
> > > > (I would not use a container that has so many bugs).
> > > >
> > > > Presumably the  product should be something other than the
> > > container. It would then need
> > > > another LDPG that can describe that product, in order to make it
> > > easy for a client to edit
> > > > ( using PATCH ) without getting confused about relations that are
> > > managed by the container
> > > > - such as ldp:xyz, a.k.a. ldp:contains relations - and those that
> > > are relations about the product
> > > > ( such as its size, its date of creation, its owner, its price,
> > > etc, and that may be managed by human
> > > > managers ).
> > > >
> > > > So  we could have the following resources
> > > >
> > > > </app/product1>           <-   the document about the product
> > > > </app/product1#v1>     <-   the real product
> > > > </app/product1/bugs/> <-   the bugs about the product
> > > >
> > > > You don't even need anything more than a ldp:SimpleContainer as
> > > > I can show below:
> > > >
> > > > The product description can be found with the following
> > > >
> > > > [[
> > > > GET /app/product1
> > > >
> > > > <#v1> a
> > > >     dc:title "Semantic Web For the Working Ontologist";
> > > >     shop:price "38"^^currency:dollars;
> > > >     bt:bugReportCollection <bugs/> .
> > > > ]]
> > > >
> > > > The bugs collection linked from the product can also be found by
> > > > following your nose from above:
> > > >
> > > > [[
> > > > GET /app/product1/bugs/
> > > >
> > > > <../product1> bt:bugReportCollection <> .
> > > > <> a ldp:SimpleContainer;
> > > >      ldp:contains <bugReport1>, <bugReport2>, <bugReport3> .
> > > > ]]
> > > >
> > > >
> > > > • Looking at that  </app/product1/bugs/> collection one can find
> > > what product is the subject of the collection.
> > > > • POSTing to /app/product1/bugs creates new bug reports
> > > > • DELETEing a bug report is the understood way to remove it from the LDPC
> > > > • There is no duplication of triples here anywhere
> > > >
> > > >
> > > > Additional context is that there are a number of data sources that
> > > > expose similar models (servers emitting Linked Data resources).  So
> > > > basically it has the membership predicate bt:hasBug, and you can infer
> > > > ldp:xyz from it.  The approach Henry outlined below is the opposite
> > > > of what I need.  I already know the membership triples.
> > > >
> > > > If I map this to programming languages this seems like what you
> > > are saying is that a List or a Set is not enough
> > > > to represent a collection, that you need the relations in each
> > > list to be a different type of relation.
> > > >
> > > > But usually in programming languages one models such things by
> > > creating an object with an attribute to a collection
> > > > of things. Eg:
> > > >
> > > > {
> > > >   name: "Semantic Web For the Working Ontologist";
> > > >   bugs: [ bug1, bug2, bug3 ];
> > > > }
> > > >
> > > > where the collection [ bug1, bug2, bug3 ] here is a list, where
> > > > the relation from one member of the list to the next is always the
> > > > same relation ( e.g.: rdf:first, rdf:rest ).
> > > >
> > > > Here for some reason you seem to be requiring that each collection
> > > have a different
> > > > type of relation, but that the subject of the collection be an
> > > LDPC and the object be an LDPR.
> > > > There are so many other ways of modelling this that seem to be
> > > better, and that would
> > > > make the specification simpler, and the work of clients easier.
> > > >
> > > >
> > > > We've already covered this use case quite a bit.  I believe for
> > > > DirectContainers, the ldp:xyz could be inferred by those clients
> > > > that need it, instead of placing the burden on all servers to
> > > > explicitly produce these (c, ldp:xyz, mr) triples, which are very
> > > > simple for the clients that need them to produce.
> > > >
> > > > I think I show above that you get what you want without the need
> > > > for DirectContainers, without the need for duplicating relations,
> > > > and without needing clients to do odd inferencing.
> > > >
> > > >
> > > > [1] - https://dvcs.w3.org/hg/ldpwg/raw-file/tip/ldp-primer/ldp-primer.html
> > > >
> > > >
> > > >
> > > >
> > > > - Steve Speicher
> > > >
> > > > >
> > > > >
> > > > > > It's this new relationship that should be inferred.
> > > > > > --
> > > > > > > Arnaud  Le Hors - Software Standards Architect - IBM Software Group
> > > > > >
> > > > > >
> > > > > > > Henry Story <henry.story@bblfish.net> wrote on 12/12/2013 09:27:28 AM:
> > > > > >
> > > > > > > From: Henry Story <henry.story@bblfish.net>
> > > > > > > To: Linked Data Platform WG <public-ldp-wg@w3.org>,
> > > > > > > Date: 12/12/2013 09:31 AM
> > > > > > > > Subject: Issue-89, proposal 3: Duplication of triples & inferencing
> > > > > > >
> > > > > > > > Part 3 of Issue-89 creates a relation ldp:propertiesOnlyResource
> > > > > > > > to allow an LDPC to point in its header to the "membership
> > > > > > > > properties". The reason for this is to avoid so called
> > > > > > > > duplication of triples.
> > > > > > >
> > > > > > > The duplication of triples is an issue mostly for the
> > > > > > > ldp:DirectContainer as is visible for a container such
> > > > > > > as the following
> > > > > > >
> > > > > > > <> a ldp:DirectContainer;
> > > > > > >         ldp:containerResource <>;
> > > > > > >         ldp:containsRelation m:manages;
> > > > > > >     ldp:xyz <doc1>, <doc2>, <doc3> ;
> > > > > > >     m:manages <doc1>, <doc2>, <doc3> .
> > > > > > >
> > > > > > > ( I am using ldp:xyz for what alexander in ISSUE-89 calls
> > > > > > >   ldp:contains. You can replace it without loss here and
> > > > > > >   throughout this e-mail. )
> > > > > > >
> > > > > > > > But according to the rule such as the one used in the
> > > > > > > > Membership wiki [1] it would be very easy to determine the
> > > > > > > > "membership triples" using only the ldp:xyz relations
> > > > > > >
> > > > > > > PREFIX ldp: <http://www.w3.org/ns/ldp#>
> > > > > > >
> > > > > > > CONSTRUCT { ?subject ?predicate ?object }
> > > > > > > > WHERE {
> > > > > > > >  {
> > > > > > > >    ?ldpc a ldp:DirectContainer;
> > > > > > > >         ldp:containerResource ?subject;
> > > > > > > >         ldp:containsRelation ?predicate;
> > > > > > > >         ldp:xyz ?document .
> > > > > > > >    # the POSTed resource is the member
> > > > > > > >    BIND (?document AS ?object)
> > > > > > > >  } UNION {
> > > > > > > >    # ldp:containedByRelation is used
> > > > > > > >    ?ldpc a ldp:DirectContainer;
> > > > > > > >         ldp:containerResource ?object;
> > > > > > > >         ldp:containedByRelation ?predicate;
> > > > > > > >         ldp:xyz ?document .
> > > > > > > >    BIND (?document AS ?subject)
> > > > > > > >  }
> > > > > > > > }
> > > > > > >
> > > > > > > In that case duplication is not a problem at all,
> > > > > > > since a client could just infer the "membership triples"
> > > > > > > from the ldp:xyz ones using that query.
> > > > > > >
> > > > > > > On the other hand if such a rule is not true, and cannot
> > > > > > > be written out, then there is no duplication, since the
> > > > > > > "membership triples" are in fact different triples, and
> > > > > > > have no necessary relation to the ldp:xyz ones.
> > > > > > >
> > > > > > > But then this does give one a good reason for having them in a
> > > > > > > different possibly server managed resource.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > [1] in the Membership wiki "Determining the membership triples
> > > > > > > > to be added when a new member is created"
> > > > > > > > http://www.w3.org/2012/ldp/wiki/Membership#Determining_the_membership_triples_to_be_added_when_a_new_member_is_created
> >
>
> Social Web Architect
> http://bblfish.net/
>
>
Received on Friday, 13 December 2013 21:04:45 UTC