Re: How to avoid that collections "break" relationships from Jason Douglas on 2014-03-25 (public-vocabs@w3.org from March 2014)

From: Jason Douglas <jasondouglas@google.com>
Date: Tue, 25 Mar 2014 18:05:01 +0000
To: gregg@greggkellogg.net, pfpschneider@gmail.com
Cc: markus.lanthaler@gmx.net, public-hydra@w3.org, public-lod@w3.org, public-vocabs@w3.org
Message-ID: <CAEiKvUAk2+SgGaEdpp42UZjF6MMWuvxDqBHAcpN3Njv=PKO-eQ@mail.gmail.com>
Well-summarized, Gregg.

On Tue Mar 25 2014 at 10:46:43 AM, Gregg Kellogg <gregg@greggkellogg.net>
wrote:

> Hi Peter,
>
> On Mar 25, 2014, at 9:49 AM, Peter F. Patel-Schneider <
> pfpschneider@gmail.com> wrote:
>
> > Let's see if I have this right.
> >
> > You are encountering a situation where the number of people Markus knows
> is too big (somehow).  The proposed solution is to move this information to
> a separate location. I don't see how this helps in reducing the size of the
> information, which was the initial problem.
>
> From my perspective, this is really a clash between the notions of the use
> of URIs in RDF to denote entities, and relative URIs in many REST
> applications to denote relationships. In my experience, a RESTful web
> application may use a URI relative to an entity's location as a way to
> access related entities; this is a common pattern in Ruby on Rails. For
> example:
>
> http://example/users/1
>
> In many systems, this would be served by a controller where _1_ is taken
> to be a primary key for a related SQL table, in this case a Users table. If
> users are joined together using a many-to-many relationship, a convention I
> can use in my application is to construct a "route", such as the following:
>
> http://example/users/1/knows/
>
> Which might be semantically equivalent (within the application logic) to
> http://example/knows?user_id=1. The controller may then query the join
> table where one column (say src_id) is _1_, so that results find related
> entities based on another column in the join table (say dest_id). The
> application may then return all records in a single request, or a subset of
> those records through pagination.
>
> Many developers will want to be able to publish information about their
> datasets using a vocabulary such as schema.org. Given that an entity may
> contain many relationships, it is not feasible to create a single entity
> description with all of the members of these relationships enumerated. For
> example, a User entity may have parents, children, friends (knows), likes,
> comments, photos, ... Moreover, these relationships are bi-directional (a
> user asserts a knows relationship with another user, and is known by other
> users). In a prototypical Rails application, this works because a page
> rendered for a user contains controls to access these relationships. How
> does the developer of such an application capture these semantics using
> something like schema.org? As it stands in Hydra now, these relationships
> might be described as follows:
>
> <.../markus/> a schema:Person;
>   schema:knows <.../markus/knows>;
>   ...
>
> However, as Markus points out, the <.../markus/knows> resource likely
> returns a collection, rather than a person. This isn't a show-stopper for
> schema.org, because schema:knows does not use rdfs:range, but
> schema:rangeIncludes, which does not cause an inference that
> <.../markus/knows> is a schema:Person. The same pattern should also work
> for something such as FOAF, though, where rdfs:range would produce exactly
> that unwanted inference.
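
To make the difference concrete, here is a minimal sketch of the RDFS range rule (if `p rdfs:range C` and `s p o`, then `o rdf:type C`). It assumes triples are plain Python tuples of prefixed-name strings, uses no RDF library, and the function name `infer_range_types` is made up for illustration. It is not how any particular reasoner is implemented.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(data, schema):
    """Apply the RDFS range rule to `data` using range axioms found in `schema`."""
    # Collect the declared rdfs:range classes per property.
    ranges = {}
    for s, p, o in schema:
        if p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)

    # Every object of a property with a declared range gets typed with that range.
    inferred = set()
    for s, p, o in data:
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    return inferred


schema = [
    ("foaf:knows", RDFS_RANGE, "foaf:Person"),
    # schema:rangeIncludes is only a hint; the rule above deliberately ignores it.
    ("schema:knows", "schema:rangeIncludes", "schema:Person"),
]
data = [
    ("/markus", "foaf:knows", "/markus/knows"),
    ("/markus", "schema:knows", "/markus/knows"),
]

inferred = infer_range_types(data, schema)
print(inferred)
```

Only the foaf:knows statement types the collection resource as a person; the schema:knows statement yields nothing.
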
>
> The challenge for a developer is to come up with entity markup that has a
> good chance of being understood for SEO purposes, and does not create so
> high a conceptual barrier for the developer that they just don't attempt
> it. I think it is our responsibility to provide best practices for marking
> up entities used in such applications in a simple way that does not clash
> with RDF expectations, where any URI used in the range of schema:knows is
> expected to be a person and not a collection.
>
> > Splitting this information into pieces might help. schema.org, along
> with just about every other RDF syntax, does not require that all the
> information about a particular entity is in the same spot. The problem then
> is to ensure that all the information is accessed together.
> >
> > schema.org, somewhat separate from other RDF syntaxes, does have
> facilities for this.  All you need to do is to set up multiple pages, for
> example
> > .../markus1 through .../markusn
> > and on each of these pages include schema.org markup with content like
> > <.../markusi> schema:url <.../markus>
> > <.../markus> schema:knows <.../friendi1>
> > ...
> > <.../markus> schema:knows <.../friendimi>
> > Then on .../markus you have
> > <.../markus> schema:url <.../markus1>
> > ...
> > <.../markus> schema:url <.../markusn>
> > (Maybe schema:sameAs is a better relationship to use here, but they both
> should work.)
> >
> > Voila! (With the big proviso that I have no idea whether the schema.org
> processors actually do the right thing here, as there is no indication of
> what they do do.)
>
> The problem is, that if this is to drive application logic, as is the
> intent of Hydra, how to know what URI to dereference if you're interested
> in schema:knows, or schema:children, or schema:parent, or schema:comment,
> or whatever the interesting relationship is?
>
> I think there are two ways out of this:
>
> 1) schema.org can break the relationship expectation model by
> specifically allowing, say, an ItemList to be the value of any property
> with the intent that it provide such an indirection, and damn the RDF
> consequences.
>

There's a similar debate going on about Roles.  For example, allowing both
a simple actor property on a movie vs. hasRole --> <role/123> --> actor.
 The mediation would be for attaching additional properties like
'character'.

I suppose collections could do something similar like:
hasCollection --> <markus/friends> --> knows --> <friend1>

As you say, it's a "damn the RDF consequences" approach, but given that
we're already in rangeIncludes land and it's possible to write a reasonable
processor for that pattern, maybe it's the lesser of the evils.
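
As a rough sketch of what such a processor could look like: it follows the hypothetical `hasCollection` link and hoists selected properties of the collection back onto the owner, recovering the direct relationship. The tuple representation, the function name `flatten_collections`, and the `hoist` parameter are assumptions made for illustration; nothing here is defined by schema.org or Hydra.

```python
HAS_COLLECTION = "hasCollection"  # hypothetical property name, as used above


def flatten_collections(triples, hoist):
    """Derive direct `owner p member` triples from owner -> collection -> member.

    `hoist` is the set of properties that may be copied from a collection to
    its owner, so that bookkeeping triples on the collection are left alone.
    """
    # Map each collection to the resources that point at it via hasCollection.
    owners = {}
    for s, p, o in triples:
        if p == HAS_COLLECTION:
            owners.setdefault(o, set()).add(s)

    # Copy the selected properties from each collection to its owners.
    derived = set()
    for s, p, o in triples:
        if p in hoist and s in owners:
            for owner in owners[s]:
                derived.add((owner, p, o))
    return derived


triples = [
    ("<markus>", HAS_COLLECTION, "<markus/friends>"),
    ("<markus/friends>", "rdf:type", "schema:ItemList"),
    ("<markus/friends>", "knows", "<friend1>"),
    ("<markus/friends>", "knows", "<friend2>"),
]

derived = flatten_collections(triples, hoist={"knows"})
print(sorted(derived))
```

Restricting the hoisted properties keeps the collection's own type from leaking onto the person, which is the kind of mistake the range discussion above is worried about.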


> 2) use something like an operation, that describes these relationships,
> but has less of a chance of being useful for SEO. For example:
>
> <../markus/> a foaf:Person;
>  hydra:supportedOperation [
>    a GetRelatedCollectionOperation;
>    hydra:title "Get known relations";
>    hydra:description "Retrieves a collection of foaf:Person related to the
> subject through foaf:knows";
>    hydra:property foaf:knows;
>    hydra:uri <../markus/knows>;
>    hydra:method "GET";
>    hydra:returns foaf:Person
>  ] .
>
> Gregg
>
> > peter
> >
> > PS:  LDP??
> >
> > On 03/24/2014 08:24 AM, Markus Lanthaler wrote:
> >> Hi all,
> >>
> >> We have an interesting discussion in the Hydra W3C Community Group [1]
> >> regarding collections and would like to hear more opinions and ideas.
> I'm
> >> sure this is an issue a lot of Linked Data applications face in
> practice.
> >>
> >> Let's assume we want to build a Web API that exposes information about
> >> persons and their friends. Using schema.org, your data would look
> somewhat
> >> like this:
> >>
> >>   </markus> a schema:Person ;
> >>             schema:knows </alice> ;
> >>             ...
> >>             schema:knows </zorro> .
> >>
> >> All this information would be available in the document at /markus
> (please
> >> let's not talk about hash URLs etc. here, ok?). Depending on the number
> of
> >> friends, the document however may grow too large. Web APIs typically
> solve
> >> that by introducing an intermediary (paged) resource such as
> >> /markus/friends/. In Schema.org we have ItemList to do so:
> >>
> >>   </markus> a schema:Person ;
> >>             schema:knows </markus/friends/> .
> >>
> >>   </markus/friends/> a schema:ItemList ;
> >>             schema:itemListElement </alice> ;
> >>             ...
> >>             schema:itemListElement </zorro> .
> >>
> >> This works, but has two problems:
> >>   1) it breaks the /markus --[knows]--> /alice relationship
> >>   2) it says that /markus --[knows]--> /markus/friends
> >>
> >> While 1) can easily be fixed, 2) is much trickier--especially if we
> consider
> >> cases that don't use schema.org with its "weak semantics" but a
> vocabulary
> >> that uses rdfs:range, such as FOAF. In that case, the statement
> >>
> >>   </markus> foaf:knows </markus/friends/> .
> >>
> >> and the fact that
> >>
> >>   foaf:knows rdfs:range foaf:Person .
> >>
> >> would lead to the "wrong" inference that /markus/friends is a
> foaf:Person.
> >>
> >> How do you deal with such cases?
> >>
> >> How is schema.org intended to be used in cases like these? Is the
> above use
> >> of ItemList sensible or is this something that should better be avoided?
> >>
> >>
> >> Thanks,
> >> Markus
> >>
> >>
> >> P.S.: I'm aware of how LDP handles this issue, but, while I generally
> like
> >> the approach it takes, I don't like the fact that it imposes a specific
> >> interaction model.
> >>
> >>
> >> [1] http://bit.ly/HydraCG
> >>
> >>
> >>
> >> --
> >> Markus Lanthaler
> >> @markuslanthaler
> >>
> >>
> >>
> >>
> >>
> >
> >
>
>
>
Received on Tuesday, 25 March 2014 18:05:34 UTC