
Re: How to avoid that collections "break" relationships

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Tue, 25 Mar 2014 11:34:15 -0700
Cc: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Markus Lanthaler <markus.lanthaler@gmx.net>, public-hydra@w3.org, public-lod@w3.org, public-vocabs@w3.org
Message-Id: <1246AED7-A04E-4909-AF56-66E94172819F@greggkellogg.net>
To: Jason Douglas <jasondouglas@google.com>
On Mar 25, 2014, at 11:05 AM, Jason Douglas <jasondouglas@google.com> wrote:

> Well-summarized, Gregg.
> 
> On Tue Mar 25 2014 at 10:46:43 AM, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> Hi Peter,
> 
> On Mar 25, 2014, at 9:49 AM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> > Let's see if I have this right.
> >
> > You are encountering a situation where the number of people Markus knows is too big (somehow).  The proposed solution is to move this information to a separate location. I don't see how this helps in reducing the size of the information, which was the initial problem.
> 
> From my perspective, this is really a clash between the notions of the use of URIs in RDF to denote entities, and relative URIs in many REST applications to denote relationships. In my experience, a RESTful web application may use a URI relative to an entity's location as a way to access related entities; this is a common pattern in Ruby on Rails. For example:
> 
> http://example/users/1
> 
> In many systems, this would be served by a controller where _1_ is taken to be a primary key for a related SQL table, in this case a Users table. If users are joined together using a many-to-many relationship, a convention I can use in my application is to construct a "route", such as the following:
> 
> http://example/users/1/knows/
> 
> Which might be semantically equivalent (within the application logic) to http://example/knows?user_id=1. The controller may then query the join table where one column (say src_id) is _1_, so that results find related entities based on another column in the join table (say dest_id). The application may then return all records in a single request, or a subset of those records through pagination.
> 
> Many developers will want to be able to publish information about their datasets using a vocabulary such as schema.org. Given that an entity may contain many relationships, it is not feasible to create a single entity description with all of the members of these relationships enumerated. For example, a User entity may have parents, children, friends (knows), likes, comments, photos, ... Moreover, these relationships are bi-directional (a user asserts a knows relationship with another user, and is known by other users). In a prototypical Rails application, this works because a page rendered for a user contains controls to access these relationships. How does the developer of such an application capture these semantics using something like schema.org? As it stands in Hydra now, these relationships might be described as follows:
> 
> <.../markus/> a schema:Person;
>   schema:knows <.../markus/knows>;
>   ...
> 
> However, as Markus points out, the <.../markus/knows> resource likely returns a collection, rather than a person. This isn't a show-stopper for schema.org, because schema:knows does not use rdfs:range but schema:rangeIncludes, which does not cause an inference that <.../markus/knows> is a schema:Person. But the same logic should work for something such as FOAF, where it would create such a contradiction.
> 
> The challenge for a developer is to come up with entity markup that has a good chance of being understood for SEO purposes, and does not create so high a conceptual barrier for the developer that they just don't attempt it. I think it is our responsibility to provide best practices for marking up entities used in such applications in a simple way that does not clash with RDF expectations, where any URI used in the range of schema:knows is expected to be a person and not a collection.
> 
> > Splitting this information into pieces might help. schema.org, along with just about every other RDF syntax, does not require that all the information about a particular entity is in the same spot. The problem then is to ensure that all the information is accessed together.
> >
> > schema.org, somewhat separate from other RDF syntaxes, does have facilities for this.  All you need to do is to set up multiple pages, for example
> > .../markus1 through .../markusn
> > and on each of these pages include schema.org markup with content like
> > <.../markusi> schema:url <.../markus>
> > <.../markus> schema:knows <.../friendi1>
> > ...
> > <.../markus> schema:knows <.../friendimi>
> > Then on .../markus you have
> > <.../markus> schema:url <.../markus1>
> > ...
> > <.../markus> schema:url <.../markusn>
> > (Maybe schema:sameAs is a better relationship to use here, but they both should work.)
> >
> > Voila! (With the big proviso that I have no idea whether the schema.org processors actually do the right thing here, as there is no indication of what they do do.)
> 
> The problem is that, if this is to drive application logic, as is the intent of Hydra, how do you know which URI to dereference if you're interested in schema:knows, or schema:children, or schema:parent, or schema:comment, or whatever the interesting relationship is?
> 
> I think there are two ways out of this:
> 
> 1) schema.org can break the relationship expectation model by specifically allowing, say, an ItemList to be the value of any property with the intent that it provide such an indirection, and damn the RDF consequences.
> 
> There's a similar debate going on about Roles.  For example, allowing both a simple actor property on a movie vs. hasRole --> <role/123> --> actor.  The mediation would be for attaching additional properties like 'character'.

This sounds like the discussion about adding something like schema:contribution, which would be used instead of schema:actor. Perhaps you have a similar notion where a schema:actor could somehow reference an intermediate object, similar to the collection here, which would ultimately access the actor. I wonder if there are many other non-container cases where this pattern fits?

> I suppose collections could do something similar like:
> hasCollection --> <markus/friends> --> knows --> <friend1>

I think it's necessary to provide some more insight into the hasCollection property, so that if it were used with a URI value in a Person entity, you'd know that it was a collection of schema:Person. Speaking just of the schema.org case right now, an extension such as schema:seeAlso/knows or schema:knows/collection could be defined to have these semantics. In this sense, it's similar to Pat's suggestion (or my own, where I suggested (ab)using the equivalent plural and singular versions of, say, schema:colleague and schema:colleagues).

<../markus> a schema:Person;
  schema:knows/collection <../markus/knows>;

redirects to <../markus/knows?page=1>

<../markus/knows?page=1> a hydra:Collection;
  hydra:member <../gregg>, ...

You could also assert

<../markus> schema:knows <../gregg>

in the collection; easier to do with JSON-LD or RDFa.
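
As a rough illustration of what re-asserting the direct relationship could look like on the consuming side, here is a minimal Python sketch. It assumes an in-memory dictionary standing in for the HTTP responses, and a "next" key for paging; neither is part of Hydra or schema.org, and the function name is made up for the example.

```python
# Hypothetical stand-in for dereferencing the paged collection; a real client
# would fetch and parse each page instead of looking it up in a dict.
PAGES = {
    "/markus/knows?page=1": {"members": ["/gregg", "/alice"],
                             "next": "/markus/knows?page=2"},
    "/markus/knows?page=2": {"members": ["/zorro"], "next": None},
}


def direct_triples(subject, predicate, first_page, pages=PAGES):
    """Walk a paged collection and re-assert each member as a direct triple."""
    triples = []
    seen = set()
    url = first_page
    while url is not None and url not in seen:
        seen.add(url)  # guard against a cyclic "next" link
        page = pages[url]
        for member in page["members"]:
            triples.append((subject, predicate, member))
        url = page["next"]
    return triples


triples = direct_triples("/markus", "schema:knows", "/markus/knows?page=1")
```

The point is only that the collection pages carry enough information to recover the plain <../markus> schema:knows <../gregg> statements, so the indirection need not lose the relationship.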

Unfortunately, this doesn't extend to an arbitrary vocabulary without minting a new collection predicate for each multiply-valued object property. But, it seems like the closest thing to preserving the fingerprint of an entity description. The supportedOperation below is more accurate, but probably impractical.
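
To make the trade-off concrete, here is a small Python sketch of the RDFS range rule (rdfs3) applied to plain tuples. It assumes a hypothetical ex:knowsCollection predicate with no declared range, standing in for something like schema:knows/collection; no RDF library is used, and the names are illustrative only.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(triples):
    """RDFS range rule: (p rdfs:range C) and (s p o) entail (o rdf:type C)."""
    ranges = {}
    for s, p, o in triples:
        if p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in triples:
        for cls in ranges.get(p, ()):
            inferred.add((o, RDF_TYPE, cls))
    return inferred


vocabulary = [("foaf:knows", RDFS_RANGE, "foaf:Person")]

# The collection is used directly as the value of foaf:knows.
direct = vocabulary + [("/markus", "foaf:knows", "/markus/knows")]

# The collection hangs off a separate (hypothetical) predicate instead.
indirect = vocabulary + [
    ("/markus", "ex:knowsCollection", "/markus/knows"),
    ("/markus", "foaf:knows", "/gregg"),
]

wrong = infer_range_types(direct)    # types the collection as a person
fine = infer_range_types(indirect)   # types only the actual person
```

With the separate predicate, the only inferred type is on </gregg>; the cost, as noted, is one such predicate per multiply-valued property.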

Gregg

> As you say, it's a "damn the RDF consequences" approach, but given that we're already in rangeIncludes land and it's possible to write a reasonable processor for that pattern, maybe it's the lesser of the evils.
>  
> 2) use something like an operation, that describes these relationships, but has less of a chance of being useful for SEO. For example:
> 
> <../markus/> a foaf:Person;
>  hydra:supportedOperation [
>    a GetRelatedCollectionOperation;
>    hydra:title "Get known relations";
>    hydra:description "Retrieves a collection of foaf:Person related to the subject through foaf:knows";
>    hydra:property foaf:knows;
>    hydra:uri <../markus/knows>;
>    hydra:method "GET";
>    hydra:returns foaf:Person
>  ] .
> 
> Gregg
> 
> > peter
> >
> > PS:  LDP??
> >
> > On 03/24/2014 08:24 AM, Markus Lanthaler wrote:
> >> Hi all,
> >>
> >> We have an interesting discussion in the Hydra W3C Community Group [1]
> >> regarding collections and would like to hear more opinions and ideas. I'm
> >> sure this is an issue a lot of Linked Data applications face in practice.
> >>
> >> Let's assume we want to build a Web API that exposes information about
> >> persons and their friends. Using schema.org, your data would look somewhat
> >> like this:
> >>
> >>   </markus> a schema:Person ;
> >>             schema:knows </alice> ;
> >>             ...
> >>             schema:knows </zorro> .
> >>
> >> All this information would be available in the document at /markus (please
> >> let's not talk about hash URLs etc. here, ok?). Depending on the number of
> >> friends, the document however may grow too large. Web APIs typically solve
> >> that by introducing an intermediary (paged) resource such as
> >> /markus/friends/. In Schema.org we have ItemList to do so:
> >>
> >>   </markus> a schema:Person ;
> >>             schema:knows </markus/friends/> .
> >>
> >>   </markus/friends/> a schema:ItemList ;
> >>             schema:itemListElement </alice> ;
> >>             ...
> >>             schema:itemListElement </zorro> .
> >>
> >> This works, but has two problems:
> >>   1) it breaks the /markus --[knows]--> /alice relationship
> >>   2) it says that /markus --[knows]--> /markus/friends
> >>
> >> While 1) can easily be fixed, 2) is much trickier--especially if we consider
> >> cases that don't use schema.org with its "weak semantics" but a vocabulary
> >> that uses rdfs:range, such as FOAF. In that case, the statement
> >>
> >>   </markus> foaf:knows </markus/friends/> .
> >>
> >> and the fact that
> >>
> >>   foaf:knows rdfs:range foaf:Person .
> >>
> >> would lead to the "wrong" inference that /markus/friends is a foaf:Person.
> >>
> >> How do you deal with such cases?
> >>
> >> How is schema.org intended to be used in cases like these? Is the above use
> >> of ItemList sensible or is this something that should better be avoided?
> >>
> >>
> >> Thanks,
> >> Markus
> >>
> >>
> >> P.S.: I'm aware of how LDP handles this issue, but, while I generally like
> >> the approach it takes, I don't like the fact that it imposes a specific
> >> interaction model.
> >>
> >>
> >> [1] http://bit.ly/HydraCG
> >>
> >>
> >>
> >> --
> >> Markus Lanthaler
> >> @markuslanthaler
> >>
> >>
> >>
> >>
> >>
> >
> >
> 
> 
Received on Tuesday, 25 March 2014 18:34:48 UTC
