Re: How to avoid that collections "break" relationships (ISSUE-41) from Ruben Verborgh on 2014-06-02 (public-hydra@w3.org from June 2014)

From: Ruben Verborgh <ruben.verborgh@ugent.be>
Date: Mon, 2 Jun 2014 10:17:50 +0200
To: Markus Lanthaler <markus.lanthaler@gmx.net>
Cc: public-hydra@w3.org
Message-Id: <E5A83C06-8686-483B-8694-B77E74252E05@ugent.be>
Hi Markus,

As you've asked for my feedback on this issue before,
below are my thoughts on the wiki page (for which I thank you again).
It's written from the perspective of clients,
i.e., what things clients would be able to do as a result of the chosen solution.
Additionally, I'll comment on things from my Semantic Web background.

TL;DR: linking through generic properties is the way to go.


LINK TO THE COLLECTION VIA A GENERIC PROPERTY

    </alice> schema:knows </alice/friends/>.

rdfs:seeAlso is too generic and doesn't mean anything.
So clients cannot conclude anything from this;
it doesn't really add any functionality.

foaf:topic in that regard is not helpful;
i.e., I would say that
    </alice/friends/> foaf:topic </alice>.
but NOT that
    </alice/friends/> foaf:topic schema:knows.

hydra:hasCollection with "property" and "subject" seems better;
it allows clients to derive that (all?) "alice :knows" triples will be there.
However, "hasCollection" is unnecessary in that case,
because "subject" already gives the needed information.
Hence, the following is sufficient (1):
    </alice/friends> :manages [
        :subject </alice>;
        :property schema:knows;
    ].
(where "manages" might not be the best term).
However, do we need the extra "manages" indirection?
Can't we just add the properties directly to </alice/friends>?
Benefit of "manages": one thing could manage multiple other things.
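
To make the client side of (1) a bit more concrete, here is a rough sketch in Python.
It assumes triples are plain tuples and that the string "_:b0" stands in for the blank node.
":manages", ":subject" and ":property" are the placeholder terms from above,
and find_collections is a hypothetical helper, not anything Hydra defines.

```python
# Triples as plain (subject, predicate, object) tuples.
# "_:b0" is an assumed label for the blank node in (1).
TRIPLES = [
    ("/alice/friends", ":manages", "_:b0"),
    ("_:b0", ":subject", "/alice"),
    ("_:b0", ":property", "schema:knows"),
]

def find_collections(triples, subject, prop):
    """Return the collections that manage `prop` relationships of `subject`."""
    found = []
    for collection, predicate, node in triples:
        if predicate != ":manages":
            continue
        # The managed node must name both the subject and the property.
        if (node, ":subject", subject) in triples and (node, ":property", prop) in triples:
            found.append(collection)
    return found

print(find_collections(TRIPLES, "/alice", "schema:knows"))  # ['/alice/friends']
```

A client only needs to know the property it is interested in; it never has to understand the URL of the collection.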

In that regard, "hasRelationshipIndirection" works too (2):
    </alice> :hasRelationshipIndirection </alice/friends>.
    </alice/friends> :property schema:knows.

The benefit of option (1) is that it is subject-centered.
It says "this is a document with these properties",
which seems easier to reuse in other contexts than a chain (2).
On the other hand, (2) will show up if you enumerate all properties of </alice>.
(But hey, we can equally enumerate all properties where </alice> is object.)
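
The enumeration argument can be illustrated with a small Python sketch.
It assumes triples are plain tuples and "_:b0" is a stand-in for the blank node of (1);
outgoing and incoming are hypothetical helper names.

```python
# Option (1): the collection describes what it manages.
OPTION_1 = [
    ("/alice/friends", ":manages", "_:b0"),
    ("_:b0", ":subject", "/alice"),
    ("_:b0", ":property", "schema:knows"),
]
# Option (2): </alice> links to the collection directly.
OPTION_2 = [
    ("/alice", ":hasRelationshipIndirection", "/alice/friends"),
    ("/alice/friends", ":property", "schema:knows"),
]

def outgoing(triples, resource):
    """All (predicate, object) pairs where `resource` is the subject."""
    return [(p, o) for s, p, o in triples if s == resource]

def incoming(triples, resource):
    """All (subject, predicate) pairs where `resource` is the object."""
    return [(s, p) for s, p, o in triples if o == resource]

print(outgoing(OPTION_2, "/alice"))  # [(':hasRelationshipIndirection', '/alice/friends')]
print(outgoing(OPTION_1, "/alice"))  # []
print(incoming(OPTION_1, "/alice"))  # [('_:b0', ':subject')]
```

So with (1) the link is still discoverable, just by looking at </alice> in the object position.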

The first contra about "bigger payloads" doesn't really seem an issue.
The second contra, well… that's what libraries are for. It's not _that_ hard.



USE OF A BLANK NODE COLLECTION MEMBER TO INDIRECTLY POINT TO THE COLLECTION

    </alice> schema:knows [ hydra:isMemberOf </alice/friends/> ].    (3)

From the client perspective, it makes it possible to find friends of Alice.
However, the semantics are not explicit about this.
It could be that

    </alice> schema:knows [ hydra:isMemberOf </NationalSoccerTeamOfBrazil> ].

In all fairness, the second would not be as "interesting" to include in the response.

Extra con: this breaks if </alice/friends> has no members,
because (3) implies there is at least one member.

About the con "introduces an undesired triple": let's not forget it is an open world.
If you see:
    </alice> schema:knows </bob>, </conny>, </dennis>, </edna>.
How many friends does Alice have?
Only correct answer: "at least one",
because those 4 URIs could point to the same resource.
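
A small Python sketch of that counting argument, under the assumption that
co-reference is expressed as pairs of URIs (for instance from owl:sameAs statements).
The pairs below are made up for illustration and count_distinct is a hypothetical helper.

```python
def count_distinct(uris, same_as_pairs):
    """Count resources after merging URIs that are declared identical."""
    parent = {u: u for u in uris}

    def find(u):
        # Follow parent links until we reach the representative URI.
        while parent[u] != u:
            u = parent[u]
        return u

    for a, b in same_as_pairs:
        parent[find(a)] = find(b)
    return len({find(u) for u in uris})

friends = ["/bob", "/conny", "/dennis", "/edna"]

# With no co-reference statements we only know there are 4 names.
print(count_distinct(friends, []))  # 4

# Hypothetical statements that make all four names denote one resource.
all_same = [("/bob", "/conny"), ("/conny", "/dennis"), ("/dennis", "/edna")]
print(count_distinct(friends, all_same))  # 1
```

The 4 is only an upper bound: any statement that arrives later can lower it, down to 1.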



USE OF A SEPARATE PROPERTY TO REFERENCE COLLECTIONS

This unnecessarily creates a lot of properties.
I'm not in favor of this solution.
Clients that can interpret "foaf:knows"
should now find a way to interpret an additional property.

(Sidenote: it does emphasize how difficult it is to talk about collections in RDF.
RDF was really designed for individuals.
I would love it if RDF 2.0 or something extended the model with collections, like so:
    </alice> foaf:knows </bob>.
    </alice> *foaf:knows </alice/friends>.
But until that happens, let's not invent *foaf:knows ourselves.)

As far as the "use plural property names" idea is concerned: no.
URIs on the Web are opaque, end of story.
Clients should not attempt dangerous things.


USE OF AN OPERATION WITH AN EXPLICITLY DEFINED TARGET

No. We should be describing the declarative semantics here,
not the operational semantics, for similar reasons including this one:
operational can be inferred from declarative, but not the other way round.
The only thing clients can do here is GET the thing,
but nothing more interesting. We shouldn't write the scenarios for clients.




MY PERSONAL CONCLUSION

It is my opinion that these two solutions are the best:

    </alice/friends> :manages [
        :subject </alice>;
        :property schema:knows;
    ].

    </alice> :hasRelationshipIndirection </alice/friends>.
    </alice/friends> :property schema:knows.

Both allow clients to find the friends easily
based on a property that they are interested in.

I'm in favor of the first, as the blank node there
describes a resource that is reusable and extensible.
However, in both cases, we have to find the optimal terminology.
In the first case, this comes down to identifying what the blank node is.

Terminology suggestion (just to help us think):
    </alice/friends> :isCollectionOf [
        rdf:type :CollectionItemTemplate;
        :subject </alice>;
        :property schema:knows;
    ].
(Note that the rdf:type could be hidden.)

This would also account for cases where </alice> is the object:
    </alice/followers> :isCollectionOf [
        rdf:type :CollectionItemTemplate;
        :property schema:follows;
        :object </alice>;
    ].

The semantics would then mandate that:
- for a given template, minimum 1 and maximum 2 components should be used
- the collection contains all items of the dataset that match the template (possibly paged)
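
One possible reading of these two rules, sketched in Python.
It assumes a template is a dict with the keys "subject", "property" and "object",
that the dataset is a list of plain tuples, and that "items" means the matching triples;
matching_triples is a hypothetical name and paging is left out.

```python
POSITIONS = ("subject", "property", "object")

def matching_triples(dataset, template):
    """Return every triple in `dataset` that matches the bound template components."""
    bound = {k: v for k, v in template.items() if k in POSITIONS}
    # Rule 1: minimum 1 and maximum 2 components.
    if not 1 <= len(bound) <= 2:
        raise ValueError("a template needs one or two components")
    # Rule 2: the collection contains all items that match.
    return [
        triple for triple in dataset
        if all(triple[POSITIONS.index(k)] == v for k, v in bound.items())
    ]

DATASET = [
    ("/alice", "schema:knows", "/bob"),
    ("/alice", "schema:knows", "/conny"),
    ("/bob", "schema:follows", "/alice"),
]

friends = matching_triples(DATASET, {"subject": "/alice", "property": "schema:knows"})
followers = matching_triples(DATASET, {"property": "schema:follows", "object": "/alice"})
print(friends)    # the two schema:knows triples of /alice
print(followers)  # [('/bob', 'schema:follows', '/alice')]
```

The same function covers both the </alice/friends> and the </alice/followers> case; only the bound positions differ.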

Best,

Ruben
Received on Monday, 2 June 2014 08:18:24 UTC