Re: remove hydra:Resource and hydra:Class (ISSUE-90) from John Walker on 2015-01-13 (public-hydra@w3.org from January 2015)

From: John Walker <john.walker@semaku.com>
Date: Tue, 13 Jan 2015 14:53:26 +0100 (CET)
To: Ruben Verborgh <ruben.verborgh@ugent.be>, Miel Vander Sande <miel.vandersande@ugent.be>
Cc: public-hydra@w3.org, Markus Lanthaler <markus.lanthaler@gmx.net>
Message-ID: <2056132519.2863586.1421157206603.open-xchange@oxweb03.eigbox.net>
Hi Miel,

> On January 13, 2015 at 10:11 AM Miel Vander Sande <miel.vandersande@ugent.be>
> wrote:
> 
>  In general, I would say an application determines the next action based on
> predicate semantics, rather than types. If a client comes across <x>
> rdfs:seeAlso <y>, it's the seeAlso that makes it dereference <y>, not <y>
> itself. Maybe <y> is an identifier now, but a resource later. Whatever
> efficiency you want to provide, it is more explicit to model it like that.
> 
> 

Thinking of this in terms of HTML documents, a person is pretty handy at deciding
which links to follow and which links are ads/spam or just not interesting, based
on the context in which the link appears, but computers less so.

Whilst I agree the predicate semantics would be the primary way to determine
the next action, it is useful to be able to indicate that some links are more
interesting to follow than others. If a client comes across <x> rdfs:seeAlso
<foo> , <bar> , <baz> the server might want to give the hint that <foo> and <bar>
are more useful to follow than <baz>. Of course that could be achieved by using
different predicates, but could also be done by indicating the type of the
resource being linked to.
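
A minimal sketch of that kind of hint, purely for illustration: the `Link`
class, the example.org URLs and the ranking rule are assumptions, not anything
the Hydra specification defines. It simply sorts the rdfs:seeAlso targets so
that the ones the server typed as hydra:Resource are tried first.

```python
from dataclasses import dataclass, field

HYDRA_RESOURCE = "http://www.w3.org/ns/hydra/core#Resource"


@dataclass
class Link:
    """Hypothetical container for one rdfs:seeAlso target and its rdf:types."""
    target: str
    types: set = field(default_factory=set)


def rank_links(links):
    """Put targets typed as hydra:Resource first.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(links, key=lambda link: HYDRA_RESOURCE not in link.types)


links = [
    Link("http://example.org/baz"),
    Link("http://example.org/foo", {HYDRA_RESOURCE}),
    Link("http://example.org/bar", {HYDRA_RESOURCE}),
]
ordered = [link.target for link in rank_links(links)]
```

Nothing here says <baz> is not dereferenceable; the type only changes the order
in which a client might choose to look.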

To use Ruben's example, it might be nice that a crawler receives a document with
35 resources with an HTTP(S) URL and decides that none, some or all of the links
are worth dereferencing. Whether they are actually dereferenceable or not is not
known at that point and arguably doesn't make much difference.
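
To make the numbers concrete, here is a small simulation of that document. It is
only a sketch under assumptions: the `crawl` function, the `skip_unhinted` flag
and the example.org URLs are invented for illustration, and each GET is just
counted rather than performed. An exhaustive crawler reproduces Ruben's totals
with or without hints, while a crawler that chooses to follow only the hinted
links makes fewer requests and accepts that it may miss some resources.

```python
def crawl(urls, hinted, dereferenceable, skip_unhinted=False):
    """Return (GET requests made, resources successfully dereferenced)."""
    gets = 0
    dereferenced = 0
    for url in urls:
        if skip_unhinted and url not in hinted:
            continue  # the crawler decides this link is not worth a request
        gets += 1  # one simulated GET
        if url in dereferenceable:
            dereferenced += 1
    return gets, dereferenced


urls = [f"http://example.org/r{i}" for i in range(35)]
hinted = set(urls[:30])           # 30 labelled hydra:Resource
dereferenceable = set(urls[:33])  # those 30 plus 3 of the other 5

with_hints = crawl(urls, hinted, dereferenceable)      # situation 1
without_hints = crawl(urls, set(), dereferenceable)    # situation 2
selective = crawl(urls, hinted, dereferenceable, skip_unhinted=True)
```

Here `with_hints` and `without_hints` both come out as (35, 33), matching the
totals in the quoted message, and `selective` comes out as (30, 30): five fewer
requests, at the cost of the three unlabelled resources that would have
dereferenced.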



John

>  Cheers,
> 
>  Miel
> 
> 
> 
>  On Jan 13, 2015, at 9:51 AM, Ruben Verborgh <ruben.verborgh@ugent.be> wrote:
> 
> 
> >      Hi Markus,
> >
> >      Thanks for summarizing.
> >
> >      I'll just react on two things that IMHO are not technically correct.
> >
> >
> > >          Efficiency and performance improvements on the client and
> > >          reduced load on the server. As Tomasz says
> > >
> > >          On 12 Jan 2015 at 21:26, Tomasz Pluskiewicz wrote:
> > >
> > > >              However the reason for hydra:Class as described in
> > > >              the specification is the without guidance the client
> > > >              would have to blindly try dereferencing everything.
> > > >
> > >          On 12 Jan 2015 at 22:22, Ruben Verborgh wrote:
> > >
> > > >              That reason is incorrect.
> > > >
> > >          I don't think so.
> > >
> > >
> > > >              A client's need to dereference something is not
> > > >              influenced in any way by marking that something as
> > > >              hydra:Class.
> > > >              Only a mechanism that says something is *not*
> > > >              dereferenceable can have such an influence.
> > > >
> > >          It's not about the "need". Think of something simpler,
> > >          something like a crawler. It just follows hyperlinks. It
> > >          shouldn't have to try to dereference every identifier it
> > >          finds..
> > >
> >      We shouldn't forget that an absence of hydra:Resource does not
> >      mean anything.
> >      To clarify matters, I will discuss such a crawler in two situations.
> > 
> > 
> >      SITUATION 1: WITH HYDRA:RESOURCE
> > 
> >      The crawler receives a document with 35 resources with an HTTP(S) URL.
> >      30 of them are explicitly labeled with type hydra:Resource.
> > 
> >      In order to find out whether the other 5 resources are dereferenceable,
> >      the crawler has to perform 5 GET requests.
> >      3 of them dereference, 2 do not.
> > 
> >      In order to dereference the 30 remaining resources,
> >      the crawler has to perform 30 GET requests.
> >      The 5 others have been dereferenced already.
> > 
> >      Total GET requests: 35. Dereferenced: 33.
> > 
> > 
> >      SITUATION 2: WITHOUT HYDRA:RESOURCE
> > 
> >      The crawler receives a document with 35 resources with an HTTP(S) URL.
> >      None of them have type hydra:Resource.
> > 
> >      In order to find out whether the 35 resources are dereferenceable,
> >      the crawler has to perform 35 GET requests.
> >      By the same action, 33 resources are dereferenced.
> > 
> >      Total GET requests: 35. Dereferenced: 33.
> > 
> > 
> >      Same result in both cases. Note in particular how hydra:Resource
> >      did not prevent us from having to check dereferenceability.
> > 
> > 
> > 
> > > >              But then again, what tangible benefit does this give?
> > > >
> > > >              To dereference it, you must GET it.
> > > >              To check whether it is dereferenceable, you must GET it.
> > > >
> > >          To know whether it's worth (or expected) to being checked.
> > >
> > >          Ever tried to dereference xsd:integer to get its definition?
> > >
> >      I cannot tell whether I should by the absence of hydra:Resource.
> >      hydra:Resource does _not_ solve non-dereferenceable resources,
> >      and, perhaps surprisingly, thus also not even dereferenceable
> >      resources.
> >      Indeed, every resource that is _not_ labeled with hydra:Resource,
> >      but still dereferences, is by definition a hydra:Resource.
> > 
> >      So “whether it's worth being checked” cannot be derived
> >      from the presence of hydra:Resource,
> >      as all (HTTP[S]) resources are worth being checked
> >      until indicated otherwise—which hydra:Resource cannot.
> > 
> >      Furthermore, “expected” to be checked is not part of its definition.
> >      Again, everything that is dereferenceable is by definition
> >      a hydra:Resource,
> >      even though it's not indicated. Should we then check them all?
> > 
> >      In other words: all cases where hydra:Resource is _not_ mentioned
> >      is actually an omission in the response.
> >      Whether or not a particular server thinks we should follow it
> >      does not affect hydra:Resource-ness in any way,
> >      but only our immediate knowledge of it (in the positive case).
> >      If we want a stronger contract (“expected”), we should create one.
> >      The hydra:Resource notion does not provide meaningful info.
> > 
> > 
> >      I do understand the purpose of hydra:Resource and hydra:Class,
> >      but I strongly feel we have chosen the wrong way of addressing
> >      that purpose.
> >      What happens now is that we are defending hydra:Resource and
> >      hydra:Class based on (sometimes incorrectly implied) features
> >      we'd miss if they were removed.
> >      It should be the other way round: what are the features we need,
> >      and are hydra:Resource and hydra:Class really the best way to
> >      bring them?
> > 
> >      Best,
> > 
> >      Ruben
> > 
Received on Tuesday, 13 January 2015 13:53:52 UTC