Re: 2NN Contents Of Related (303 Shortcut) from Roy T. Fielding on 2014-09-05 (ietf-http-wg@w3.org from July to September 2014)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Thu, 4 Sep 2014 18:14:04 -0700
To: Sandro Hawke <sandro@w3.org>
Cc: Martin Thomson <martin.thomson@gmail.com>, Eric Prud'hommeaux <eric@w3.org>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>, "Julian F. Reschke" <julian.reschke@gmx.de>
Message-Id: <D658C555-A220-42ED-B593-BE7C2D4C2C2C@gbiv.com>

On Sep 4, 2014, at 3:02 PM, Sandro Hawke wrote:

> On 09/04/2014 01:50 PM, Martin Thomson wrote:
>> On 2 September 2014 08:00, Eric Prud'hommeaux <eric@w3.org> wrote:
>>> We could ask questions like "Is /Index?page=1 a representation of
>>> /Index ?" and "What is the subject of the metadata in a 200+CL, the
>>> effective request URI or the CL?" The end result of these is that we
>>> evaluate the use cases for 303.
>> I think that saying we end up re-evaluating the need for 303 is
>> drawing a pretty long bow.
>> 
>> Why don't we talk instead about semantics.  What semantic distinction
>> are you looking to make?  There's a functional pattern you are looking
>> to enable (request this, get that instead, don't pay extra round
>> trips), but that pattern is supported by 200+CL.
>> 
> 
> It's a good question, and I'm not sure we have a great answer. Mostly, we want it to be possible for there to be semantic distinctions. We're building infrastructure, not applications, so the distinctions are likely to be made at other layers.

Given resources A and B,

  GET A -> 200 OK, CL: B
    implies that the payload is both a representation of A and of B
    (assuming the origin server is authoritative for both).

  GET A -> 303 See Other, Location: B
    implies that we don't have a representation of A, but B is interesting
    too so you might want to go over and get that if you haven't already.

  GET A -> 2NN Related, CL: B
    implies that we don't have a representation of A, but we know what you
    really wanted (better than you) so here is a representation of B.
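
The three cases above can be sketched as a small lookup. This is only an illustration, assuming the origin server is authoritative for both resources; the function name `payload_represents` is made up here, and "2NN" is kept as a string because no numeric code has been assigned.

```python
def payload_represents(requested, status, headers):
    """Return the set of resources the response payload is a representation of.

    Hypothetical helper: `status` is a string so that the unassigned
    "2NN" code can be modelled alongside "200" and "303".
    """
    content_location = headers.get("Content-Location")
    if status == "200":
        # 200 + CL: the payload represents the requested resource and the CL target.
        return {requested, content_location} if content_location else {requested}
    if status == "303":
        # 303: no representation of anything; Location merely points at B.
        return set()
    if status == "2NN":
        # 2NN: not a representation of the requested resource, only of the CL target.
        return {content_location} if content_location else set()
    raise ValueError("status not covered by this sketch: " + status)


print(payload_represents("A", "200", {"Content-Location": "B"}))
print(payload_represents("A", "303", {"Location": "B"}))
print(payload_represents("A", "2NN", {"Content-Location": "B"}))
```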

Now, here's the problem:

"It's a round trip short cut!"  No, because it won't be cached.
303 round trips to the same server are almost entirely free in HTTP/1.1
because of persistent connections, so what you are really short-cutting
here is the chance for the user's cache to discover it already has a
representation of that other resource they didn't actually request and
might not even be interested in retrieving.
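
A toy model of that trade-off, counting only how many representation bodies cross the wire for a single GET A. It assumes a cache keyed by resource name and ignores freshness and validation entirely; `bodies_transferred` is a hypothetical name for this sketch.

```python
def bodies_transferred(cache, flow):
    """Count representation bodies sent to the client for one GET A.

    Toy model: `cache` maps resource names to stored representations,
    and any stored entry is treated as fresh.
    """
    if flow == "303":
        # The 303 carries only Location: B. The client then asks its cache
        # for B and goes back to the server only on a miss.
        return 0 if "B" in cache else 1
    if flow == "2NN":
        # The representation of B is sent with the response to GET A,
        # whether or not the client already holds it.
        return 1
    raise ValueError("unknown flow: " + flow)


warm = {"B": "cached representation of B"}
cold = {}
for flow in ("303", "2NN"):
    print(flow, bodies_transferred(warm, flow), bodies_transferred(cold, flow))
```

With a warm cache the 303 flow moves no body at all, while the 2NN flow always moves one.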

"Oh, but we know the user always wants that other resource because this
is a semantic web system!"  Then do yourself a favor and use templated
assertions to define links between those semanticky resources and the
descriptions the user really wanted in the first place, and just skip the
first request entirely.

   urn:world:{stuff} -> describedBy ("http://encyclopedia/query?{stuff}")
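
A minimal sketch of how a client might apply such a templated assertion locally, so that it never issues the first request. The prefix and template are copied from the line above; the function name `described_by` and the percent-encoding step are assumptions of this sketch, and it is not an implementation of any URI template specification.

```python
from urllib.parse import quote

# Both strings come from the templated assertion above.
URN_PREFIX = "urn:world:"
DESCRIBED_BY_TEMPLATE = "http://encyclopedia/query?{stuff}"


def described_by(urn):
    """Map a urn:world: identifier to the URL of its description.

    Returns None when the identifier does not match the template.
    """
    if not urn.startswith(URN_PREFIX):
        return None
    stuff = urn[len(URN_PREFIX):]
    # Percent-encode the variable so it is safe inside a query string
    # (an assumption; the assertion above does not say how to encode).
    return DESCRIBED_BY_TEMPLATE.format(stuff=quote(stuff, safe=""))


print(described_by("urn:world:Mount Everest"))
print(described_by("http://example.org/other"))
```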

"Oh, but this isn't *just* a semantic web system -- we expect this to be
implemented by the entire Web!"  Well, then you don't know that the user
really wants to get a 2NN instead of a 303, and thus you are favoring
your own club of implementations over the general performance value
(and benefit to everyone on the Internet) of reusing cached representations.
Maybe that's when Prefer should be used instead.
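
One way to read the Prefer suggestion, as a rough sketch: the server sends the inlined representation only when the client has asked for it, and otherwise falls back to 303 so ordinary clients keep the benefit of their caches. The preference token `inline-related` is invented for this example and is not a registered preference; the parser handles only the simple `token[=value][;params]` shape and drops parameters.

```python
# Hypothetical preference token, invented for this sketch.
HYPOTHETICAL_PREFERENCE = "inline-related"


def parse_prefer(header_value):
    """Parse a Prefer header value into {token: value}, ignoring parameters."""
    prefs = {}
    for item in header_value.split(","):
        head = item.split(";", 1)[0].strip()
        if not head:
            continue
        name, _, value = head.partition("=")
        prefs[name.strip().lower()] = value.strip().strip('"')
    return prefs


def choose_response(prefer_header):
    """Pick the status for GET A: inline B only when the client opted in."""
    prefs = parse_prefer(prefer_header or "")
    return "2NN" if HYPOTHETICAL_PREFERENCE in prefs else "303"


print(choose_response(None))
print(choose_response("respond-async, inline-related"))
```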

> Some possible distinctions that come to mind:
> 
> - search engines / indexing services -- these systems index which URLs provide content containing particular data items. These systems are indexed by the URL; should they index the request URL or the CL?

I doubt that search engines index content of arbitrary status codes.
They do index the destinations of redirects, and they typically associate
aliased content with the most-linked URL.  2NN doesn't help at all.

> - endorsement -- what we now see in social systems as Like/+1/star -- where the user sees something and gives it some kind of mark of approval.  Is that mark on just the first page of items, or the whole set?  Which URL should be considered endorsed?  It's possible the user will be frustrated, or worse, if they meant to mark one and the other was considered marked.

2NN is not going to help you there.  You can't tell if they like a picture,
the person in the picture, the dog the person is holding in the picture,
or the sweater on the dog the person is holding in the picture.  The user
doesn't care either way -- they just want the owner to like them back.
The only ones who really care are the advertisers trying to figure out
which of those things the user might actually be interested in buying.

> - link rel=alternate -- is that an alternate for this page or the whole thing? Some alternates might be paged differently, so maybe it doesn't make sense for the page.

Link relations are supposed to define what the relation applies to.
The response isn't going to make any difference.

> - link rel=copyright -- if the different items have different copyright, there might be multiple of these links to cover them all, and it will be different depending whether this is talking about the paged resource or just a page

Copyright links point to a description of the copyright for an expression
(the selected representation), but that description might encompass an
entire site of resources (e.g., it might be a database of copyright info).

> - link rel=next/prev -- at first glance these obviously are about the CL not the requested resources, but what if the requested resource were itself in some kind of sequence? Or what if the redirect were for some reason other than paging?

They are defined as navigation links.  What does paging have to do with
any of this?  Whether "/Index?page=1" and "/Index" are the same or
(more likely) different resources is not going to be discoverable by
one look at their representations.  The response will be 200, regardless.

I have no doubt that we'll end up with a 2NN code anyway, because it is
easier to mint new codes than to explain why servers shouldn't use them.

....Roy
Received on Friday, 5 September 2014 01:14:28 UTC