RE: Feedback for draft-nottingham-http-link-header-03 from Williams, Stuart (HP Labs, Bristol) on 2008-12-03 (ietf-http-wg@w3.org from October to December 2008)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Wed, 3 Dec 2008 18:03:24 +0000
To: Eran Hammer-Lahav <eran@hueniverse.com>
CC: "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Message-ID: <233101CD2D78D64E8C6691E90030E5C8245CDEAECF@GVW1120EXC.americas.hpqcorp.net>
Hello Eran,

Responses in line below.

> -----Original Message-----
> From: Eran Hammer-Lahav [mailto:eran@hueniverse.com]
> Sent: 03 December 2008 16:47
> To: Williams, Stuart (HP Labs, Bristol)
> Subject: RE: Feedback for draft-nottingham-http-link-header-03
>
> > From: Williams, Stuart (HP Labs, Bristol) [mailto:skw@hp.com]
> > Sent: Wednesday, December 03, 2008 7:20 AM
> >
> > Hello Mark,
> >
> > [just dipping into the archive and found this between you and Eran]
> >
> > > >>   The context of links conveyed in the Link header field is the
> > > >>   representation that the header is part of.
> > > >
> > > > This makes sense since the header is provided in the context of the
> > > > representation. However, is there a way to indicate that a link is
> > > > persistent across representations and is not representation-
> > > > specific? Are 404 and 303 considered representations?
> > >
> > > *sigh* this is the tricky bit; HTTPbis has at least one open issue on
> > > this. I believe the current position is that all messages have
> > > entities, and all entities are representations, the trick being that
> > > the representation isn't always of the resource which the request was
> > > sent to; sometimes it's an "anonymous" representation.
> >
> > Oh dear... I think we need to be clear about whether the relations
> > being conveyed in Link: headers are intended to hold between resources
> > or between the conveyed representation and whatever the target URI
> > refers to.
>
> The current draft is pretty clear that the link is between
> the conveyed representation of the resource, but what is not
> clear is to what.

[As I read toward the end I see that we are largely in agreement about how we think things should be - however I started on this response thinking that we were more in disagreement - in particular that you thought that at least one end of a link is grounded in a representation - it is evident later that you don't think that is how things should be.]

I have to say that I do not get such clarity from the draft. From the get-go it seems to be about relations between pairs of resources and not between some ephemeral representation of a resource and some (usually) other resource.

eg:
"1.  Introduction

   A means of indicating the relationships between documents on the Web,
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   as well as indicating the type of those relationships, has been
   available for some time in HTML [W3C.REC-html401-19991224], and more
   recently in Atom [RFC4287].  These mechanisms, although conceptually
   similar, are separate.  However, links between resources need not be
                                    ^^^^^^^^^^^^^^^^^^^^^^^
   format-specific; it can be useful to have typed links that are
   independent of the format, especially when a resource has
   representations in multiple formats."

There may be resources that repeatedly emit 'identical' representations or select from the same set of equivalent representations (conneg), but I truly hope that what we are speaking about is relations or links between resources.

The draft heads off into vagueness when it starts to speak of "context of use". I think that the simple fix would be to say that the requested resource *is* the context of use, or indeed to drop the notion of "context of use" and just speak plainly about the relation being between the requested resource and the link target - much more direct and less fluffy.

> Let's assume 'type' is not present in a
> Link header. This means that on one side of the relationship
> we have a representation while on the other side we have a
> resource (as no representation is specified). This gets more
> complex if we bring 'type' back in.

The HTTP request contains a resource reference; the link target contains a resource reference. The rel or rev relation stated in the Link: header is claimed to exist between those two resources (modulo handling of "Content-Location:"). What could be simpler?

You'll see my concerns about 'type' expressed in my question on "metadata-discovery@googlegroups". It's clear to me that you, as I, think that its role is advisory - otherwise we have a much more complicated situation where the relation becomes a cross-product of two identifiers - the rel or rev and the type. Honestly, I think that 'type' should be deprecated from the Link: header spec - though I'm sure that others would disagree.

>
> My current understanding of 'type' is nothing more than a
> hint. Something to help user-agents better choose the links
> that serve their needs. Because one resource has no authority
> to define the content type of another, this 'hint'
> interpretation is appropriate. But when considering this
> discussion, 'type' might very well turn out to define the
> representation of the linked resource the relationship is between.

Well, on the whole, web architecture does not give you the granularity to speak about specific resource representations (either by time or by format). The trick it pulls is to allow generic resources to conneg to more specific resources (e.g. http://www.w3.org/Icons/w3c_home connegs to .gif or .png or ... and reports that it has done so in a Content-Location: header). In this way we can avoid thinking of representations as having URIs (they don't - sure, you could assign one so that you could talk about it - but you'd have to promote it to being a resource if you wanted to make it accessible) - you assign different URIs to the different conneg'd variant *resources*.


> For example, the response to GET /resource/1:
>
> Content-type: text/html
> Link: <http://example.com/resource/2>; rel="next"; type="text/html"
>
> Can mean, this is a link between:
>
> 1. /resource/1 and /resource/2
yes

> 2. the text/html representation of /resource/1 and (any representation of) /resource/2
I hope not.

> 3. the text/html representation of /resource/1 and the text/html representation of /resource/2
Likewise, I hope not

> If /resource/2 does not support the text/html representation,
> then #3 means a broken link, just as if /resource/2 would
> return a 404 with no additional useful information.

It is IMO both correct and much simpler to regard the relation as holding between resources and push representations down as just an artifact that exposes a view on the state of a resource.
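
As a rough sketch of the resource-to-resource reading (interpretation 1 above): the helper name link_relations and the simplified parsing are my own assumptions for illustration, not anything the draft defines. It handles a single link-value, treats rel as "context -> target" and rev as "target -> context", and sets 'type' aside as a hint that plays no part in the relation itself.

```python
import re
from urllib.parse import urljoin


def link_relations(request_uri, link_value):
    """Read one Link header value as relations between resources.

    Hypothetical helper: the parser is deliberately simplified and only
    handles a single link-value of the form <uri>; name="value"; ...
    Returns (triples, hints), where each triple is
    (subject URI, relation name, object URI).
    """
    m = re.match(r'\s*<([^>]*)>(.*)', link_value)
    if m is None:
        raise ValueError("not a link-value: %r" % link_value)

    # The link target is a resource reference, resolved against the request URI.
    target = urljoin(request_uri, m.group(1))
    params = dict(re.findall(r';\s*([\w-]+)\s*=\s*"?([^";,]*)"?', m.group(2)))

    triples = []
    # rel: the requested resource is the subject of the relation.
    for rel in params.get("rel", "").split():
        triples.append((request_uri, rel, target))
    # rev: the requested resource is the object of the relation.
    for rev in params.get("rev", "").split():
        triples.append((target, rev, request_uri))

    # Everything else (e.g. 'type') is advisory and does not alter the triples.
    hints = {k: v for k, v in params.items() if k not in ("rel", "rev")}
    return triples, hints


triples, hints = link_relations(
    "http://example.com/resource/1",
    '<http://example.com/resource/2>; rel="next"; type="text/html"',
)
```

With the example from Eran's message, triples holds one relation from /resource/1 to /resource/2 and hints carries only the 'type' value; no representation appears at either end.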

> > Personally, my hope is that the relations are regarded as holding
> > between the resources that are referenced by URIs (as opposed to one
> > party to the relation being a more ephemeral representation or message
> > being conveyed).
>
> I agree both on simplicity grounds but also because once
> links are attached to the context of specific
> representations, it opens up the can of worms above, where
> the link may or may not be symmetric (one side representation
> and the other resource). And as demonstrated above, it puts
> the meaning of 'type' in question.

Ok... I think we're probably in loud agreement.

> > Of course the question is what resource supplies the context,
> > particularly in the case of a response that carries a "Content-
> > Location:" header that carries a different URI from that in the
> > request.
> >
> > It seems to me that candidate choices for the context resource are:
> >
> > a) the resource referenced in the request line of the corresponding
> > HTTP request.
> > b) the resource referenced in a Content-Location: header returned in
> > the same response.
> >
> > are there any other candidates?
> >
> > Does one take a) as the default and allow b) if present to override a),
> > or simply stick with the original request URI.
> >
> > Simplicity suggests just a) and that if you want to find out about
> > links associated with the resource referenced by a "Content-Location:"
> > then you go ask there.
>
> I think a) is the right choice and not just for simplicity.
> The current response has no authority over the link
> relationships of the other resource identified by the
> Content-location header.

Ah... one could view the conneg action as having made a hidden redirection to the specific resource, and that the given response originates from the more specific resource as opposed to the requested resource. I have not checked through the details - so it might be possible to find cases that allow us to conclude that that would be a flawed thing to think - e.g. a header listing the available representation formats that is consistent with the requested resource, but would clearly be a superset of what the more specific resource could provide.

So I think b) could be justified - but I would go with just a), because how else would you be able to ensure that the requested resource is the subject (rel) or object (rev) of a link assertion - particularly if your aim is to find metadata about that subject resource? The b) view would repeatedly mask the requested resource as the subject/object of link relations, and you'd only be able to find out about its close friends.
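
To make the two candidates concrete, here is a minimal sketch of how a client might pick the context resource. The function name link_context and the allow_content_location flag are hypothetical, chosen only for illustration; the default corresponds to a), and the flag switches on the b) override.

```python
from urllib.parse import urljoin


def link_context(request_uri, response_headers, allow_content_location=False):
    """Choose the resource that Link relations in a response are about.

    Hypothetical helper. By default this is option a): always the request URI.
    With allow_content_location=True it is option b): a Content-Location
    header, when present, overrides the request URI.
    """
    if allow_content_location:
        for name, value in response_headers.items():
            # Header field names are case-insensitive.
            if name.lower() == "content-location":
                # Content-Location may be relative to the request URI.
                return urljoin(request_uri, value)
    return request_uri


request_uri = "http://www.w3.org/Icons/w3c_home"
headers = {"Content-Location": "w3c_home.png"}

context_a = link_context(request_uri, headers)
context_b = link_context(request_uri, headers, allow_content_location=True)
```

Under a) the context stays the generic resource that was asked about; under b) every conneg'd response would shift it to the variant, which is the masking effect described above.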

>
> EHL
>

Stuart
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Wednesday, 3 December 2008 18:07:51 UTC