RE: httpRange-14: Consequences of redirection from Williams, Stuart (HP Labs, Bristol) on 2007-11-30 (www-tag@w3.org from November 2007)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Fri, 30 Nov 2007 15:55:33 +0000
To: Tore Eriksson <tore.eriksson@gmail.com>
Cc: "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <9674EA156DA93A4F855379AABDA4A5C604FC842331@G5W0277.americas.hpqcorp.net>
Hello Tore,

Firstly, I think there is probably quite a bit of common ground here (at least between us). There are also quite a few things on which we're likely to continue to disagree - but there are at least some where agreeing is unnecessary (eg. whether or not client libraries are at fault - I will make one last pass on that topic and have little more to add beyond that).

> -----Original Message-----
> From: Tore Eriksson [mailto:tore.eriksson@gmail.com]
> Sent: 30 November 2007 14:19
> To: Williams, Stuart (HP Labs, Bristol)
> Cc: www-tag@w3.org
> Subject: Re: httpRange-14: Consequences of redirection
>
> Hi again Stuart,
>
> [While writing this it seems like you have responded to my
> response... I am probably repeating myself here and ignoring
> some comments you have already made but I hope you don't mind
> me sending it anyway.]

<snip/>

> Stuart Williams said:
> > The Content-Location header is used in content negotiation
> > to indicate the URI of a specific variant of the resource.
> > For example, the W3C logo is identified/denoted by the URI
> > http://www.w3.org/Icons/w3c_home. An attempt to access
> > (recorded in the wget debug log below) returns a .png
> > representation of the logo and a Content-Location: which
> > indicates where that particular variant of the logo may be
> > obtained from in the future. A subsequent attempt to retrieve
> > a jpeg variant reveals that only .png and .gif variants are available.
> >
> > Anyway, AIUI, Content-Location: as used in content
> > negotiation provides a way to identify a more specific
> > variant of a generic resource; a way of identifying a
> > resource that provides access to some specific subset of the
> > representations available from the generic reference.
> >
> >         http://www.w3.org/Icons/w3c_home        denotes the W3C Icon (a particular graphic/image)
> >         http://www.w3.org/Icons/w3c_home.png    denotes a specific variant of the W3C Icon which provides only
> >                                                 image/png representations
> >
> > Both URIs denote resources which stand in a variantOf(w3c_home.png, w3c_home) relation.
> >
>
> If you don't mind I would like to restate this using another image:
>
> <http://www.w3.org/Icons/SW/sw-horz-w3c>
>    denotes the Semantic Web Icon (a particular resource that has rdf:type foaf:Image?)
> <http://www.w3.org/Icons/SW/sw-horz-w3c.png>
>    denotes a resource that is a specific variant of the Semantic Web Icon which
>    provides only image/png representations

OK...

> Can you really say that these resources stand in
> variantOf(sw-horz-w3c.png, sw-horz-w3c) relation?

Yes... I think so. I admit that I do take some license wrt the language of 2616, repeating the quote you used earlier:

        14.14 Content-Location

           The Content-Location entity-header field MAY be used to supply the
           resource location for the entity enclosed in the message when that
           entity is accessible from a location separate from the requested
           resource's URI.

The awkward phrase I find there is "...resource location for the entity enclosed in the message..." which reads as binding the Content-Location URI to an entity (a representation-like thing) as opposed to binding it to a variant resource which (in some sense) would provide the same/an identical 'entity' (or representation).

I say "in some sense" because one reading would make the "enclosed entity" invariant while another would (roughly) make just the content-type invariant.

I would also agree that this is tricky, because *if* I take <../sw-horz-w3c> as denoting an Image (as you and I both do), then <../sw-horz-w3c.png> either denotes the same image or an identical image. The difference between the two resources is the range of representations they make available at a given instant (the second being a subset of the first).
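
As a rough illustration of the variantOf reading above, here is a minimal Python sketch of a generic resource whose negotiated responses name the chosen variant in Content-Location. It does not talk to any real server: the VARIANTS table, the function names, the .gif URI (assumed by analogy with the .png one) and the 406 status for an unsatisfiable request are all assumptions made for the example.

```python
# Hypothetical table: generic resource URI -> {media type: variant URI}.
# The .gif URI is an assumption, formed by analogy with the .png one.
VARIANTS = {
    "http://www.w3.org/Icons/w3c_home": {
        "image/png": "http://www.w3.org/Icons/w3c_home.png",
        "image/gif": "http://www.w3.org/Icons/w3c_home.gif",
    },
}


def negotiate(request_uri, accept):
    """Pick the first acceptable variant and name it in Content-Location."""
    variants = VARIANTS.get(request_uri)
    if variants is None:
        return 404, {}
    for media_type in accept:
        if media_type in variants:
            return 200, {
                "Content-Type": media_type,
                "Content-Location": variants[media_type],
            }
    # No acceptable variant; 406 Not Acceptable is assumed for this sketch.
    return 406, {}


def variant_of(request_uri, headers):
    """Read a response as the claim variantOf(Content-Location, request URI)."""
    return ("variantOf", headers["Content-Location"], request_uri)
```

Under this reading, the variant URI stands for a resource that offers a subset (here, a single media type) of what the generic URI offers.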

> When
> reading <http://www.w3.org/2007/10/sw-logos.html>, it seems
> (to me) as if <http://www.w3.org/Icons/SW/sw-horz-w3c> is a
> non-information resource

Oh, that was a surprise... I think that every TAG participant would categorize that image as an information resource. I don't know of one that would not... but of course they may chime in and prove me wrong.

Anyway... from my pov this sets up a false premise for what follows.

> that except for the actual depiction
> also has other characteristics,
> like:
>
> <http://www.w3.org/Icons/SW/sw-horz-w3c>
>     cc:license <http://www.w3.org/Consortium/Legal/2002/copyright-documents-20021231>;
>     cc:license <http://www.w3.org/2007/10/sw-logos.html#LogosWithW3C>;
>     rdf:label "W3C Semantic Web Logo"@en.
>
> Since these are not a part of the png representation (they
> are in the metadata of the svg one though), their relation is
> not variantOf but rather that one "contains a
> description[n]/depiction of the other" - your description of
> the relationship found in a 303 redirect.

I guess I would cast it differently in that different representations can have different levels of fidelity. In this example, the svg representation happens to carry metadata which is about the resource itself, whereas the png representation does not. SVG, GIF, PNG... representations are all representations (of varying fidelity) of the icon.

> Further,
> <http://www.w3.org/Icons/SW/sw-horz-w3c> does not "defy"
> representation, the thing is just that the representation
> might be incomplete, due to limitations on what can be
> described by the mime type.

There's no disagreement here (at least between you and me) - I'm not arguing that the image is not an information resource.

I'd be happy to settle on the union of things that "do not happen to have" or "could not possibly have" a representation - and duck on the ontological classification of the referent.
Anyway, this icon is not such a thing.

> According to some, this then
> makes <http://www.w3.org/Icons/SW/sw-horz-w3c> a
> non-"Information resource", but I do not see anything to gain
> from this.

Again I think you are arguing against a false premise...

> But maybe the WWW should do a 303 redirect here,
> what do you think?

No... the icon has representations, so make them available and serve them directly.

> Reformulating myself - the question is how
> generic can a generic resource be until it becomes an
> information resource?

Well, we started from an information resource in the first place.

More interesting might be natural language variants - where clearly what is presented is visually and textually different - yet in some sense all are variants of the same abstract work. Consider the French translation of the "American Declaration of Independence" served as application/pdf. There is perhaps more utility here in the variant resource having a distinguished URI.

> Tim Berners-Lee talks about generic "electronic resources" in [1], but this restriction seems as
> a concept equivalent to and as arbitrary as the concept of an "information resource".
>
> As for my example:
> > > For a given representation received through the HTTP protocol, it
> > > could possibly contain a number of resources (used in a broad sense)
> > > that one would like to make statements about:
> > >
> > > A - The topic of the representation
> >
> > Can we try to stay within the bounds of resource (or thing) which the
> > representation is a representation of (eg. the daily Oaxaca weather
> > report); representations (an HTML or PostScript... rendering of the
> > daily Oaxaca weather report); and one or more subjects (if any) that
> > the resource may be *about* (eg. today's weather in Oaxaca, Oaxaca,
> > rain, wind, snow...in the Oaxaca area);
> >
> > > B - The textual or pictorial content (a.k.a. "document" or
> > > "information resource" or "conceptual work")
> > > C - The bitstream itself (a.k.a. "document instance")
> >
> > So...
> >         A would be "today's weather in Oaxaca" ie a/the subject of a daily weather report;
> >
> >         B would be "today's daily weather *report* for Oaxaca" (from one of possibly many sources).
> >         B might be conceived of as a particular variant of a generic weather report (RDF, HTML,
> >           PDF, JPEG, GIF...) or as a generic resource whose information content is a weather report.
> >
> >         C would be either a particular occurrence of a message transferring B (a 'token') as bitstream
> >         or the 'type' of all messages that carry that particular bit stream.
> >
>
> We agree here:
> A - "today's weather in Oaxaca"
> B - "today's daily weather *report* for Oaxaca" - as a generic resource
>
> I would like to change your C into
> C - a particular occurrence of a message transferring a representation of both A and B as a bit stream

Ok'ish... but I'd argue that A defies representation... so not both, just B.

Pat Hayes has repeatedly reminded us that at least two senses of the word "representation" are being used.
How C 'represents' A is very different from how C 'represents' B.

> Further:
> > > (1) Can one serve a representation of A without giving the
> > > representation a corresponding information resource B?
> > > (2) How do you find <B> when you have <A>?
> >
> > So continuing with the Oaxaca example here. I'd argue that
> > "today's weather in Oaxaca" defies representation; however it
> > can be described (in the form of a weather report - forecast
> > or post-hoc). So, A is conceptual and without representation.
> > B is a(n) (information) resource which describes "today's
> > weather in Oaxaca" (possibly amongst other things). So a
> > redirect (whether protocol induced (303) or local client side
> > (#'d URI)) from <A> to <B> is appropriate.
> >
> > wrt 1) *IF* A has representations... then serve them from <A> with a 200 OK response!
> >     2) Use #'d URIs or protocol redirection to map from <A> to <B>
> >
> > > The answer of httpRange-14 to (2) is to do a 303 redirect from <A>
> > > to <B>.
> >
> > Or use #'d URI.
> >
>
> Sure. But I wanted to talk about problems with HTTP
> redirects, so I hope you excuse me for ignoring this alternative.
>
> > > By requiring a redirect, it also disallows responding directly with
> > > a 200 on <A> thus making the creation of <B> compulsory and
> > > consequently answering (1) with a NO.
> >
> > Well... either A in fact has no representations OR by its
> > very nature defies representation so... 200 would be entirely
> > wrong! *IF* A in fact has representations (ie. they are
> > indeed representations of A (ie. "today's weather in Oaxaca")
> > rather than representations of something else (eg. "a daily
> > Oaxaca weather report for today")) then respond with 200
> > and a representation.
> >
>
> What people are arguing about is this part that says
> "def[ying] representation". All representations are limited
> by mime type, but that does not mean that serving the
> representation is pointless.

It is a question of what it is that you assert the representation is a representation of.

I think that there is no argument that you want to get to some useful information that either represents or describes/depicts the resource.
But providing a "representation of a thing" and a "representation of a description of a thing" (of which there may be many distinct descriptions) are different things.

> In this case, I would say that
> "today's weather in Oaxaca" has a representation, and that
> the representation overlaps with the one for "a daily Oaxaca
> weather report for today".

I cannot concur that "today's weather in Oaxaca" has a webarch:Representation.

> Ignoring this relationship is possible, but if I send as a
> reply to a GET on <A> an RDF representation of today's weather
> in Oaxaca, I can't add the property cc:license, since this is
> about the _information resource_ B. To solve this I can
> redirect to <B> _or_ (in my opinion) add <B> as a
> Content-Location. Maybe slightly irrelevant, but adding <B>
> as a Content-Location sets the BaseURI accordingly, and I can
> add all RDF in one go (in turtle):
>
> <A> w:maxTemperature "28"^^w:centigrade.
> <A> w:minTemperature "12"^^w:centigrade.
> <> cc:license <>.
>
> (Since BaseURI is mapped to <B>.)

Well... I'd argue that that is *not* a representation of the weather in Oaxaca, rather it is a representation of a set of measurements made by a weather station (or a collection of weather stations) in the relevant geography; that the information is attributable and copyrightable etc.
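
Setting aside what the triples describe, the base URI mechanics Tore relies on can be sketched in a few lines of Python. This is only an illustration under assumptions: the example.org URIs stand in for <A> and <B>, the resolve helper is hypothetical, and it follows the RFC 2616 (14.14) reading in which Content-Location, when present, supplies the base URI for the entity.

```python
from urllib.parse import urljoin

# Placeholder URIs standing in for <A> and <B>; not real resources.
A = "http://example.org/A"
B = "http://example.org/B"


def resolve(reference, request_uri, content_location=None):
    """Resolve a relative reference found in the entity body.

    Assumes the RFC 2616 reading: Content-Location (itself resolved
    against the request URI) becomes the base URI when it is present;
    otherwise the request URI is the base.
    """
    base = request_uri
    if content_location:
        base = urljoin(request_uri, content_location)
    return urljoin(base, reference)


# The empty reference <> in Turtle resolves to the base URI itself.
with_header = resolve("", A, content_location=B)
without_header = resolve("", A)
```

With the header, <> names <B>; without it, the same triple would be read as a statement about <A>.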

> > > By adding the header Content-Location: <B> to the response to a 303
> > > redirect from <A>, we will be able to find the information resource
> > > even when faced with automatic redirects in user agents.
> >
> > This is where I begin to see confusion between content negotiation and redirection.
> >
> > Why would Content-Location: (which would be bogus because
> > it refers to the content of the specific response - which has
> > none) be better than the Location: header strongly suggested
> > for use with 3xx responses?
> >
>
> My intention was of course that the Content-Location header
> is sent along with the final 200 response. This is of course
> redundant information if you know that it was redirected, but
> it is still correct. My point was that since you might not
> see the 303 response and its Location header you need to
> propagate this information to the final response.

Well it was there on the wire. It remains my opinion that a mistake is made in either specifying, building or selecting a platform that provides no means to make those details available to its users.
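
To make concrete what "making those details available" could look like, here is a minimal Python sketch of a client that follows redirects but hands every hop back to its caller. It is a simulation, not a real HTTP client: the SERVER table, the example.org URIs and the fetch function are hypothetical names invented for the illustration.

```python
# Simulated origin server: URI -> (status, headers). Purely illustrative.
SERVER = {
    "http://example.org/A": (303, {"Location": "http://example.org/B"}),
    "http://example.org/B": (200, {"Content-Type": "text/turtle"}),
}

REDIRECT_STATUSES = (301, 302, 303, 307)


def fetch(uri, server=SERVER, max_hops=5):
    """Follow redirects, returning the full list of hops.

    Each hop is (request_uri, status, headers), so the caller can see
    that a 303 happened and where the final bits actually came from.
    """
    trace = []
    for _ in range(max_hops):
        status, headers = server[uri]
        trace.append((uri, status, headers))
        if status in REDIRECT_STATUSES and "Location" in headers:
            uri = headers["Location"]
            continue
        return trace
    raise RuntimeError("too many redirects")


trace = fetch("http://example.org/A")
```

A library that returned only the last element of such a trace would be the kind of platform described above: the 303 was on the wire, but its user never sees it.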

>
> > > This resolves
> > > (2) but what happens to (1)? Since the redirected response code is a
> > > 200, the de facto result is that a representation is served directly
> > > from <A> from the point of the user.
> >
> > Well the user (or User Agent) SHOULD be aware that the
> > redirection has occurred - that the bits they end up with
> > didn't come from the resource they originally referenced; and
> > SHOULD have an indication of where the bits they got came from.
> >
>
> Yes, but SHOULD on the user agent is not a MUST, is it?

Sorry, that was a natural language "should", not a 2119 SHOULD, in the sense that it was an expression of my opinion.

That said, the mistake is made and deployed and I suppose we will have to live with it and try to find a way to work around it.

> > > This means that we can scrap the redirection part from httpRange-14,
> > > and only worry about the Content-Location header.
> >
> > I think you are confused and that Content-Location: serves a different purpose.
> >
> > I suppose the target of the redirection could make a self
> > reference with Content-Location: which would be a way of not
> > having to remember the intermediate redirection target.
> >
>
> That was my intention. Sorry for my befuddled language.
>
> > > If the header is set its value is the URL denoting the content B. It
> > > doesn't matter whether you used redirection or served the
> > > representation straight from <A>.
> >
> > Content-Location: makes a claim about the relation between
> > the resource referenced by the URI received on the request
> > line by the server and the resource referenced by the URI in
> > the header, and it is saying that the latter is a variant of
> > the former. <B> variantOf <B>  doesn't seem hugely useful
> > (except as a pragmatic means to be able to forget what you
> > asked for, or to discover that you've been given the answer
> > to a different question than the one you asked).
> > Content-Location: says nothing of the relation between <A> and <B>.
> >
>
> As I said above, if the redirect is hidden the URI in the
> request is different to the one in the Content-Location, thus
> saying that <B> variantOf <A>.

Nope... (but we don't need to agree on this). The header and its meaning are vested in the message exchanged 'on the wire'. That your API hides some of that from you is a problem (at least for you) - but that hiding cannot change what the header and its value mean.
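
The difference between the two readings can be shown with a small Python sketch. It assumes a hop is recorded as (request_uri, status, headers) and uses placeholder example.org URIs for <A> and <B>; the function names are hypothetical. The first function reads Content-Location against the request that actually elicited the response, the second against the URI the user originally asked for.

```python
from urllib.parse import urljoin

# Placeholder URIs standing in for <A> and <B>.
A = "http://example.org/A"
B = "http://example.org/B"

# Hops as seen on the wire: a 303 from <A>, then a 200 from <B> whose
# Content-Location is a self reference.
TRACE = [
    (A, 303, {"Location": B}),
    (B, 200, {"Content-Location": B}),
]


def claim_on_the_wire(trace):
    """Pair Content-Location with the request URI of the same exchange."""
    request_uri, _status, headers = trace[-1]
    location = headers.get("Content-Location")
    if location is None:
        return None
    return (urljoin(request_uri, location), "variantOf", request_uri)


def claim_with_hidden_redirect(trace):
    """Pair Content-Location with the first request URI, as an API that
    hides the redirect would lead its user to do."""
    original_uri = trace[0][0]
    _request_uri, _status, headers = trace[-1]
    location = headers.get("Content-Location")
    if location is None:
        return None
    return (urljoin(original_uri, location), "variantOf", original_uri)
```

On this trace the first function yields <B> variantOf <B>, while the second yields <B> variantOf <A>, which is the reading being disputed here.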

> If not it becomes <B>
> variantOf <B>, which at least is not wrong. As for saying
> nothing about the relation between them; neither does 303
> redirection by itself as you said in your caveat ("it is not
> in general true that following a 303 will lead to a
> de[s]cription/depiction of the original referent - but is a
> mechanism that *can* be usefully employed to do so")

Ok, fair point.

> , but
> httpRange-14 adds a possible claim on the relation. I am just
> arguing for a similar reinterpretation of Content-Location.
>
> > Maybe being intentionally blind: I fail to see what using
> > Content-Location: instead of or as well as Location: buys you. It
> > certainly conveys no more information and it risks confusion
> > between redirection and content negotiation.
> >
> > I say intentionally, because I can see that a
> > self-referential Content-Location: accompanying the final 200
> > response is a way to 'sneak' the information that some http
> > client library failed to preserve from the redirection back
> > to User/UserAgent - but FWIW IMO it is the http client
> > library which is at fault - the redirection should be visible
> > to the library client.
> >
>
> The client library is not at fault. Should still doesn't make a must.
> You could argue to change the HTTP specification, but is this
> a step you are willing to take?

I was merely offering grudging acknowledgement of the pragmatic utility without yielding on my opinion that the client library is at fault.

> The whole debate of information resources has in my opinion
> been most thoroughly argued by Roy Fielding in [2]. In this
> he notes though that "Content-Location is not a sufficient
> fix for this problem simply because the resource provider has
> no desire to use it." However, using redirect for the same
> purpose is essentially the same dilemma - getting resource
> providers to adhere to a specific convention. I only want to
> argue that using Content-Location is a better solution
> technically, and I think persuading people to use either
> solution will be difficult. However, in the Linked Data
> movement there are a few large players that seem to follow
> what the TAG group recommends, and this is a big opportunity.
> Just because Content-Location didn't catch on before, doesn't
> mean it won't this time.

I'm going to pause for others to comment (or silence ;-)).

> Regards, Tore
>
> [1] <http://www.w3.org/DesignIssues/Generic>
> [2] <http://lists.w3.org/Archives/Public/www-tag/2002Aug/0000>

Regards,

Stuart
Received on Friday, 30 November 2007 15:59:14 UTC