Re: 200 OK with Content-Location might work from John Sheridan on 2010-11-07 (public-lod@w3.org from November 2010)

From: John Sheridan <john@johnlsheridan.com>
Date: Sun, 07 Nov 2010 17:38:47 +0000
To: Niklas Lindström <lindstream@gmail.com>
Cc: public-lod@w3.org
Message-ID: <1289151527.2680.16.camel@john-PC>

Hi Niklas,

On Sun, 2010-11-07 at 16:57 +0100, Niklas Lindström wrote:
> Hi John!
> 
> I understand your points. I also don't think that 303 is a poor
> solution in any fundamental way. In fact, given the use-case you
> described, having a stable URI which "delegates" to the current
> location is perfectly fine, and in many cases preferable to the
> alternative (demanding persistent, permanent hosting of all the data
> in a dynamic organizational environment).
> 

My first mail with this use case didn't make the list (I made a mistake
with the posting). For completeness, I repeat it below, so others can
follow how we are starting to use 303s across domains, in the hope of
improving the chances of persistent URIs for NIRs.

The use case came from the Linked Data work for UK Government, where we
have a URI for a NIR at one (notionally more stable) domain
which 303s to an IR at a different (less stable, organisationally
orientated) domain.

Often the NIR URI is something like
http://{something}.data.gov.uk/id/something whereas the IR is on an
organisation's own website.

For example, http://reference.data.gov.uk/id/open-government-licence
303s to a web page or RDF document published on The National Archives
website.

We do this because organisations in the public sector are unstable and
subject to continual change (creation, merger, abolition) whereas the
government as a whole is very stable.

Our thinking is that the {something}.data.gov.uk URI is more likely to
survive machinery of government changes, but the organisation
responsible for the NIR is going to want to publish the IR about that on
its own website, and should be encouraged to do so.

We are thinking about using this pattern to create URIs for local
authorities, where each publishes their own IR on their own website,
303ing from a consistent URI Set for all local authorities, say at
http://local.data.gov.uk/id/local-authority-id

The 303 helps enable this pattern, is easy to implement (e.g. on Apache,
we can use regular-expression-based rules), and fits well in general
with some of the challenges of Linked Data in the public sector.
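
To make the idea of regular-expression-based rules concrete, here is a
minimal sketch in Python rather than actual Apache configuration. The
names RULES and redirect_for are hypothetical, and the example.org
target for councils is invented for illustration. Only the Open
Government Licence pair of URIs is taken from this thread.

```python
import re

# Each rule pairs a pattern for a NIR URI on the stable domain with a
# template for the IR URL on the organisation's own site.
RULES = [
    # URIs from this thread.
    (re.compile(r"^http://reference\.data\.gov\.uk/id/open-government-licence$"),
     "http://www.nationalarchives.gov.uk/doc/open-government-licence/"),
    # Assumed shape for local authorities; the target host is made up.
    (re.compile(r"^http://local\.data\.gov\.uk/id/(?P<council>[a-z0-9-]+)$"),
     r"http://\g<council>.example.org/doc/council"),
]


def redirect_for(uri):
    """Return (status, headers) for a request to a NIR URI."""
    for pattern, target in RULES:
        match = pattern.match(uri)
        if match:
            # 303 See Other: the description of the thing lives elsewhere.
            return 303, {"Location": match.expand(target)}
    return 404, {}
```

If an organisation is merged or abolished, only the target in the rule
needs to change; the NIR URI stays the same.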

It could be a useful pattern more widely.

> In our own work with gov data I can see our current "central data
> repo" solution being turned into a PURL-like service for resources
> like agency descriptions which should in time be put at more
> appropriate, stable URL:s (combined with judicious owl:sameAs
> assertions).
> 
> I just think that the demand on NIR:s w/o hashes to be "directly
> unavailable" may be a hard hurdle to overcome when hosting data about
> them (and as said elsewhere, can be an unnecessary strain on servers).
> Mileage certainly varies a lot, and simplifying the demand of 303:ing
> from </concept> to </concept.{dataformat}> to a baseline where conneg
> giving a Content-Location is formally enough can be very beneficial.
> In fact, this makes the distinction of NIR vs. IR less technical (and
> left to the descriptions of the resource to clarify), and just leaves
> us with the importance of distinguishing between a resource and its
> representations.
> 
> I basically think that the HTTP mechanics of 301, 302, 303 and 307
> etc. are great tools for sustainable Linked Data deployment. But
> having them dictate the fact that a URI represents a NIR or IR is
> putting a lot of conceptual design directly into a day-to-day
> protocol.. 

I have a lot of sympathy with this point.

> Of course, Content-Location doesn't remove the distinction
> either, but it puts more emphasis on the "resource vs representation"
> question, which holds for both resource kinds (AFAIK).
> 
> .. I usually steer away from discussions of the "NIR vs. IR" topic at
> least publicly (though I love to discuss it in person), since it
> touches upon a very philosophical distinction which, taken to the
> extreme, can be an eternal discussion (of epistemology, phenomenology
> and whatnot). 

Off topic, try discussing whether a boundary for an administrative area,
defined by a line drawn *on* a map, is a NIR or an IR :)

> I hope that it is enough to say: if the resource does
> not intrinsically have the mimetype X ("represents itself", "is the
> record"...), use HTTP mechanics (30x, or 200 + Content-Location) to
> make it clear that the response body (having mimetype X) is not the
> resource itself, but a representation thereof...
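
As a rough illustration of how a client might read those mechanics,
here is a sketch in Python. It follows the draft HTTPbis text quoted
further down this thread, is not a normative reading of it, and says
nothing about whether the claim can be trusted. The function name
represented_uri is hypothetical, and headers are assumed to be a plain
dict keyed exactly as "Content-Location".

```python
from urllib.parse import urljoin


def represented_uri(request_uri, status, headers):
    """Return the URI whose representation the response body claims to be.

    Returns None when the body is not offered as a representation at
    all, e.g. for a 30x redirect that the client should follow instead.
    """
    if status != 200:
        return None
    content_location = headers.get("Content-Location")
    if content_location:
        # Content-Location may be relative; resolve it against the request URI.
        resolved = urljoin(request_uri, content_location)
        if resolved != request_uri:
            # The body represents some other resource, e.g. </concept.rdf>.
            return resolved
    # No Content-Location, or the same URI: the body represents the
    # requested resource itself.
    return request_uri
```

For a request to http://example.org/concept answered with 200 and
Content-Location: /concept.rdf, this returns
http://example.org/concept.rdf, so statements about "the document" can
be attached to that URI rather than to the concept.
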
> 
> Best regards,
> Niklas
> 
> [For those wanting more things to ponder: consider the role and nature
> of a packet, or the essence of a byte.. ;) ]
> 
> 
> 
> 2010/11/7 John Sheridan <johnlsheridan@yahoo.com>:
> > Niklas,
> >
> > In general I am supportive of your and Ian's thinking. 200 OK with
> > Content-Location might work.
> >
> > However, three points from my perspective:
> >
> > 1) debating fundamental issues like this is very destabilising for those
> > of us looking to expand the LOD community and introduce new people and
> > organisations to Linked Data. To outsiders, it makes LOD seem like it's
> > not ready for adoption and use - which is deadly. This is at best the
> > 11th hour for making such a change in approach (perhaps even 5 minutes
> > to midnight?).
> >
> > 2) the 303 pattern isn't *that* hard to understand for newbies and maybe
> > even helps them grasp LOD. Making the difference between NIRs and IRs so
> > apparent, I have found to be (counter-intuitively) a big selling point
> > for LOD, when introducing new people to the paradigm. Let's not be too
> > harsh on 303 - it does make an important distinction very clear for new
> > adopters and, in my experience, it seems to be an approach new people
> > grok quite quickly and easily.
> >
> > 3) I see much to commend in what Ian suggests, in practical terms. If
> > the community is going to move in that direction what we need is a clear
> > roadmap. An alternative pattern (say, 200 OK plus Content-Location)
> > needs to be (*very* quickly) alighted upon and then used in practice. We
> > would have to reconcile ourselves to the 303 pattern and the
> > alternative, operating side-by-side, for some period of time (years?).
> > Only once there is some breadth of usage, should the community seek to
> > deprecate the use of 303s. If this is a pattern the community wishes to
> > change, we have to gradually evolve our way to something different. We
> > can't just leap.
> >
> > Hope these thoughts help,
> >
> > John.
> >
> > On Sun, 2010-11-07 at 14:42 +0000, John Sheridan wrote:
> >> One use-case that we have with the Linked Data work for UK Government,
> >> is where we have a URI for a NIR at one (notionally more stable) domain
> >> which 303s to an IR at a different (less stable, organisationally
> >> orientated) domain.
> >>
> >> Often the NIR URI is something like
> >> http://{something}.data.gov.uk/id/something whereas the IR is on an
> >> organisation's own website.
> >>
> >> We do this because organisations in the public sector are unstable and
> >> subject to continual change (creation, merger, abolition) whereas the
> >> government as a whole is very stable.
> >>
> >> To give an example, the Open Government Licence (for the NIR of the
> >> licence) is http://reference.data.gov.uk/id/open-government-licence
> >> which 303s to
> >> http://www.nationalarchives.gov.uk/doc/open-government-licence/ (the IR
> >> of the current licence text, currently published by The National
> >> Archives, with HTML and RDF representations selected through conneg)
> >>
> >> We are looking at a similar pattern for local authorities. Each Council
> >> would have a NIR URI at (something like)
> >> local.data.gov.uk/id/{local-council-identifier} which would 303 to IR
> >> about that Council on the Council's own website.
> >>
> >> Our thinking is that the {something}.data.gov.uk URI is more likely to
> >> survive machinery of government changes, but the organisation
> >> responsible for (say) the Open Government Licence is always going to
> >> want to publish the IR about that on its own website, and should be
> >> encouraged to do so.
> >>
> >> The 303 helps enable this pattern, which fits well in general
> >> with some of the challenges on Linked Data in the public sector.
> >>
> >> I would like to understand a little better how Ian's proposal maps to
> >> this use case.
> >>
> >> Grateful for comments,
> >>
> >> John.
> >>
> >> On Sun, 2010-11-07 at 12:11 +0100, Niklas Lindström wrote:
> >> > +1 indeed. Content-Location has definitely been overlooked. During
> >> > conneg, it is used to differentiate between a resource and its
> >> > representation(s), which are obviously different resources (well, not
> >> > necessarily the same). This distinction could certainly be enough to
> >> > remove the fundamental need for 303:ing on NIR:s (provided consensus
> >> > and some formal resolution).
> >> >
> >> > (I pondered on a similar issue in
> >> > <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2010Feb/0007.html>,
> >> > regarding the identity of fragments. Perhaps that discussion would be
> >> > worth revisiting again in light of this?)
> >> >
> >> > Best regards,
> >> > Niklas
> >> >
> >> >
> >> >
> >> > On Fri, Nov 5, 2010 at 5:55 PM, Nathan <nathan@webr3.org> wrote:
> >> > > Mike Kelly wrote:
> >> > >>
> >> > >> http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-12#page-14
> >> > >
> >> > > snipped and fuller version inserted:
> >> > >
> >> > >   4.  If the response has a Content-Location header field, and that URI
> >> > >       is not the same as the effective request URI, then the response
> >> > >       asserts that its payload is a representation of the resource
> >> > >       identified by the Content-Location URI.  However, such an
> >> > >       assertion cannot be trusted unless it can be verified by other
> >> > >       means (not defined by HTTP).
> >> > >
> >> > >> If a client wants to make a statement  about the specific document
> >> > >> then a response that includes a content-location is giving you the
> >> > >> information necessary to do that correctly. It's complemented and
> >> > >> further clarified in the entity body itself through something like
> >> > >> isDescribedBy.
> >> > >
> >> > > I stand corrected, think there's something in this, and it could maybe
> >> > > possibly provide the semantic indirection needed when Content-Location is
> >> > > there, and different to the effective request uri, and complemented by some
> >> > > statements (perhaps RDF in the body, or Link header, or html link element)
> >> > > to assert the same.
> >> > >
> >> > > Covers a few use-cases, might have legs (once HTTP-bis is a standard?).
> >> > >
> >> > > Nicely caught Mike!
> >> > >
> >> > > Best,
> >> > >
> >> > > Nathan
> >> > >
> >> > >
> >> >
> >>
> >
> >
> >
>
Received on Monday, 8 November 2010 08:12:50 UTC