Re: Using HTTP Headers to Request Descriptors (was RE: [XRI] Back to XRI) from Jonathan Rees on 2008-09-13 (www-tag@w3.org from September 2008)

From: Jonathan Rees <jar@creativecommons.org>
Date: Sat, 13 Sep 2008 13:43:42 -0400
To: "Drummond Reed" <drummond.reed@cordance.net>
Cc: "John Bradley" <john.bradley@wingaa.com>, "Booth, David (HP Software - Boston)" <dbooth@hp.com>, www-tag@w3.org
Message-ID: <760bcb2a0809131043q5f48144i24062cec1261b3cd@mail.gmail.com>
Using *any* kind of special header to try to make a GET do something other
than getting a representation of the denoted information resource (speaking
TAGese now) would violate the HTTP 1.1 RFC. This applies equally to
content negotiation, a client-provided Link: header, and any other similar
hack. Not only would it violate the RFC but it would risk confusing caches,
proxies, and other intermediaries and tools.

If you want to use a protocol that is beginning to get traction with both
servers and clients, and works for any referent, use 303. (303 works fine
with information resources, except that a client has to do something other
than a direct GET to obtain a representation - basically use a new
convention, on which there is currently no consensus. This may or may not be
a fatal flaw, depending on application requirements.)
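
To make the 303 convention concrete, here is a minimal sketch in Python of
how a client might interpret a response. The function name, the example.org
URIs, and the plain dict of headers are assumptions made for illustration;
this is not any standard API.

```python
from urllib.parse import urljoin


def interpret_response(request_uri, status, headers):
    """Classify a GET response under the 303 convention.

    Returns (kind, uri), where kind is 'representation', 'description',
    or 'other'. `headers` is assumed to be a dict keyed by canonical
    header names (hypothetical input shape).
    """
    if 200 <= status < 300:
        # A 2xx answer carries a representation of the resource itself.
        return ("representation", request_uri)
    if status == 303 and "Location" in headers:
        # 303 points elsewhere; the target describes the referent and
        # costs the client a second GET (the extra round trip).
        return ("description", urljoin(request_uri, headers["Location"]))
    return ("other", None)
```

A client would then issue a second GET on the returned URI when the kind is
'description'.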

If you want something that seems to be accepted and may get traction in the
near future, and works equally with 20x and 30x responses for information
resources, use Link: headers. The use I had imagined for Link: was not to directly
give metadata details such as dc:creator in Link: response headers, but
rather to do something 303-like and have a single Link: header giving the
location of a metadata record.
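
As a rough sketch of that 303-like use, the following Python pulls the
location of a metadata record out of a Link: response header value. The
relation name "describedby", the example.org URIs, and the function name are
assumptions for illustration, and the parser handles only the simple
<uri>; name="value" form, not every legal header syntax.

```python
import re

# One link value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^\s;,]+))*)')
# One parameter: name="quoted value" or name=token.
_PARAM = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^\s;,]+))')


def find_links(header_value, rel):
    """Return the targets of all links in a Link: header value whose
    rel parameter includes the given relation name."""
    targets = []
    for link in _LINK.finditer(header_value):
        target, params = link.group(1), link.group(2)
        for param in _PARAM.finditer(params):
            name = param.group(1)
            value = param.group(2) if param.group(2) is not None else param.group(3)
            # A rel value may list several space-separated relations.
            if name.lower() == "rel" and rel in value.split():
                targets.append(target)
    return targets
```

With a header value such as
<http://example.org/meta/book/1>; rel="describedby", the client would then
GET the single returned URI to fetch the metadata record.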

If you want to avoid the extra round trip, you cannot use HTTP GET, period.
Metadata services and protocols abound, but the only such protocols I'm
familiar with that are deployed fairly directly using HTTP are URIQA and
ARK. Personally I find URI manipulation (a la ARK) to be very appealing. I
am sorry that I didn't get why this was ruled out; that's probably my fault
for not paying attention, since someone did explain it. I'll look again.

An alternative to MGET is PROPFIND, which I haven't investigated. PROPFIND
might have the advantage of being able to leverage existing DAV protocol
stacks.

There is nothing that says you can't use your own protocol with http: URIs.
The argument is that if you use http:, then you will be "on the web" in a
way that you wouldn't be otherwise. But URIs are just strings after all,
suitable for use as keys into any kind of index, and nothing says that HTTP
is the only way to use them. For example, http: URIs are used with the
SPARQL protocol quite nicely; no one has to do a GET of the URIs in question
in order to get information about their referents.
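
A small Python sketch of the SPARQL case: the http: URI appears only as a
key inside the query, and the HTTP request goes to a query endpoint rather
than to the URI itself. The endpoint address and function name are
hypothetical, and the endpoint is assumed to have no query string of its
own.

```python
from urllib.parse import urlencode


def describe_request_url(endpoint, resource_uri):
    """Build the GET URL that asks a SPARQL endpoint to describe a resource.

    The resource URI is never dereferenced; it is just a string embedded
    in the query text.
    """
    # Reject characters that cannot appear inside <...> in SPARQL.
    if any(c in resource_uri for c in '<>"{}|^`\\ '):
        raise ValueError("not usable as a SPARQL IRI reference: %r" % resource_uri)
    query = "DESCRIBE <%s>" % resource_uri
    # The SPARQL protocol passes the query text in the 'query' parameter.
    return endpoint + "?" + urlencode({"query": query})
```

For example, describe_request_url("http://example.org/sparql",
"http://www.worldcat.org/isbn/0517554399") yields a URL on example.org; no
request is ever sent to worldcat.org.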

If you are headed in the direction of using some protocol or protocol
variant with little current deployment anyhow, then there is no reason not
to use URIQA, SPARQL, web services, or anything else. If you already have
your own protocol with implementations, there's no particular reason not to
adapt it to http: URIs. You will be no worse off than you were before, and
the syntactic use of http: gives you the advantage that clients that don't
speak your protocol have a fighting chance of doing something useful with
the URIs.

You have stated the ability to distinguish XRIs from other URIs as a
requirement, and I assume that, this having been met, an XRI-enabled client
could decide whether to apply whatever special metadata access protocol you
care to use. If you want to be able to somehow exploit clients that are not
currently XRI-enabled then 303 and Link: are probably your best bet in the
short run.

I'm not following the entire discussion in detail and apologize if I have
overlooked some additional requirement that would speak against any of the
above.

I agree with you that a uniform, widely deployed metadata protocol not
requiring that extra round trip and not specific to any particular class of
URIs (such as XRIs or HXRIs) would be a nice thing, but while technically
this is very easy, socially and politically it does not seem to be coming
about. If you want to discuss this, or to design and promote something that
could be used not just for HXRIs but for all URIs (or all http: URIs), I
wouldn't object.

Best
Jonathan

On Sat, Sep 13, 2008 at 1:02 AM, Drummond Reed
<drummond.reed@cordance.net> wrote:

>  John,
>
> Thanks for the examples. They help illustrate the concept very clearly. I
> agree it seems quite solid, but I am not an HTTP expert (and don't even try
> to play one on Internet TV ;-), so I'm keenly interested in what TAG members
> and others on the list think.
>
> =Drummond
>   ------------------------------
>
> *From:* John Bradley [mailto:john.bradley@wingaa.com]
> *Sent:* Friday, September 12, 2008 9:41 PM
> *To:* Drummond Reed
> *Cc:* 'Booth, David (HP Software - Boston)'; www-tag@w3.org
> *Subject:* Re: Using HTTP Headers to Request Descriptors (was RE: [XRI]
> Back to XRI)
>
> Drummond,
>
> I think you have the essence of the idea.
>
> XRDS-Simple uses the mime type to indicate that you want the meta data for
> the object.
>
> I think that TimBL is correct that this is an inappropriate overloading of
> the Accept header.
>
> People do it because there doesn't seem to be another good alternative.
>
> I found a reference to Nokia's MGET, but asking to add new methods to HTTP
> might make XRI look like a good idea in comparison (to some people).
>
> http://sw.nokia.com/uriqa/URIQA.html
>

Repeating what I said above: Don't confuse identifier syntax with protocol.
The two are almost orthogonal. You can compare URIQA with some XRI-specific
protocol, and indeed the XRI protocol might come out ahead; I wouldn't argue
one way or the other. But the protocol comparison has no bearing on the
choice of identifier syntax (http: versus xri: scheme) since either protocol
would work equally well with either kind of identifier (although an existing
xri protocol might need tweaks).

>   I want a simple way to directly request the meta-data that is associated
> with a "non-information resource"
>
> Apologies in advance to Ray Denninburg for butchering a book example.
>
> If the non-information resource is a book for instance say "So long, and
> thanks for all the fish"
>
> Let's take the WorldCat URI as the example identifier.
>
> http://www.worldcat.org/isbn/0517554399
>
> This is referred to by different people as an "abstract identifier",
> "identifier of abstract resources", and "non-information
> resource identifier"
>
> If I want meta data for the author of the resource I can do a HEAD on the
> URI and inspect the Link headers to find something like:
>
> Link: <http://www.worldcat.org/search?q=au%3ADouglas+Adams>; rel="http://purl.org/dc/elements/1.1/author"
>
> If I knew in advance that I wanted the author meta-data I could include a
> Link Header in the request such as:
>
> Link:  rel="http://purl.org/dc/elements/1.1/author"
>
> This asks the server for the Author metadata by referencing Dublin Core.
>
> The server can include the above link header in the response and give me a
> 303 redirect to the URI for the Author meta-data.
>
> In some cases the server may reply directly with the meta data but still
> include the link header for the URI to directly access the meta-data
> resource.
>
> In the real world the web is not a friendly place for non-information
> resource identifiers so my example is a bit contrived.
>
> I am trying to imagine something that fits the XRDS-Simple/HXRI use cases
> but is general enough to be used by others within the AWWW architecture.
>
> Having the web server take SPARQL queries in the header would be
> interesting but overkill.
>
> Using the Link header as a way to make a query, or do metadata negotiation
> as some may put it, seems like a reasonable proposition.
>
> I am entirely open to and interested in counter proposals.
>
> Regards
>
> John Bradley
>
> On 12-Sep-08, at 4:55 PM, Drummond Reed wrote:
>
> John,
>
> I changed the subject line because the approach you suggest for using an
> HTTP Link header to explicitly request a description of a resource
> ("descriptor") seems particularly promising. "Finding Resource Descriptions"
> [1] has many good references to discussions around this topic, but most of
> them seem focused on how to return links to descriptors in HTTP responses,
> not how to explicitly request them. Other resource descriptor formats like
> POWDER [2] also seem to focus on Link headers in responses vs. requests.
>
> What I like about putting this semantics in a request header is that it
> could be explicitly defined to mean: "If possible, give me (or redirect me
> to) a descriptor of the target resource that has this specified relationship
> (rel= value) to the target resource." And if the value of the rel attribute
> was a URI, then there would be no limit to the types of descriptors that
> could be requested and potentially returned directly, without any extra
> round trips and with very precise semantics.
>
> In effect, it would be like the client explicitly asking the server for a
> 303, but being able to specify the precise type of related resource the
> client is seeking, and for the server to actually return that resource
> directly if it has the ability to do so. And because the semantics would be
> explicit that the client is asking for a descriptor of the resource and not
> the resource itself, it would get around the problem described near the end
> of [1]:
>
>             "If you ask for RDF, you get the description. If you ask for
> something else, you get the thing described. (The TAG, TimBL, and others
> have pointed out that this contradicts web architecture, which requires that
> content negotiation choose among things that all carry the same information.
> That goes for CN between RDF and HTML as much as it does for CN between GIF
> and JPEG.)"
>
> Do I understand this correctly?
>
> =Drummond
>
> [1] http://esw.w3.org/topic/FindingResourceDescriptions
>
> [2] http://www.w3.org/TR/powder-dr/
>
Received on Saturday, 13 September 2008 17:44:22 UTC