Re: Using HTTP Headers to Request Descriptors (was RE: [XRI] Back to XRI) from John Bradley on 2008-09-13 (www-tag@w3.org from September 2008)

From: John Bradley <john.bradley@wingaa.com>
Date: Sat, 13 Sep 2008 14:04:47 -0700
To: "Jonathan Rees" <jar@creativecommons.org>
Cc: www-tag@w3.org, Drummond Reed <drummond.reed@cordance.net>
Message-Id: <D08CA5CE-6126-427E-8520-38265298795A@wingaa.com>
Hi Jonathan,

Thanks for the thoughtful response.

Let me step back and frame some of the dialog with the TAG for you.

XRI syntax is defined in http://docs.oasis-open.org/xri/xri-syntax/2.0/specs/cs01/xri-syntax-V2.0-cs.html

XRI is intended to be an identifier syntax for "non-information
resources".
The syntax allows for persistent and non-persistent identifiers that
can be differentiated by inspection of the identifier.
The syntax also allows other identifiers to be used as cross-references;
this includes IRIs/URIs of any scheme as well as non-IRI/URI identifiers.

There is an XRI meta-data discovery protocol defined in http://docs.oasis-open.org/xri/xri-resolution/2.0/specs/cs01/xri-resolution-V2.0-cs-01.html
XRI resolution is intended to be a meta-data discovery protocol that
allows for the selection and retrieval of XML-formatted meta-data
regarding the XRI identifier.
The discovery process uses XRD-formatted XML documents to navigate to
the requested meta-data for the "non-information resource".
The discovery/resolution process is designed to be protocol
agnostic. It deals with the retrieval and processing of the XRD
documents.
There are http: and https: bindings for the XRI resolver. Work on
SS7, XMPP, and SFTP bindings is also underway.

The XRD documents and their assembly into XRDS documents by a resolver
introduce a polymorphic quality between the XRI identifiers and the
meta-data that describes the "non-information resource".
In my case two XRIs, =jbradley and =ve7jtb, both resolve to the same
meta-data. This is achieved by utilizing a canonical ID expressed in the
XRD and verified in the XRDS chain.
You can see this polymorphism being used in OpenID.

XRI resolution also defines HXRI as a way of expressing an XRI as an
http(s): scheme URI via a proxy server.
This enables you to use https://xri.net/=jbradley in a web application
where something may attempt to treat it as an "information resource".

Simple enough, but for the fact that in the specs XRIs are described in
terms of their own IRI scheme, xri:.

This caused the TAG to issue http://lists.w3.org/Archives/Public/www-tag/2008May/0078
This ultimately resulted in the defeat of the XRI specs as OASIS Standards.

We are currently working with the TAG to understand their contention that
"We are not satisfied that XRIs provide functionality not readily
available from http: URIs."

XRIs are more closely related conceptually to urn: than http: in the  
opinion of the XRI-TC.

We are trying to see if and how the qualities expressed in XRI syntax  
and meta-data discovery are readily available in http: URI.

One approach that seems the most promising comes from David Booth and  
involves setting aside part of the http(s): entity space so that it  
can have more specific semantics.

As an example the DNS authority XRI.NET can say that all the  
identifiers under it are for "non-information resources"  and that XRI  
syntax regarding persistence and cross references can be interpreted  
as such by inspection of the URI.

This is being referred to as a sub scheme or http: URI profile.
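
To make "by inspection of the URI" concrete, here is a minimal Python sketch of how a client might recognise such a sub-scheme. It assumes a hypothetical set of authorities that have declared the profile; only xri.net comes from this message, and the function and set names are illustrative rather than taken from any spec.

```python
from urllib.parse import urlsplit

# Hypothetical registry of DNS authorities that have declared an http:
# sub-scheme. Only xri.net is named in the message; how such a registry
# would be published is an open question.
XRI_PROFILE_AUTHORITIES = {"xri.net"}


def is_xri_profile_uri(uri: str) -> bool:
    """Return True if the URI falls under a declared XRI sub-scheme authority."""
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        return False
    # urlsplit lower-cases the hostname, so XRI.NET and xri.net both match.
    return parts.hostname in XRI_PROFILE_AUTHORITIES


print(is_xri_profile_uri("https://xri.net/=jbradley"))  # True
print(is_xri_profile_uri("http://yahoo.com"))           # False
```

A client that does not know the registry would still treat both URIs as ordinary http: URIs, which is the concern raised further down.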

One of the open questions is: is a documented system of http:
sub-schemes preferable to people registering new URI schemes?

There is this uneasy fit between http: as a protocol and the notions  
of XRI identifiers.

We do understand that a string of characters prefixed by http: can be
used as an opaque identifier in protocols other than http:. In fact,
that's the principle behind XRI cross-references.

So is it important whether the meta-data discovery service uses
xri://=jbradley or https://xri.net/=jbradley as its input, as long as it
understands that both are XRIs?
No, I don't think that it makes any difference to the discovery service.
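
A rough Python sketch of that point: a discovery service could reduce either form to the same bare XRI before resolving it. The prefix list and the function name are assumptions made for illustration, and query strings are deliberately not handled.

```python
# Prefixes assumed for illustration: the native scheme and the xri.net proxy.
XRI_PREFIXES = ("xri://", "https://xri.net/", "http://xri.net/")


def to_xri(identifier: str) -> str:
    """Strip the xri:// scheme or the xri.net proxy prefix, leaving the bare XRI."""
    lowered = identifier.lower()
    for prefix in XRI_PREFIXES:
        if lowered.startswith(prefix):
            return identifier[len(prefix):]
    raise ValueError(f"not recognised as an XRI: {identifier!r}")


print(to_xri("xri://=jbradley"))            # =jbradley
print(to_xri("https://xri.net/=jbradley"))  # =jbradley
```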

Where it gets tricky is not violating the principles of AWWW if the  
identifier gets used outside of the XRI identifier context.
If it has its own scheme, browsers etc. have a known way to deal with it
as a known or unknown scheme.
As an http: scheme URI, unless they understand these new sub-scheme rules,
they will assume it is a URI related to an "information resource".

Trying to address this leads us to Link headers and 303 redirects.

Where I think we have the largest problem is with XRDS-Simple and some
of their use cases, which use non-sub-scheme URIs like http://yahoo.com as
"non-information resources" while still being an "information
resource" for conventional web browser access.

Perhaps it is just a limitation of http: the protocol that we can't  
resolve.

At the moment XRDS-Simple is overloading content negotiation.  That  
isn't working for larger sites due to the obvious caching issues.
They are currently heading down the road of using a custom header to  
indicate that Link headers should be included in the response.
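
One plausible reading of the caching issue, sketched in Python with a toy origin and a toy cache (both hypothetical): if an intermediary keys its cache on the URL alone, whichever representation is fetched first is served to everyone, regardless of the Accept header.

```python
def origin(url: str, accept: str) -> str:
    """Toy origin that overloads Accept: XRDS for discovery clients, HTML otherwise."""
    if accept == "application/xrds+xml":
        return "<XRDS>...</XRDS>"
    return "<html>...</html>"


class NaiveCache:
    """Toy shared cache that keys only on the URL and ignores request headers."""

    def __init__(self):
        self.store = {}

    def get(self, url: str, accept: str) -> str:
        if url not in self.store:
            self.store[url] = origin(url, accept)
        return self.store[url]


cache = NaiveCache()
first = cache.get("http://example.com/", "application/xrds+xml")
second = cache.get("http://example.com/", "text/html")
print(second)  # <XRDS>...</XRDS>  (the browser gets the cached descriptor)
```

A cache that varies on Accept avoids the wrong answer but stores several copies per URL, which is presumably part of why this scales poorly for larger sites.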

So to your question we do have a discovery protocol that could be used  
with XRI in http: form if we formalize a sub scheme.

Currently the hxri proxy server uses query parameters to differentiate  
between the requests for different types of meta data.
As an example the xrds document for =jbradley and a 303 redirect to  
the blog of =jbradley.
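
A small Python sketch of building the two kinds of proxy request. The parameter name `_xrd_r` and the `application/xrds+xml` media type are assumptions here, not quoted from this message, and should be checked against the XRI Resolution 2.0 spec; the helper name is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

PROXY = "https://xri.net/"


def hxri_request_url(xri: str, output_media_type: Optional[str] = None) -> str:
    """Build a proxy URL; a query parameter selects the kind of meta-data wanted."""
    url = PROXY + xri
    if output_media_type is None:
        # No parameter: leave it to the proxy to redirect to a default service.
        return url
    # "_xrd_r" is an assumed parameter name, see the note above.
    return url + "?" + urlencode({"_xrd_r": output_media_type})


print(hxri_request_url("=jbradley"))
print(hxri_request_url("=jbradley", "application/xrds+xml"))
```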

There is work to do defining how this fits with AWWW in some sensible  
way.
At the moment, as you point out, the mechanisms we have been
directed to for http: compatibility are:
1. Link headers: not in current http: and only a proposal.
2. 303 redirects: new and lacking consensus.
3. http: sub-schemes: still a rough idea.

If XRI retains its native discovery process and we sort out how to use  
this functionality readily available from http: URI,  we should not  
require an xri: scheme.

Your help and feedback in the process will be greatly appreciated.

Thanks again
John Bradley



On 13-Sep-08, at 10:43 AM, Jonathan Rees wrote:

> Using *any* kind of special header to try to make a GET do something  
> other than getting a representation of the denoted information  
> resource (speaking TAGese now) would violate the HTTP 1.1 RFC. This  
> applies equally to content negotiation, a client-provided Link:  
> header, and any other similar hack. Not only would it violate the  
> RFC but it would risk confusing caches, proxies, and other  
> intermediaries and tools.
>
> If you want to use a protocol that is beginning to get traction with  
> both servers and clients, and works for any referent, use 303. (303  
> works fine with information resources, except that a client has to  
> do something other than a direct GET to obtain a representation -  
> basically use a new convention, on which there is currently no  
> consensus. This may or may not be a fatal flaw, depending on  
> application requirements.)
>
> If you want something that seems to be accepted and may get traction  
> in the near future, and works equally with 20x and 30x responses for  
> information resources, use Link:  The use I had imagined for Link:  
> was not to directly give metadata details such as dc:creator in  
> Link: response headers, but rather to do something 303-like and have  
> a single Link: header giving the location of a metadata record.
>
> If you want to avoid the extra round trip, you cannot use HTTP GET,  
> period. Metadata services and protocols abound, but the only such  
> protocols I'm familiar with that are deployed fairly directly using  
> HTTP are URIQA and ARK. Personally I find URI manipulation (a la  
> ARK) to be very appealing. I am sorry that I didn't get why this was  
> ruled out; that's probably my fault for not paying attention, since  
> someone did explain it. I'll look again.
>
> An alternative to MGET is PROPFIND, which I haven't investigated.  
> PROPFIND might have the advantage of being able to leverage existing  
> DAV protocol stacks.
>
> There is nothing that says you can't use your own protocol with  
> http: URIs. The argument is that if you use http:, then you will be  
> "on the web" in a way that you wouldn't be otherwise. But URIs are  
> just strings after all, suitable for use as keys into any kind of  
> index, and nothing says that HTTP is the only way to use them. For  
> example, http: URIs are used with the SPARQL protocol quite nicely;  
> no one has to do a GET of the URIs in question in order to get  
> information about their referents.
>
> If you are headed in the direction of using some protocol or  
> protocol variant with little current deployment anyhow, then there  
> is no reason not to use URIQA, SPARQL, web services, or anything  
> else. If you already have your own protocol with implementations,  
> there's no particular reason not to adapt it to http: URIs. You will  
> be no worse off than you were before, and the syntactic use of http:  
> gives you the advantage that clients that don't speak your protocol  
> have a fighting chance of doing something useful with the URIs.
>
> You have stated the ability to distinguish XRIs from other URIs as a  
> requirement, and I assume that, this having been met, an XRI-enabled  
> client could decide whether to apply whatever special metadata  
> access protocol you care to use. If you want to be able to somehow  
> exploit clients that are not currently XRI-enabled then 303 and  
> Link: are probably your best bet in the short run.
>
> I'm not following the entire discussion in detail and apologize if I  
> have overlooked some additional requirement that would speak against  
> any of the above.
>
> I agree with you that a uniform, widely deployed metadata protocol  
> not requiring that extra round trip and not specific to any  
> particular class of URIs (such as XRIs or HXRIs) would be a nice  
> thing, but while technically this is very easy, socially and  
> politically it does not seem to be coming about. If you want to  
> discuss this, or to design and promote something that could be used  
> not just for HXRIs but for all URIs (or all http: URIs), I wouldn't  
> object.
>
> Best
> Jonathan
>
> On Sat, Sep 13, 2008 at 1:02 AM, Drummond Reed <drummond.reed@cordance.net> wrote:
> John,
>
> Thanks for the examples. They help illustrate the concept very  
> clearly. I agree it seems quite solid, but I am not an HTTP expert  
> (and don't even try to play one on Internet TV ;-), so I'm keenly  
> interested in what TAG members and others on the list think.
>
> =Drummond
>
> From: John Bradley [mailto:john.bradley@wingaa.com]
> Sent: Friday, September 12, 2008 9:41 PM
> To: Drummond Reed
> Cc: 'Booth, David (HP Software - Boston)'; www-tag@w3.org
> Subject: Re: Using HTTP Headers to Request Descriptors (was RE:  
> [XRI] Back to XRI)
>
> Drummond,
>
> I think you have the essence of the idea.
>
> XRDS-Simple uses the mime type to indicate that you want the meta  
> data for the object.
>
> I think that TimBL is correct that this is an inappropriate
> overloading of the Accept header.
>
> People do it because there doesn't seem to be another good alternative.
>
> I found a reference to Nokia's MGET, but asking to add new methods to
> HTTP might make XRI look like a good idea in comparison (to some
> people).
>
> http://sw.nokia.com/uriqa/URIQA.html
>
>
> Repeating what I said above: Don't confuse identifier syntax with  
> protocol. The two are almost orthogonal. You can compare URIQA with  
> some XRI-specific protocol, and indeed the XRI protocol might come  
> out ahead; I wouldn't argue one way or the other. But the protocol  
> comparison has no bearing on the choice of identifier syntax (http:  
> versus xri: scheme) since either protocol would work equally well  
> with either kind of identifier (although an existing xri protocol  
> might need tweaks).
>
> I want a simple way to directly request the meta-data that is  
> associated with a "non-information resource"
>
> Apologies in advance to Ray Denninburg for butchering a book example.
>
> If the non-information resource is a book, for instance, say "So long,
> and thanks for all the fish"
>
> Let's take the WorldCat URI as the example identifier.
>
> http://www.worldcat.org/isbn/0517554399
>
> This is referred to by different people as an "abstract identifier",
> "identifier of abstract resources", and "non-information resource
> identifier"
>
> If I want meta data for the author of the resource I can do a HEAD
> on the URI and inspect the Link headers to find something like:
>
> Link: <http://www.worldcat.org/search?q=au%3ADouglas+Adams>; rel="http://purl.org/dc/elements/1.1/author"
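
On the client side, picking the target out of such a response header could look like the Python sketch below. The parser is deliberately simplified (it is not a complete implementation of the Link header grammar) and the function name is hypothetical.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+)\s*)*)')
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def find_links(header_value: str, rel: str) -> list:
    """Return the target URIs in a Link header value whose rel matches."""
    targets = []
    for match in LINK_RE.finditer(header_value):
        target, params = match.group(1), match.group(2)
        for name, quoted, bare in PARAM_RE.findall(params):
            value = quoted or bare
            # rel may carry several space-separated relation types.
            if name.lower() == "rel" and rel in value.split():
                targets.append(target)
    return targets


header = ('<http://www.worldcat.org/search?q=au%3ADouglas+Adams>; '
          'rel="http://purl.org/dc/elements/1.1/author"')
print(find_links(header, "http://purl.org/dc/elements/1.1/author"))
```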
>
> If I knew in advance that I wanted the author meta-data I could  
> include a Link Header in the request such as:
>
> Link:  rel="http://purl.org/dc/elements/1.1/author"
>
> This asks the server for the Author metadata by referencing Dublin
> Core.
>
> The server can include the above link header in the response and  
> give me a 303 redirect to the URI for the Author meta-data.
>
> In some cases the server may reply directly with the meta data but  
> still include the link header for the URI to directly access the  
> meta-data resource.
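
The server side of this proposal can be sketched in Python as a pure function from request headers to a status and response headers. This illustrates the idea under discussion, not existing HTTP behaviour; the descriptor table and function name are hypothetical, and only the redirect case is shown, not the direct-reply variant.

```python
import re

# Hypothetical descriptor table for one resource: relation URI -> meta-data URI.
DESCRIPTORS = {
    "http://purl.org/dc/elements/1.1/author":
        "http://www.worldcat.org/search?q=au%3ADouglas+Adams",
}


def respond(request_headers: dict) -> tuple:
    """Return (status, headers), honouring a rel= given in a request Link header."""
    match = re.search(r'rel\s*=\s*"([^"]*)"', request_headers.get("Link", ""))
    target = DESCRIPTORS.get(match.group(1)) if match else None
    if target is None:
        # No request Link header, or an unknown relation: respond as usual.
        return 200, {}
    link_value = f'<{target}>; rel="{match.group(1)}"'
    # Redirect to the meta-data and repeat the link in the response.
    return 303, {"Location": target, "Link": link_value}


status, headers = respond({"Link": 'rel="http://purl.org/dc/elements/1.1/author"'})
print(status, headers["Location"])
```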
>
> In the real world the web is not a friendly place for non-information  
> resource identifiers so my example is a bit contrived.
>
> I am trying to imagine something that fits the XRDS-Simple/HXRI use  
> cases but is general enough to be used by others within the AWWW  
> architecture.
>
> Having the web server take SPARQL queries in the header would be  
> interesting but overkill.
>
> Using the link header as a way to make a query or do metadata  
> negotiation as some may put it,  seems like a reasonable proposition.
>
> I am entirely open to and interested in counter proposals
>
> Regards
>
> John Bradley
>
> On 12-Sep-08, at 4:55 PM, Drummond Reed wrote:
>
> John,
>
> I changed the subject line because the approach you suggest for  
> using an HTTP Link header to explicitly request a description of a  
> resource ("descriptor") seems particularly promising. "Finding  
> Resource Descriptions" [1] has many good references to discussions  
> around this topic, but most of them seem focused on how to return  
> links to descriptors in HTTP responses, not how to explicitly  
> request them. Other resource descriptor formats like POWDER [2] also
> seem to focus on Link headers in responses vs. requests.
>
> What I like about putting this semantics in a request header is that  
> it could be explicitly defined to mean: "If possible, give me (or  
> redirect me to) a descriptor of the target resource that has this  
> specified relationship (rel= value) to the target resource." And if  
> the value of the rel attribute was a URI, then there would be no  
> limit to the types of descriptors that could be requested and  
> potentially returned directly, without any extra round trips and  
> with very precise semantics.
>
> In effect, it would be like the client explicitly asking the server  
> for a 303, but being able to specify the precise type of related  
> resource the client is seeking, and for the server to actually  
> return that resource directly if it has the ability to do so. And  
> because the semantics would be explicit that the client is asking  
> for a descriptor of the resource and not the resource itself, it  
> would get around the problem described near the end of [1]:
>
>             "If you ask for RDF, you get the description. If you ask  
> for something else, you get the thing described. (The TAG, TimBL,  
> and others have pointed out that this contradicts web architecture,  
> which requires that content negotiation choose among things that all  
> carry the same information. That goes for CN between RDF and HTML as  
> much as it does for CN between GIF and JPEG.)"
>
> Do I understand this correctly?
>
> =Drummond
>
> [1] http://esw.w3.org/topic/FindingResourceDescriptions
>
> [2] http://www.w3.org/TR/powder-dr/
>
>
Received on Saturday, 13 September 2008 21:05:35 UTC