Re: Request for feedback: HTTP-based Resource Descriptor Discovery (2) from Jonathan Rees on 2009-01-29 (www-tag@w3.org from January 2009)

From: Jonathan Rees <jar@creativecommons.org>
Date: Thu, 29 Jan 2009 12:29:39 -0500
To: Eran Hammer-Lahav <eran@hueniverse.com>
Cc: "www-tag@w3.org WG" <www-tag@w3.org>
Message-Id: <64890ED7-B78C-49EC-80D3-34C83C8AFA11@creativecommons.org>
Left out a couple of other comments. If you forward my previous  
messages, tack these on the end.

By the way I hope it is obvious that I'm speaking for myself, not for  
the TAG.

- Under <link> element (section 7), please include XHTML along with  
HTML (this came up on a TAG telecon). I'm not sure either of these is  
"fully compatible with" the link header, but the DR discovery protocol  
doesn't need full compatibility, it only needs a way to express  
describedby links. But neither format specifies 'describedby' or leads  
one to the new registry that will contain it. For XHTML with RDFa, you  
can write describedby by giving a CURIE for the relation URI, so  
that's OK. I have no idea what HTML says about link relations that  
aren't explicitly documented, but I would expect a problem, from a  
standards point of view if not in practice.

- I understand that we desire to stay away from a rigorous treatment  
of authentication, authority, and authorization, leaving that up  
either to risk acceptance or an orthogonal security infrastructure.  
However, we need to specify what the protocol's position is on  
attribution, in the situation where communication *is* secure and/or  
risks are accepted.

<link> has problems in this regard that Link: and site-meta don't.  
Although in the normal case a document speaks for the owner of the URI  
that names it, there are important cases where this doesn't hold. One  
is where the resource is obsolete, so that what it said before is no  
longer true. This is not just a mistake to be fixed, as faithfully  
retaining unmodified old versions is often very important. Another is  
where the URI owner is acting as an archiving or mirroring service and  
is bound by expectation or even by contract to serve information that  
it itself may not believe. Examples would include the Internet  
Archive, W3C, IETF, and OCLC (purl.org).

From a communication point of view, <link> is the best of the three  
methods to link to a DR since there is the least risk that it will get  
detached from the representation. (Even better would be a way to  
include a DR in the document itself -- this is in effect a goal of  
RDFa.) But from an attribution point of view, it is weak, since there  
are many normal situations in which the <link> to the DR, and  
therefore the DR itself, may come to not speak for the URI owner.
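
As an illustration only, here is a rough Python sketch of how a client
might pull describedby targets out of a Link: header value and out of
<link> elements in an (X)HTML representation. The function and class
names are hypothetical, the example URIs are made up, and the header
parsing is deliberately simplified (it ignores commas and semicolons
inside quoted parameter values); none of this comes from the draft.

```python
import re
from html.parser import HTMLParser

# Simplified pattern for one link-value: <target> followed by ;param pairs.
# Assumption: no commas or semicolons inside quoted parameter values.
LINK_VALUE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')
REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";]*)"?', re.IGNORECASE)


def describedby_from_link_header(header_value):
    """Return target URIs of rel=describedby entries in a Link: header value."""
    targets = []
    for uri, params in LINK_VALUE.findall(header_value):
        rel = REL_PARAM.search(params)
        # rel may hold several space-separated relation types.
        if rel and "describedby" in rel.group(1).lower().split():
            targets.append(uri)
    return targets


class DescribedByLinks(HTMLParser):
    """Collect href values of <link rel="describedby"> elements."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # Also reached for XHTML-style <link ... /> via handle_startendtag.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "describedby" in rels and attributes.get("href"):
            self.targets.append(attributes["href"])


def describedby_from_html(markup):
    """Return describedby targets found in <link> elements of the markup."""
    parser = DescribedByLinks()
    parser.feed(markup)
    parser.close()
    return parser.targets
```

The two functions return the same kind of result, which is the point:
the <link> route travels inside the representation, so it survives
archiving and mirroring along with the document, whereas the Link:
route is decided afresh by whoever serves each response.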

Since we're not talking about protecting against security attacks  
here, but rather against well-meaning misinterpretation that might  
happen even in a fully secure, authenticated context, this is not a  
security problem. But your memo does talk about authority (here I  
think we mean what statements can be put in the mouths of what  
principals) as if it's a question it cares about. I think the problem  
of whether <link> speaks for the URI owner ought to be addressed  
somehow. You could deny that the protocol, correctly executed, says  
anything about it; or you could give a warning about the risks. But  
given that you've invoked the word "authority" elsewhere and that  
(IMO) this is not a question of protection against attack, I think  
it's dangerous to leave the disclaimer to the 'security  
considerations' section.

Option 1 (maybe for intro): The protocol is one particular way to lead  
you from a URI to a single DR. It is what it is. The DR has no  
particular standing by virtue of resulting from the protocol, even in  
the absence of attack. To authenticate the DR as speaking for anyone  
in particular, you have to do something else.

Option 2: Same as above, except that DRs obtained via Link: or site- 
meta (not <link>!) speak for the URI owner, assuming correct and  
secure implementation of this protocol. I think this is my preference  
as it is much stronger -- it lets you hear from the URI owner, even if  
you don't know who they are!
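
To make the difference between the two options concrete, here is a
small Python sketch of the attribution rule each one implies. The names
(Source, speaks_for_uri_owner) are hypothetical, and the function
assumes what Option 2 assumes: a correct and secure implementation of
the protocol.

```python
from enum import Enum


class Source(Enum):
    """How the DR link was found (the three methods discussed above)."""
    LINK_HEADER = "Link: header"
    SITE_META = "site-meta"
    LINK_ELEMENT = "<link> element"


def speaks_for_uri_owner(source, option):
    """Say whether the protocol itself lets a client attribute the DR
    to the URI owner, under Option 1 or Option 2."""
    if option == 1:
        # Option 1: the protocol confers no standing on the DR at all.
        return False
    if option == 2:
        # Option 2: only Link: and site-meta carry the owner's voice;
        # <link> is excluded because the document may be archived,
        # mirrored, or obsolete.
        return source in (Source.LINK_HEADER, Source.SITE_META)
    raise ValueError("option must be 1 or 2")
```

Under Option 1 a client always has to authenticate the DR by some
other means; under Option 2 it only has to do so for DRs found via
<link>.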

"URI owner" (defined in AWWW) is a bit fuzzy of course. For HTTP it's  
the same as the agent that gets to decide what HTTP responses to  
deliver for the URI. Figuring out who this is is tricky given that  
delegation (originating from the domain name owner) can be secret and/ 
or informal. But for the "identification" application you don't care  
who the owner is, only what they say that bears on what AWWW gives  
them authority over (URI/resource binding and delegation of URI  
ownership).

Ideally we'd have some kind of MUST somewhere: If a server  
implementing this protocol provides a DR URI, then the DR must speak  
for (be authorized by) the URI owner (unless the DR was found via  
<link>, in which case that needs to be made evident somehow). Ideally  
this would be true of *all* uses of Link: rel=describedby and site- 
meta - any server using this relation in this way should make sure the  
DR speaks for the URI owner (as opposed to, say, the domain holder).  
Maybe this should be documented in the published description of  
describedby, or in the Link: RFC, since not everyone reading those  
will necessarily be aware of the resource discovery protocol. Or maybe  
I've convinced you to pick option 1... but that leaves me short of the  
kind of communication channel I'd like to have.

Obviously I'm extemporizing here, so please take the preceding as  
brainstorming and not criticism or demand.

Best
Jonathan
Received on Thursday, 29 January 2009 17:30:21 UTC