[schemeProtocols-49] Notes on discussions at F2F and 21 June 2005 Telcon

I see that we've scheduled more discussion of schemeProtocols-49 for later 
this afternoon.  Since I was on the road at the schema meetings last week, 
there hasn't been formal progress on the drafts, but I thought it might be 
helpful to set down an outline of the concerns and ideas that I picked up 
at the F2F and on last week's call.  We won't want to discuss all of these 
today, but we can pick and choose some if you like.

One other general note before proceeding to the details below:  I 
undertook this finding in part because I'm not expert in all the nuances 
of this area.  My intention was to become more expert, while taking a 
novice's view of what's confusing and what should be clarified.  Getting a 
clear story is proving more difficult than I had expected, either because 
this work would be better done by someone else, or because I really am 
stumbling onto significant points of confusion on which the experts 
disagree.   For now, I think the best course is for me to prepare a 
revised draft.  Having text to discuss is usually the best way to get 
clarity.  If thrashing sets in, we can either drop it or assign someone to 
help me.  I understand that the TAG has no consensus as to whether a 
finding is needed at all, regardless of author, and I'm more than willing 
to invest some more time on that basis. 

Here are some of the concerns I heard raised at the F2F and last week's 
call, along with my preliminary responses: 

Scope and Focus of the Finding

Q. Dan asked: is this finding needed?  What problem is it solving?   Dave 
Orchard "wonders what the motivation/problem is that this finding 
addresses." [1] 
A. A couple of answers:

There is a problem today insofar as certain sorts of content (high quality 
multimedia) and delivery mechanisms (some P2P) are incompletely integrated 
into the Web experience.  My hope is that by writing a finding that says: 
"the Web is architected to embrace such richer information content, and 
here's good practice for doing the integration", we are making a visible 
statement that such integration should happen, as well as suggesting good 
practice on the details.  For example, we have an opportunity to explore 
when http-scheme URIs should be used with P2P delivery systems, whether 
the interaction should start with an actual HTTP-protocol retrieval of 
some sort of RTSP-like file [2] with a description and/or digital 
signature of the media stream, etc.

I think that recent email discussion has shown disagreement among fairly 
knowledgeable experts about questions such as:  to what degree does a 
protocol like HTTP play a definitional role in the namespace for the http 
scheme? [3,4]  To what degree is it appropriate for there to be such a 
definitional role when inventing new schemes?  I think there is value in 
having a clean story to tell about the architecture and how its 
foundational elements work, even before users are actually tripping over 
the confusion.

Q.  OK, if those are the goals, why doesn't the draft state them 
well? Why isn't the technical discussion more obviously connected to the 
goals?
A.  Good points.  I'll need to address them in future revisions.  To some 
degree those shortcomings were a result of taking a snapshot in time for 
the F2F, as I intended to make the connections clearer in sections yet to 
be written.

Q. (from Henry)  Shouldn't the trust and social issues relating to scheme 
& protocol deployment (see technical discussion below) be beyond the scope 
of the schemeProtocols-49 issue and draft finding?
A.  I don't think so.  Those concerns are an important aspect of teaching 
people good practice in relating the use of URI schemes to deployment 
using protocols.  As suggested on the call last week, I'd like to try to 
include coverage of this in a future draft.  We can always pull it out. 
See technical and architectural concerns below.

Suggestion (from Dave Orchard):   The architecture document benefitted 
from leading with stories about what can go right and wrong.  This draft 
might benefit from the same approach, e.g. exploring BitTorrent.
A.  Good idea.  I'll try it in future drafts.

Technical and Architectural Concerns

Here's a summary of some of the technical and architectural issues on 
which I'm focussing as a result of our discussions:

Q.  To what degree can or should a particular protocol play a definitional 
role regarding the association of resources to URIs in a particular 
scheme?
A.  I think that everyone agrees that when inventing a new URI scheme, one 
has a great deal of latitude in defining the association of URIs in that 
scheme to resources.  So at least in principle, one could define the 
resources to be those served by a particular protocol.  Note, however, 
that protocols tend to evolve over time.  If a scheme's information space 
is defined in terms of a protocol, consideration should be given to 
possible evolution of the protocol.

Q.  To what degree does HTTP in particular play such a definitional role 
for the http URI scheme?
A.  Here I'm not 100% sure I understood all the answers I got, and mail 
like [3,4] leads me to believe that some of my confusion may trace to 
disagreement among those who are offering answers.  Getting to the bottom 
of that disagreement seems like a useful exercise.  Here's some of what 
I've learned recently:
Tim Berners-Lee and Roy reminded me that HTTP is now a family of 
protocols, currently including HTTP 1.0 and HTTP 1.1.  The family is 
potentially extensible with other perhaps radically different protocols to 
be named HTTP x.y in the future.   Thus, there are at least two 
subquestions: (a) Is there something about the whole family of HTTP 
protocols that plays a distinguished role with respect to the definition 
or use of the http scheme?  (b) What is the story about the role or 
responsibility of each particular protocol, such as HTTP 1.1?  Roy made 
the point that the information space is indeed the same for HTTP 1.0, 1.1, 
etc, and in that sense the space is not tied to any one of the protocols. 
Conversely, although the HTTPS protocol is manifestly a tunneled version 
of the HTTP protocol, the https scheme by definition implements what is 
architecturally a separate information space, because the scheme name is 
different. 
Tim acknowledged that there's been some architectural evolution over time: 
 schemes and protocols used to be more 1-to-1, but then we needed HTTP 
1.1, etc.  Noah thinks that explaining such architectural evolution may be 
useful in the finding. 
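
To make Roy's point concrete, here is a toy sketch in Python.  It is only a model under stated assumptions: the example.org URIs, the RESOURCES table and the identify() function are all made up for illustration and come from no specification.  It shows the lookup depending on the URI alone, so that HTTP 1.0 and HTTP 1.1 address the same space, while the https URI names something separate.

```python
# Hypothetical model: a resource is identified by its URI alone.
RESOURCES = {
    "http://example.org/report": "resource in the http space",
    "https://example.org/report": "resource in the https space",
}

# Members of the HTTP protocol family mentioned in the discussion.
HTTP_FAMILY = ("HTTP/1.0", "HTTP/1.1")


def identify(uri: str, protocol_version: str) -> str:
    """Return the resource a URI names; the version is checked but never used as a key."""
    if protocol_version not in HTTP_FAMILY:
        raise ValueError(f"not modelled in this sketch: {protocol_version}")
    return RESOURCES[uri]


# Same URI, different family members: same resource.
same_space = (identify("http://example.org/report", "HTTP/1.0")
              == identify("http://example.org/report", "HTTP/1.1"))

# Different scheme name: architecturally a different resource.
separate_space = (identify("http://example.org/report", "HTTP/1.1")
                  != identify("https://example.org/report", "HTTP/1.1"))
```

Whether a future HTTP x.y would simply join HTTP_FAMILY in a model like this is exactly the open question above.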

Q.  Might it ever be appropriate to serve resources named with the http 
scheme using some protocol that is radically different from HTTP 1.x, a 
P2P protocol for example?
A.  There seem to be several angles to explore:
As discussed above, there are trust and social issues relating to one's 
confidence in a retrieval.  In the case of HTTP 1.x, we are relying on the 
integrity of DNS (e.g. no polluted caches) as well as the good intentions 
of the server administrators, etc.  With a P2P protocol, we may be served 
representations put into the system by many parties.
We should probably document at least one approach that has proven useful 
in certain cases, i.e. to do an initial interaction using the HTTP 1.x 
protocol to retrieve some sort of RTSP-like description document [2].  The 
media type and content of the retrieved representation then instruct the 
client on how to retrieve the actual stream or content to be presented. I 
believe that the Real system does this, at least in certain cases.  Note 
that this approach can in principle provide a digital signature to ensure 
that content eventually retrieved from a P2P system is in fact what was 
originally supplied for the resource, which is one way to attack the 
social/trust issues raised above. 
The above approach has the drawback of requiring an actual HTTP-protocol 
interaction as a preliminary to using the P2P, streaming, or other 
protocol to retrieve content.  We should consider whether it is 
appropriate, and if so under what constraints, to directly serve an 
http-scheme resource using a non-HTTP protocol.
New protocols such as P2P could be designated as HTTP x.y (see immediately 
below).
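
The description-document approach mentioned above can be sketched roughly as follows.  This is a minimal illustration under several assumptions, not a description of how Real or any P2P system actually works: the JSON layout, the field names "sha256" and "locations", the peer names and both function names are hypothetical.  It also substitutes a plain SHA-256 digest for a full digital signature, so it only helps if the description document itself was retrieved from a trusted source (e.g. over HTTP 1.x from the origin server).

```python
import hashlib
import json


def make_description(content: bytes, peer_locations: list[str]) -> str:
    """Build a hypothetical description document for one resource."""
    return json.dumps({
        "sha256": hashlib.sha256(content).hexdigest(),
        "locations": peer_locations,
    })


def retrieve_verified(description_json: str, fetch_from_peer) -> bytes:
    """Try each listed location; return the first payload matching the digest."""
    description = json.loads(description_json)
    for location in description["locations"]:
        payload = fetch_from_peer(location)
        if payload is None:
            continue  # this peer had nothing to offer
        if hashlib.sha256(payload).hexdigest() == description["sha256"]:
            return payload
    raise ValueError("no peer supplied content matching the published digest")


# Usage: the dict stands in for a P2P network; one peer serves altered bytes.
original = b"media stream bytes"
peers = {
    "peer-a": b"tampered bytes",
    "peer-b": original,
}
description = make_description(original, list(peers))
result = retrieve_verified(description, peers.get)
```

Here the altered copy from peer-a is rejected and the copy from peer-b is accepted, which is the sense in which the initial HTTP interaction can address the trust concern.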

Suggestion: You need to make clearer the long term value of the 
http-scheme information space.
A: The http-scheme information space has great value.  Pages around the 
world are deployed with links using that space.  As Tim pointed out, one 
of our options is to eventually designate some particular new protocols, 
perhaps a P2P protocol, as HTTP x.y.  When we deploy new protocols to 
support that information space, many of those links will continue to work 
(and Noah thinks that exactly this sort of reasoning is what we have to 
explain in a finding...why exactly does naming the protocol HTTP x.y make 
it more likely that such links will resolve?  We seem to be connecting the 
family of protocols to the scheme.  If so, then somewhere in the 
architecture we should say that.) 
 
Q. (from Henry) The finding talks a lot about gateways.  Are they in fact 
widely used?
A. Yes.

Q (from Noah) Are operations such as GET/POST inherent in the scheme or 
the protocol?
A. (from various) No consensus.  (Noah thinks this is another area where 
getting a straight story to tell in a finding might be useful.  Then, if 
we want to serve http-scheme resources using some P2P protocol, we'll have 
a clean story on whether that protocol needs to support GET/POST with 
HTTP-protocol semantics.)

I hope the above at least convinces everyone that I got a lot of useful 
feedback from our recent discussions.  When I return from vacation, I'll 
try and factor the main points into a revised draft.  Thank you.

Noah

[1] http://www.w3.org/2001/tag/2005/06/14-16-minutes.html#item013
[2] http://www.rtsp.org/
[3] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0027.html
[4] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0036.html

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Tuesday, 28 June 2005 16:54:01 UTC