[schemeProtocols-49] Notes on discussions at F2F and 21 June 2005 Telcon from noah_mendelsohn@us.ibm.com on 2005-06-28 (www-tag@w3.org from June 2005)

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 28 Jun 2005 12:15:22 -0400
To: www-tag@w3.org
Message-ID: <OFDDD28860.5F56B932-ON8525702E.00535889-8525702E.00594CEE@lotus.com>

I see that we've scheduled more discussion of schemeProtocols-49 for later
this afternoon. Since I was on the road at the schema meetings last week,
there hasn't been formal progress on the drafts, but I thought it might be
helpful to set down an outline of the concerns and ideas that I picked up
at the F2F and on last week's call. We won't want to discuss all of these
today, but we can pick and choose some if you like.

One other general note before proceeding to the details below: I
undertook this finding in part because I'm not expert in all the nuances
of this area. My intention was to become more expert, while taking a
novice's view of what's confusing and what should be clarified. Getting a
clear story is proving more difficult than I had expected, either because
this work would be better done by someone else, or because I really am
stumbling onto significant points of confusion on which the experts
disagree. For now, I think the best course is for me to prepare a
revised draft. Having text to discuss is usually the best way to get
clarity. If thrashing sets in, we can either drop it or assign someone to
help me. I understand that the TAG has no consensus as to whether a
finding is needed at all, regardless of author, and I'm more than willing
to invest some more time on that basis.

Here are some of the concerns I heard raised at the F2F and last week's
call, along with my preliminary responses:

Scope and Focus of the Finding

Q. Dan asked: is this finding needed? What problem is it solving? Dave
Orchard "wonders what the motivation/problem is that this finding
addresses." [1]
A. A couple of answers:

There is a problem today insofar as certain sorts of content (high quality
multimedia) and delivery mechanisms (some P2P) are incompletely integrated
into the Web experience. My hope is that by writing a finding that says:
"the Web is architected to embrace such richer information content, and
here's good practice for doing the integration", we are making a visible
statement that such integration should happen, as well as suggesting good
practice on the details. For example, we have an opportunity to explore
when http-scheme URIs should be used with P2P delivery systems, whether
the interaction should start with an actual HTTP-protocol retrieval of
some sort of RTSP-like file [2] with a description and/or digital
signature of the media stream, etc.

I think that recent email discussion has shown disagreement among fairly
knowledgeable experts about questions such as: to what degree does a
protocol like HTTP play a definitional role in the namespace for the http
scheme? [3,4] To what degree is it appropriate for there to be such a
definitional role when inventing new schemes? I think there is value in
having a clean story to tell about the architecture and how its
foundational elements work, even before users are actually tripping over
the confusion.

Q. OK, if those are the goals, why doesn't the draft state them well?
Why isn't the technical discussion more obviously connected to the
goals?
A. Good points. I'll need to address them in future revisions. To some
degree those shortcomings were a result of taking a snapshot in time for
the F2F, as I intended to make the connections clearer in sections yet to
be written.

Q. (from Henry) Shouldn't the trust and social issues relating to scheme
& protocol deployment (see technical discussion below) be beyond the scope
of the schemeProtocols-49 issue and draft finding?
A. I don't think so. Those concerns are an important aspect of teaching
people good practice in relating the use of URI schemes to deployment
using protocols. As suggested on the call last week, I'd like to try to
include coverage of this in a future draft. We can always pull it out.
See technical and architectural concerns below.

Suggestion (from Dave Orchard): The architecture document benefitted
from leading with stories about what can go right and wrong. This draft
might benefit from the same approach, e.g. exploring BitTorrent.
A. Good idea. I'll try it in future drafts.

Technical and Architectural Concerns

Here's a summary of some of the technical and architectural issues on
which I'm focussing as a result of our discussions:

Q. To what degree can or should a particular protocol play a definitional
role regarding the association of resources to URIs in a particular
scheme?
A. I think that everyone agrees that when inventing a new URI scheme, one
has a great deal of latitude in defining the association of URIs in that
scheme to resources. So at least in principle, one could define the
resources to be those served by a particular protocol. Note, however,
that protocols tend to evolve over time. If a scheme's information space
is defined in terms of a protocol, consideration should be given to
possible evolution of the protocol.

Q. To what degree does HTTP in particular play such a definitional role
for the http URI scheme?
A. Here I'm not 100% sure I understood all the answers I got, and mail
like [3,4] leads me to believe that some of my confusion may trace to
disagreement among those who are offering answers. Getting to the bottom
of that disagreement seems like a useful exercise. Here's some of what
I've learned recently:

Tim Berners-Lee and Roy reminded me that HTTP is now a family of
protocols, currently including HTTP 1.0 and HTTP 1.1. The family is
potentially extensible with other perhaps radically different protocols to
be named HTTP x.y in the future. Thus, there are at least two
subquestions: (a) Is there something about the whole family of HTTP
protocols that plays a distinguished role with respect to the definition
or use of the http scheme? (b) What is the story about the role or
responsibility of each particular protocol, such as HTTP 1.1? Roy made
the point that the information space is indeed the same for HTTP 1.0, 1.1,
etc, and in that sense the space is not tied to any one of the protocols.
Conversely, although the HTTPS protocol is manifestly a tunneled version
of the HTTP protocol, the https scheme by definition implements what is
architecturally a separate information space, because the scheme name is
different.

Tim acknowledged that there's been some architectural evolution over time:
schemes and protocols used to be more 1-to-1, but then we needed HTTP
1.1, etc. Noah thinks that explaining such architectural evolution may be
useful in the finding.
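
One way to make Roy's point concrete: an http-scheme URI carries no
protocol version at all, whereas the scheme name is part of the identifier
itself. The Python sketch below uses the standard urllib.parse module. The
helper names same_information_space and request_line and the example.org
URIs are hypothetical, and comparing scheme names is only an illustration
of the "different scheme name, different space" rule, not a URI-equivalence
algorithm.

```python
from urllib.parse import urlsplit


def same_information_space(uri_a: str, uri_b: str) -> bool:
    """Illustrative rule only: the scheme name decides the space."""
    # urlsplit() already lowercases the scheme, so HTTP:// equals http://.
    return urlsplit(uri_a).scheme == urlsplit(uri_b).scheme


def request_line(uri: str, version: str) -> str:
    """Build a GET request line; the version is the client's choice,
    since nothing in the URI names a protocol version."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/{version}"


if __name__ == "__main__":
    uri = "http://example.org/report?year=2005"
    # The same identifier can be requested with either protocol version.
    print(request_line(uri, "1.0"))
    print(request_line(uri, "1.1"))
    # http and https name architecturally separate spaces.
    print(same_information_space(uri, "https://example.org/report?year=2005"))
```

The version appears only in what goes over the wire, which is one way of
saying that the space is not tied to any one member of the protocol family.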

Q. Might it ever be appropriate to serve resources named with the http
scheme using some protocol that is radically different from HTTP 1.x, a
P2P protocol for example?
A. There seem to be several angles to explore:

As discussed above, there are trust and social issues relating to one's
confidence in a retrieval. In the case of HTTP 1.x, we are relying on the
integrity of DNS (e.g. no polluted caches) as well as the good intentions
of the server administrators, etc. With a P2P protocol, we may be served
representations put into the system by many parties.

We should probably document at least one approach that has proven useful
in certain cases, i.e. to do an initial interaction using the HTTP 1.x
protocol to retrieve some sort of RTSP-like description document [2]. The
media type and content of the retrieved representation then instruct the
client on how to retrieve the actual stream or content to be presented. I
believe that the Real system does this, at least in certain cases. Note
that this approach can in principle provide a digital signature to ensure
that content eventually retrieved from a P2P system is in fact what was
originally supplied for the resource, which is one way to attack the
social/trust issues raised above.

The above approach has the drawback of requiring an actual HTTP-protocol
interaction as a preliminary to using the P2P, streaming, or other
protocol to retrieve content. We should consider whether it is
appropriate, and if so under what constraints, to directly serve an
http-scheme resource using a non-HTTP protocol.

New protocols such as P2P could be designated as HTTP x.y (see the next
item below).
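
The description-document approach mentioned in this answer can be sketched
roughly as follows, in Python. Everything named here is an assumption made
for illustration: the JSON description format, its "sha256" and "peers"
fields, and the fetch_from_peer callback are hypothetical and are not taken
from RTSP, the Real system, or any actual P2P protocol. A plain SHA-256
digest stands in for a real digital signature, so it helps only if the
description itself came from a source one trusts (e.g. the initial HTTP 1.x
retrieval).

```python
import hashlib
import json
from typing import Callable, Optional


def parse_description(document: bytes) -> dict:
    """Parse a hypothetical JSON description retrieved over HTTP 1.x."""
    description = json.loads(document.decode("utf-8"))
    for field in ("sha256", "peers"):
        if field not in description:
            raise ValueError(f"description is missing {field!r}")
    return description


def retrieve_verified(
    description: dict,
    fetch_from_peer: Callable[[str], bytes],
) -> Optional[bytes]:
    """Try each listed peer in order; return the first content whose
    digest matches the description, or None if no peer supplies it."""
    expected = description["sha256"]
    for peer in description["peers"]:
        content = fetch_from_peer(peer)
        if hashlib.sha256(content).hexdigest() == expected:
            return content
    return None


if __name__ == "__main__":
    original = b"media stream bytes"
    # Stand-in for the document the origin server would publish.
    doc = json.dumps({
        "sha256": hashlib.sha256(original).hexdigest(),
        "peers": ["peer-a", "peer-b"],
    }).encode("utf-8")
    # Stand-in for the P2P network: peer-a serves altered content.
    fake_peers = {"peer-a": b"tampered bytes", "peer-b": original}
    result = retrieve_verified(parse_description(doc), fake_peers.__getitem__)
    print(result == original)
```

In this sketch the content from peer-a is rejected because its digest does
not match, which is the sense in which the preliminary HTTP interaction can
address the trust concern about representations supplied by many parties.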

Suggestion: You need to make clearer the long term value of the
http-scheme information space.
A. The http-scheme information space has great value. Pages around the
world are deployed with links using that space. As Tim pointed out, one
of our options is to eventually designate some particular new protocols,
perhaps a P2P protocol, as HTTP x.y. When we deploy new protocols to
support that information space, many of those links will continue to work
(and Noah thinks that exactly this sort of reasoning is what we have to
explain in a finding...why exactly does naming the protocol HTTP x.y make
it more likely that such links will resolve? We seem to be connecting the
family of protocols to the scheme. If so, then somewhere in the
architecture we should say that.)

Q. (from Henry) The finding talks a lot about gateways. Are they in fact
widely used?
A. Yes.

Q. (from Noah) Are operations such as GET/POST inherent in the scheme or
the protocol?
A. (from various) No consensus. (Noah thinks this is another area where
getting a straight story to tell in a finding might be useful. Then, if
we want to serve http-scheme resources using some P2P protocol, we'll have
a clean story on whether that protocol needs to support GET/POST with
HTTP-protocol semantics.)

I hope the above at least convinces everyone that I got a lot of useful
feedback from our recent discussions. When I return from vacation, I'll
try and factor the main points into a revised draft. Thank you.

Noah

[1]
http://www.w3.org/2001/tag/2005/06/14-16-minutes.html#item013
[2] http://www.rtsp.org/
[3]
http://lists.w3.org/Archives/Public/www-tag/2005Jun/0027.html
[4] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0036.html

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Received on Tuesday, 28 June 2005 16:54:01 UTC