Re: [hybi] [Uri-review] ws: and wss: schemes from noah_mendelsohn@us.ibm.com on 2009-08-20 (uri@w3.org from August 2009)

From: <noah_mendelsohn@us.ibm.com>
Date: Thu, 20 Aug 2009 12:32:06 -0400
To: Jamie Lokier <jamie@shareable.org>, Ian Hickson <ian@hixie.ch>
Cc: hybi@ietf.org, Mark Nottingham <mnot@mnot.net>, URI <uri@w3.org>
Message-ID: <OF8D5BA760.E7D4B460-ON85257618.0058E88A-85257618.005AD481@lotus.com>
Jamie Lokier writes:

> However it does begin with an HTTP connection and an HTTP request, to
> a server which may be serving other HTTP resources on the same port.

I think the use case that makes an http scheme potentially interesting is:

Some application is using these ws: or wss: URIs.  The URIs wind up in 
documents or in other places where they can be discovered, perhaps because 
that is how the application keeps track of or advertises various web 
socket endpoints.  Now some other agent, perhaps a search crawler, comes 
upon the documents and finds the references.  If they are http-scheme 
URIs, then the crawler will know how to dereference them.  Of course, this 
agent will not go through the upgrade protocol, and will attempt an 
ordinary HTTP GET.  If the server (i.e., the same software that would have 
processed the Web socket upgrade) chooses to, it can respond with metadata 
explaining that this is a Websocket resource, perhaps providing 
information about its purpose and correct use, etc.  The means for 
returning that metadata might be along the lines of [1].  That 
discoverability seems to have some value. 

Interestingly, the fact that Websockets uses an HTTP-compatible initial 
handshake makes this achievable in practice, I think.
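
As a rough illustration of that dispatch, here is a minimal Python sketch
of how a server might tell a Web Socket handshake apart from an ordinary
GET sent by a crawler.  The function name classify_request and the labels
it returns are made up for illustration, and the header check (Upgrade:
WebSocket together with Connection: Upgrade) is an assumption based on the
draft handshake, not a description of what any particular server does.

```python
def classify_request(method, headers):
    """Label a request 'upgrade', 'metadata' or 'reject' (hypothetical labels)."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}

    # Connection may carry several comma-separated tokens.
    connection_tokens = [t.strip().lower() for t in h.get("connection", "").split(",")]

    # Assumption: a handshake is a GET asking to upgrade to "WebSocket".
    wants_upgrade = (
        method == "GET"
        and h.get("upgrade", "").lower() == "websocket"
        and "upgrade" in connection_tokens
    )
    if wants_upgrade:
        return "upgrade"

    # A crawler or other legacy client: answer with metadata about the endpoint.
    if method in ("GET", "HEAD"):
        return "metadata"

    return "reject"
```

A request labelled "metadata" would be answered with whatever description
the server chooses to publish, for example along the lines of [1].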

Ian Hickson wrote (earlier in this thread):

> On Wed, 12 Aug 2009 noah_mendelsohn@us.ibm.com wrote:
> > 
> > So, here's an example.  First, let's make the assumption that there is
> > an HTTP server at port 80 at "http://wss.example/", presumably run by
> > an organization that supports the use of wss.  Assuming that the normal
> > path through the Web sockets client apis does not access this, the HTTP
> > server will be used only by legacy clients.
> >
> > Where's the value?  Let's assume that a link to a WS resource winds up
> > in a page somewhere for some reason.  It could be a bug report,
> > whatever. Now a search engine crawler stumbles on the bug report page.
> > If we use the wss: scheme, then either the crawler has special
> > knowledge of WS, or nothing much useful happens.  If we use
> > "http://wss.example/....." then the crawler sends a GET to that.
> > Choose your favorite metadata access mechanism (perhaps [1], maybe
> > RDFa, whatever), and the crawler has the opportunity to discover "ah,
> > this is a WS resource", or at least to learn some things about it.  To
> > some extent that's true with either approach (the crawler at least
> > knows it's got a link in a scheme that's not understood with wss:), but
> > the opportunities for incremental discovery seem to be significantly
> > greater with HTTP.
> >
> > As Dave Orchard points out, these issues were debated in great detail
> > when XRI came up for consideration at Oasis, and I think it's fair to
> > say that the starting position of those proposing xri was initially at
> > least as firm as that of advocates of wss.  I think Dave is right that
> > at least many of those same people came to believe that an http-based
> > approach was in fact either better, or at least a reasonable
> > compromise.  You might want to check with them.
> 
> Do you believe that the above applies to Web Socket protocol connections
> more than it does to telnet or SSH connections? If so, why?

In principle, probably no, or mostly no; the cases are indeed quite 
similar.  It would be nice to have a uniform protocol for a user agent to 
at least find out >something< about such URIs.  That said, ftp and telnet 
have been widely deployed since the early days of the Web.  To some 
degree, those user agents that want to interact with or learn about ftp or 
telnet resources tend to be coded to understand them.

Websockets seem different in two ways, one of which is more or less a 
happy accident:

1) Websockets are new, so there's an awful lot of software out there that 
will just not recognize the new schemes at all.
2) (happy accident) I suspect the choice of an HTTP-compatible handshake 
was made to get through firewalls, but I think it also facilitates the use 
of HTTP for metadata discovery for web socket resources.

I do see both sides of the "use ws:" vs "use http:" question.  Web socket 
endpoints are not documents (or "Information Resources" [2] in the 
language of AWWW), suggesting that new schemes are indeed appropriate; 
using http://websockets.org or the like is indeed a bit tricky and 
confusing.  HTTP is at best usable for finding metadata about the Web 
socket.  So, I think there are advantages both ways.

FWIW:  the case has been made that ws: and wss: scheme URIs will mostly 
appear in code that is private to individual applications, as opposed to 
being widely published on the Web.  That being the case, I wonder whether 
it would be better to leave these convenient short names free for future 
use, and to choose something a bit longer for Web sockets?  In general, it 
seems a good thing to allocate short names beginning with the letter "w" 
with a bit of care, given that things on the Web tend to be called "Web 
xxxx" (in this case, "Web Sockets").

Noah

[1] http://tools.ietf.org/html/draft-hammer-discovery-03
[2] http://www.w3.org/TR/webarch/#def-information-resource

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------








Jamie Lokier <jamie@shareable.org>
Sent by: uri-request@w3.org
08/19/2009 08:59 PM
 
        To:     Mark Nottingham <mnot@mnot.net>
        cc:     URI <uri@w3.org>, hybi@ietf.org, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Re: [hybi] [Uri-review] ws: and wss: schemes


Fwiw, I agree with Mark that we should not be shy about using new
schemes for new protocols that are not HTTP.

Only things which run _over_ HTTP should be using HTTP URIs, and
WebSockets does not run over HTTP... but:

Mark Nottingham wrote:
> WebSockets is defining a protocol, and so is a very different case. It 
> is definitely not HTTP.

That's correct, WebSockets is not HTTP.

However it does begin with an HTTP connection and an HTTP request, to
a server which may be serving other HTTP resources on the same port.

Quite possibly over port 80, tripping over HTTP proxies in the process.

So it's not really the same as telnet, FTP etc.

Three notable things for which there is no consensus(*) are:

  1. Does WebSockets really have to use a fresh, new TCP
     connection for every individual instance of the WebSockets object
     created on every web page in a browser to the same site?  Or
     could it upgrade an HTTP connection which has already been used
     for something else?

  2. Since it begins with an HTTP request for upgrade to WebSockets,
     is there any reason for that request not to be to an arbitrary
     HTTP host+port+path by simply trying the upgrade method on the
     appropriate HTTP path at the server?

  3. If a WebSockets connection cannot be established, clients
     will(**) fall back to an equivalent (but bulkier/slower) protocol
     which uses HTTP only.  What URI will they use in that case?  Do
     we leave it to individual application/framework designers, or do
     we suggest a common strategy?

* -  Although it has been asked if the protocol is a "politically agreed"
     foregone conclusion (due to an earlier message) and therefore if
     we are wasting our time with design discussions.

** - As in, even if the spec does not provide this and browsers do not
     provide it, there will certainly be Javascript
     WebSockets-emulation modules and corresponding server-sides which
     do.  We can choose to ignore this and leave the mapping from
     WebSockets URI to HTTP-fallback URI unspecified, or recommend
     a mapping, or they can simply be the same URI given point 2.
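
One possible reading of "a mapping", sketched in Python under the
assumption that the fallback reuses the same host, port, path and query
(as in point 2) and only swaps the scheme.  The name fallback_uri and the
ws -> http, wss -> https table are hypothetical; nothing in the drafts
specifies them.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed scheme mapping for the HTTP-only fallback (not from any spec).
_FALLBACK_SCHEME = {"ws": "http", "wss": "https"}


def fallback_uri(ws_uri):
    """Map a ws:/wss: URI to the HTTP(S) URI with the same host, port and path."""
    parts = urlsplit(ws_uri)
    if parts.scheme not in _FALLBACK_SCHEME:
        raise ValueError("not a ws: or wss: URI: %r" % ws_uri)
    # Keep netloc, path, query and fragment untouched; only the scheme changes.
    return urlunsplit(parts._replace(scheme=_FALLBACK_SCHEME[parts.scheme]))
```

With this mapping, a client that fails to connect to
ws://example.com/chat would retry its fallback protocol against
http://example.com/chat.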

-- Jamie
Received on Thursday, 20 August 2009 16:32:55 UTC