Re: [hybi] [Uri-review] ws: and wss: schemes

> Jamie Lokier wrote:
>> An HTTP URL does not tell you the type of resource, only where to find
>> _a_ resource.  For example there are places where a user can enter the
>> URL of a CalDAV calendar resource.  The CalDAV protocol is used (over
>> HTTP) to work with that resource, but the URL doesn't say what it is.
>> 
>> The only difference with WebSockets is that it (so far) seems to avoid
>> any descriptive metadata, which means there will still be applications
>> which ask for a WebSockets URL, but when the URL is for a different
>> protocol on top, it'll simply break with undefined behaviour instead
>> of a clean error message or fallback behaviour.
>> 
>> It doesn't matter if you think nobody should do that.  It will still
>> be done anyway - because it's so obviously useful.

Křištof Želechovski wrote:
>  1.  Maybe it is just me but I cannot see how breaking with undefined
> behavior could be obviously useful.  Undefined behavior might as well amount
> to accidentally starting WW III.

You misunderstood; sorry, I must have been unclear.

*Breaking with undefined behaviour* is not useful.

Being able to connect to user-specified resources is very useful, even
if there is no way to be sure the resource speaks the right protocol
other than simply trying it.

Think about it:

   - Every GUI email application lets you say which server and port to
     use for POP3, IMAP and SMTP.
   - Every blog and picture posting application lets you say which URL
     to use for posting, even letting you use non-standard ports.
   - Every WebDAV remote file manager lets you say which HTTP URL to
     connect to for WebDAV service.
   - Every CalDAV calendar application lets you say which HTTP URL to
     connect to for calendar service.

In every case the user can enter bogus information, and undefined
behaviour follows.  But the user's ability to enter it is what makes
those applications useful; that benefit far outweighs the problems.

We like to avoid undefined behaviour if possible, so it's _better_ to
have protocols with some basic, de facto standardised self-checking
before proceeding with something dangerous.  HTTP is good at this when
methods or unusual headers are used for checking, and bad at this when
the protocol consists of just GET or POST.

The WebSockets philosophy seems to be to check for WebSockets itself,
but then to sidestep the problem at the next level, passing it on to
the application to do its own initial handshake.  That has the same
problem as raw TCP: there is no _standard_ negotiation handshake.
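
So every application ends up inventing its own first-message check.  A
rough sketch of the sort of ad-hoc, in-band handshake that results -
the URL and message format are made up, and it assumes a Python client
built on the third-party websocket-client package:

    # Each application has to define its own "hello" check like this.
    import json
    import websocket

    ws = websocket.create_connection("ws://example.net/updates")

    # Our private hello: name the application protocol and version.
    ws.send(json.dumps({"hello": "example-app", "version": 1}))

    reply = json.loads(ws.recv())
    if reply.get("hello") != "example-app":
        # Wrong service behind the URL: all we can do is give up
        # cleanly, and only because *we* chose to define this check.
        ws.close()
        raise SystemExit("Not the protocol we expected: %r" % reply)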

You can be absolutely sure that as soon as there are useful web
applications using WebSockets, there will be non-browser clients
speaking the same protocols, and the servers won't even know.

For example, imagine a real-time web-based map which shows vehicles
moving around, using WebSockets to get real-time position updates.

It won't be long before an unrelated third party writes a non-browser
app which queries the server to get the same information, and that
non-browser app will have to implement the same WebSockets
protocol.
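
Nothing about that is browser-specific.  A non-browser consumer of
such a feed could be as small as the sketch below (URL and message
fields are invented; same websocket-client assumption as before):

    # Hypothetical non-browser consumer of the map's real-time feed.
    import json
    import websocket

    ws = websocket.create_connection("ws://maps.example.org/live/vehicles")
    ws.send(json.dumps({"subscribe": "all-vehicles"}))

    for _ in range(10):          # read a handful of updates, then stop
        update = json.loads(ws.recv())
        print(update["vehicle_id"], update["lat"], update["lon"])

    ws.close()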

>  2.  If we do not want spiders to connect to Web sockets, it seems using a
> special URL scheme is a way to prevent this, and therefore it would be
> desirable.

1. If HTTP URLs are used, spiders will do HTTP GET requests, which won't
   be a problem - a WebSockets-only service will simply reject them (see
   the sketch after this list).

2. robots.txt.

3. It will add some load.  It shouldn't be much, because
   WebSockets-only URLs would tend to appear in Javascript, which most
   spiders don't follow.  But remember that some spiders are
   unfriendly.
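
To make point 1 concrete, here is a rough server-side sketch of why a
plain GET is harmless to a WebSockets-only service.  It uses Python's
standard http.server purely for illustration; a real server would go
on to complete the upgrade rather than stop there:

    # The WebSockets handshake is easy to tell apart from ordinary HTTP.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class WebSocketsOnly(BaseHTTPRequestHandler):
        def do_GET(self):
            upgrade = self.headers.get("Upgrade", "").lower()
            if upgrade != "websocket":
                # A crawler (or any ordinary HTTP client) ends up here.
                self.send_error(426, "This resource only speaks WebSockets")
                return
            # ...real handshake / upgrade handling would continue here...

    if __name__ == "__main__":
        HTTPServer(("", 8080), WebSocketsOnly).serve_forever()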

So I think that's not a strong argument either for or against a
different URL scheme.

-- Jamie

Received on Sunday, 16 August 2009 17:50:03 UTC