Re: [whatwg/fetch] Editorial: make WebSocket use obtain a connection (#1241)

@ricea commented on this pull request.



> @@ -7425,32 +7417,6 @@ fetch("https://www.example.com/")
 </div>
 
 
-<h3 id=websocket-connections>Connections</h3>
-
-<p>To <dfn id=concept-websocket-connection-obtain>obtain a WebSocket connection</dfn>, given a
-<var>url</var>, run these steps:
-
-<ol>
- <li><p>Let <var ignore>host</var> be <var>url</var>'s <a for=url>host</a>.
-
- <li><p>Let <var ignore>port</var> be <var>url</var>'s <a for=url>port</a>.
-
- <li><p>Let <var ignore>secure</var> be false, if <var>url</var>'s <a for=url>scheme</a> is
- "<code>http</code>", and true otherwise.
-
- <li><p>Follow the requirements stated in step 2 to 5, inclusive, of the first set of steps in
- <a href=http://tools.ietf.org/html/rfc6455#section-4.1>section 4.1</a> of The WebSocket Protocol

> @ricea Could you expand on the following comment?
> 
> > Dealing with proxies is awful and Chromium does it completely wrong.

The problem at hand is the following text from RFC 6455:

>        If the client cannot determine the IP address of the remote host
>        (for example, because all communication is being done through a
>        proxy server that performs DNS queries itself), then the client
>        MUST assume for the purposes of this step that each host name
>        refers to a distinct remote host, and instead the client SHOULD
>        limit the total number of simultaneous pending connections to a
>        reasonably low number (e.g., the client might allow simultaneous
>        pending connections to a.example.com and b.example.com, but if
>        thirty simultaneous connections to a single host are requested,
>        that may not be allowed).  

Problems:

1. On most networks we *can* determine the IP address, even if we're behind a proxy. However, for privacy reasons we shouldn't look up the IP address unless we already looked it up in the process of determining the proxy to use (i.e. for proxy.pac). But AFAIK Chromium has no concept of "I already looked up this IP address", nor of "check the cache for this IP address, but don't hit the network".
2. Does `MUST assume ... that each host name refers to a distinct remote host` imply that we throttle by _hostname:port_ the way we usually throttle by _ip:port_? That's what I've been assuming up until now, but the rest of the sentence may imply a completely different throttling strategy.
3. What Chromium actually does is throttle by the IP address of the proxy server. This is completely wrong, but causes surprisingly few problems in practice.
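
To make the reading in point 2 concrete, here is a rough sketch of one possible interpretation, not what Chromium does: at most one pending connection per _ip:port_ when the IP is known, otherwise one per _hostname:port_ plus a cap on the total number of pending name-keyed connections. The names `throttle_key` and `PendingThrottle` and the default cap of 6 are made up for illustration; the RFC only says "reasonably low".

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

# (kind, host-or-ip, port); kind is "ip" or "host". Hypothetical representation.
Key = Tuple[str, str, int]


def throttle_key(hostname: str, port: int, resolved_ip: Optional[str]) -> Key:
    """Pick the key under which at most one connection may be pending."""
    if resolved_ip is not None:
        return ("ip", resolved_ip, port)
    # IP unknown (e.g. the proxy does DNS itself): treat each host name
    # as a distinct remote host.
    return ("host", hostname.lower(), port)


@dataclass
class PendingThrottle:
    # Assumed value for illustration; the RFC gives no number.
    max_pending_by_name: int = 6
    _pending: Set[Key] = field(default_factory=set, init=False)

    def try_start(self, hostname: str, port: int,
                  resolved_ip: Optional[str] = None) -> bool:
        """Return True if a new connection attempt may start now."""
        key = throttle_key(hostname, port, resolved_ip)
        if key in self._pending:
            return False  # already one pending connection for this key
        if key[0] == "host":
            by_name = sum(1 for k in self._pending if k[0] == "host")
            if by_name >= self.max_pending_by_name:
                return False  # total cap for the unknown-IP case
        self._pending.add(key)
        return True

    def finish(self, hostname: str, port: int,
               resolved_ip: Optional[str] = None) -> None:
        """Call when the connection is established or has failed."""
        self._pending.discard(throttle_key(hostname, port, resolved_ip))
```

Under this reading, a second attempt to `a.example.com:443` waits while the first is pending, an attempt to `b.example.com:443` proceeds, and further name-keyed attempts are held back once the cap is reached.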

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/fetch/pull/1241#discussion_r639974524

Received on Wednesday, 26 May 2021 17:33:05 UTC