Re: The ever contentious capabilities for new sessions from James Graham on 2016-09-13 (public-browser-tools-testing@w3.org from July to September 2016)

From: James Graham <james@hoppipolla.co.uk>
Date: Tue, 13 Sep 2016 22:14:42 +0100
To: public-browser-tools-testing@w3.org
Message-ID: <99681888-847a-7381-14ae-af40a379c633@hoppipolla.co.uk>
On 12/09/16 21:55, Simon Stewart wrote:
> Hi,
>
> We spend an awful lot of time at F2F sessions on capabilities, but I
> think Jason Leyba's current suggestion nails almost all the points that
> have been raised in meetings and in person:
>
> https://github.com/w3c/webdriver/pull/327
>
> Notably:
>
>   * Address the desire for simple processing of capabilities by end
>     nodes, with exact matches only.
>   * Makes it possible to describe several different possibilities.
>   * Has a different set of blob keys, meaning that the protocol
>     handshake between OSS, the original spec text, and the new spec text
>     can be done unambiguously (esp. if end nodes hold to Postel's Law)
>   * Makes an effort to reduce data being sent across the wire through
>     the use of the "required capabilities" being merged with "first match".
>
> The biggest downside from my point of view is that this is hard to make
> 100% backward compatible with the widespread use of selenium, but we
> could handle iterating over the values on the local end until we ship
> Selenium 4.

I think this proposal looks like an improvement over the existing 
design. However I have a number of concerns:

* I think continuing to describe these as "capabilities" is misleading
because the name implies semantics that are only relevant to a subset of
the features (particularly around browser selection and routing). Things
like timeouts are pure configuration. We should use a more neutral term
like "parameters".

* I presume the point of passing on browser-selection parameters to the 
browser itself is to enable the browser to re-match on these parameters 
to select only the required subset of parameters, without requiring an 
intermediary node to alter the message. But I think the design here has 
two issues. One is that it is not, in general, cheap to tell which 
version of a browser will be run; to do this from a proxy one needs to 
actually launch the browser and parse out the version number string. 
This seems relatively complicated and I would like to avoid it if 
possible. Why do routing intermediaries need special consideration in 
terms of not altering the parameters? The other is that the proposed 
structure seems rather non-general. If I want to specify something large
like a base64-encoded profile in a way that it only appears in the
message once, but where it applies to more than one but not all of the
firstMatch parameter sets, that isn't possible.

* It is unclear to me that hard-failing on unrecognised parameters is 
the most backwards compatible thing. In particular I'm wondering about 
the case where a browser introduces a new parameter related to something
like logging, which is basically always optional. In the scheme
described that would require duplication for no obvious benefit. Having 
said that there is no recursive validation, so it seems one could always 
put browser-specific configuration under a single parameter and 
implement whatever semantics inside that, without violating the letter 
of the spec.

* Some details of the way the spec is set up don't make sense. This is a 
holdover from the existing text, but if we are revamping this section we 
should also fix the major structural issues e.g. the table that has 
normative text that is not actually referenced from any section.

* It may just be that I'm bad at reading the spec as a diff, but it's 
not clear that the algorithm as written actually does the right thing. 
It seems like every option in firstMatch is tried and the value of the
last is used, irrespective of anything. Apologies if I'm horribly
misreading this.

So I think a design similar to this that I would prefer is:

{
"routing": [
   {"browser": "firefox",
    "platform": "linux"},
   {"browser": "firefox"},
   {"browser": "chrome"},
   {}
],
"settings": [
   {"timeouts": {"script": 30000}},
   {"match": {"browser": "firefox", "version": 49},
    "firefoxOptions": {"prefs": {"dom.disable-open-during-load": false}}
   },
   {"match": {"browser": "firefox"},
    "profile": <base64String>
   },
   {"match": {"browser": "chrome"},
    "binary": "/usr/local/chrome"}
]
}

For any intermediary that did routing this would express a preference 
for Firefox on Linux, followed by Firefox on any platform, followed by 
Chrome, followed by anything.
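
As a rough illustration of that preference order, here is a minimal
Python sketch. The select_route helper and the shape of the nodes list
are hypothetical, invented for this example; the only behaviour taken
from the text above is that routing entries are tried in order, each
entry matches exactly on the keys it names, and {} matches anything.

```python
def select_route(routing, nodes):
    """Return (preference, node) for the first routing entry that some
    node satisfies, or None if nothing matches.

    Hypothetical helper: exact equality on every key the entry names.
    """
    for wanted in routing:
        for node in nodes:
            if all(node.get(key) == value for key, value in wanted.items()):
                return wanted, node
    return None


routing = [
    {"browser": "firefox", "platform": "linux"},
    {"browser": "firefox"},
    {"browser": "chrome"},
    {},
]

# Assumed description of what an intermediary has available.
nodes = [
    {"browser": "chrome", "platform": "linux"},
    {"browser": "firefox", "platform": "mac"},
]

# No Firefox-on-Linux node exists, so the second preference wins.
choice = select_route(routing, nodes)
```

With these nodes the Chrome node is listed first, but the Firefox node
is still chosen, because preference order is driven by the routing list
rather than by the order of the nodes.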

For browser settings, the options that match would be cumulative, so any
browser would set the script timeout to 30s, any Firefox would use the
same base profile, Firefox 49 would set a specific pref, and Chrome
would use a specific binary. For matching with the version number one
would have to use the specified binary, if any, so e.g.

{
"settings": [
   {"timeouts": {"script": 30000}},
   {"match": {"browser": "firefox", "version": 49},
    "binary": "/home/user/firefox"}
]
}

running in Firefox would only use the /home/user/firefox binary if that
binary was a Firefox 49 binary. (If intermediary nodes could be required
to edit the new session payload, we could sidestep this complexity by
requiring that they reduce the settings to a list of length 1 with no
match clauses, representing only the things that are known to work. But
that does have the problem that one can't send the same message
irrespective of whether a routing intermediary is present.)
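
To make the cumulative matching of settings concrete, here is a minimal
Python sketch. The resolve_settings helper, the browser description
dict, and the placeholder profile string are all hypothetical. It
assumes a shallow merge in which later entries override earlier ones on
a key conflict, which is not specified above, and it does not model the
subtlety that the version has to be read from the specified binary.

```python
def resolve_settings(settings, browser):
    """Merge every settings entry whose "match" clause fits the browser.

    Hypothetical helper: an entry without "match" applies to everything,
    and matching is exact equality on each key in the clause.
    """
    merged = {}
    for entry in settings:
        match = entry.get("match", {})
        if all(browser.get(key) == value for key, value in match.items()):
            # Assumed conflict rule: later entries win (shallow merge).
            merged.update({k: v for k, v in entry.items() if k != "match"})
    return merged


settings = [
    {"timeouts": {"script": 30000}},
    {"match": {"browser": "firefox", "version": 49},
     "firefoxOptions": {"prefs": {"dom.disable-open-during-load": False}}},
    # Placeholder standing in for a real base64-encoded profile.
    {"match": {"browser": "firefox"}, "profile": "PLACEHOLDER_BASE64"},
    {"match": {"browser": "chrome"}, "binary": "/usr/local/chrome"},
]

firefox_49 = resolve_settings(settings, {"browser": "firefox", "version": 49})
firefox_48 = resolve_settings(settings, {"browser": "firefox", "version": 48})
chrome = resolve_settings(settings, {"browser": "chrome", "version": 53})
```

Here firefox_49 ends up with the timeouts, the pref and the profile,
firefox_48 with the timeouts and the profile only, and chrome with the
timeouts and the binary, while the profile appears in the message once.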
Received on Tuesday, 13 September 2016 21:15:08 UTC