Re: Testing WebDriver in Web Platform Tests

I wouldn't be totally against having send_command expose a status code, but
is there a reason not to just assert that an exception is raised if we
expect one? Keep in mind that when a test calls a command, that command is
not always the command under test, so convenience functions are still
useful.

For example, the navigate command is used by many different tests to load a
test page, and there an error status *is* an exceptional case. Are the "get
title" tests (and other tests) really expected to assert 200 every time they
navigate to a test page? Execute script is similar: it is used everywhere,
and checking the HTTP response code every single time would get tedious.
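
To illustrate, here is roughly how I picture the two cases (a sketch only:
the page URL, the exception class, and the exact signature of
`send_command` are assumptions, not necessarily wdclient's actual API):

    import pytest
    from webdriver import error  # hypothetical import path

    # Navigation is incidental here; a non-200 response is genuinely
    # exceptional, so it is fine for the client to raise on its own:
    def test_get_title(session):
        session.url = "/webdriver/support/blank.html"  # hypothetical page
        assert session.title == ""

    # Only when the error itself is under test should it be asserted,
    # whether via the exception or via an exposed status code:
    def test_navigate_invalid_url(session):
        with pytest.raises(error.InvalidArgumentException):
            session.send_command("POST", "url", {"url": False})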

The submodule-of-a-submodule arrangement also sounds tedious for those of us
who don't have everything checked out from a single tree. How much work is
involved in having wdclient sit closer to the tests themselves?

On Thu, Jan 26, 2017 at 3:31 PM Mike Pennisi <mike@bocoup.com> wrote:

>  > Only after testing all these things, would I consider using an
>  > abstraction API to run a set of data tests to check that the API is
>  > behaving as expected.
>
> Agreed!
>
>  > As I showed in my last email, it is possible to test all these things
>  > using the `http` fixture to send malformed data and the
>  > `session.send_command` to send raw data structures to arbitrary
>  > endpoints.
>  >
>  > It is true that we _are_ trading away some specificity for convenience
>  > when abstracting away details, but the bigger question here is whether
>  > this is an acceptable trade-off, as opposed to writing out every JSON
>  > literal and its expected response.
>
> Yes, there is definitely a trade-off here. It's subjective, and I'm willing
> to be flexible. That said, I do feel like the `http` fixture is a little too
> low-level for the purposes we're discussing here because it requires
> explicit handling of session information. At the same time,
> `HTTPWireProtocol#send` is too high-level because it reacts to non-200 HTTP
> response codes by raising an exception (even though this is not an
> exceptional case when that is the behavior under test).
>
> Ideally, I would like to use a version of `send_command` that includes the
> HTTP response status code in its return value. To demonstrate using this
> API, I've written tests for the "Get Title" command using such a function:
>
>
> https://github.com/w3c/web-platform-tests/compare/master...jugglinmike:webdriver-get-title
>
> This required some additional utilities which would be obviated by support
> from the library.
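>
> For concreteness, the shape I have in mind is roughly this (a sketch; the
> transport attribute and the JSON handling are guesses at wdclient's
> internals rather than its actual structure):
>
>     import json
>
>     def send_raw_command(self, method, uri, body=None):
>         # Like send_command, but return the HTTP status code alongside
>         # the decoded body rather than raising on non-200 responses.
>         response = self.transport.url_request(method, uri, body)  # hypothetical
>         return response.status, json.loads(response.read())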
>
>  > In the current project structuring, maintenance is a bit of a burden
>  > because fixing a test bug involves submitting a patch to one project
>  > and then updating a submodule. In this way, the "web platform tests"
>  > project is not the "single source of truth" for the protocol--the
>  > expected behavior is spread across multiple locations.
>
> We don’t have this problem at Mozilla since all the source code is
> checked out in a single tree, letting us make all the changes at once.
>
>  > However, I’ve felt this pain before and I agree the situation is not
>  > great.
>
> I spoke with James about this on IRC, and it turns out that he already had
> plans for a similar restructuring. He's currently prioritizing the effort to
> merge in the css-wg tests (since it also concerns refactoring shared
> infrastructure).
>
>  > But it is possible to create a config.py file (or some other name I’ve
>  > forgotten) that pytest will pick up automatically, so moving the
>  > remaining fixtures into the /webdriver folder makes sense.
>
> I stumbled on this just today--it's `conftest.py`.
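>
> For reference, that file might hold something like the following (the
> fixture body is a guess at how the pieces would fit together):
>
>     # webdriver/conftest.py: pytest discovers this file automatically
>     import pytest
>
>     from webdriver.transport import HTTPWireProtocol  # hypothetical path
>
>     @pytest.fixture
>     def http(configuration):
>         # "configuration" stands in for wherever host and port come from.
>         return HTTPWireProtocol(configuration["host"], configuration["port"])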
>
>  > > - Formally define a policy in `webdriver/README.md` which states that
>  > > the command under test must not be issued by the library code
>  > > (meaning library use is limited to streamlining extraneous details)
>  >
>  > I don’t understand what problem you think you are solving with policy.
>  >
>  > Each command needs to be adequately tested, which means all the formal
>  > aspects of a command’s input and output must be asserted. This
>  > involves type checks, status codes, bounds checks, data coherence, &c.
>  > If a test does not conform to these standards, it should be rejected in
>  > review.
>
> That final suggestion was worded a little too strongly. As written, it would
> seem to suggest that each test must express the critical command(s) by
> writing bytes to a TCP buffer! What I should have said is that we will not
> accept tests that use a high-level API in the code path under test. Based on
> your earlier statement ("Only after testing all of these things..."), I
> believe we are in agreement on this, but please let me know if I am
> mistaken.
>
> Anyway, would you be open to a patch for `wdclient` that introduces a method
> like the one in that demonstration?
>
> Mike
>
>
> On 01/26/2017 09:53 AM, Andreas Tolfsen wrote:
> > Mike Pennisi <mike@bocoup.com> writes:
> >
> >> My concern about the use of any library (not wdclient in particular)
> >> is that it increases the complexity of the code under test. This can
> >> make the tests harder to understand overall, as critical details are
> >> abstracted away by a convenient API.
> > In general I agree with this sentiment.  But there is also a point to
> > be made that a loosely-coupled local end-like library can make it less
> > challenging and tiresome to write tests that involve interaction with
> > other commands that are not under test.
> >
> > When I use the term “local end-like” I mean that wdclient is not your
> > average WebDriver client you will find in the wild.  There are many
> > conveniences it does not provide, with good reason.
> >
> >> I prefer to use a binding library when testing applications because
> >> in those contexts, I'm asserting the behavior of business logic, and
> >> details about the WebDriver protocol are irrelevant.
> >>
> >> Here though, the WebDriver protocol is itself the test subject. We
> >> should expect tests for (for example) "Add Cookie" to contain an
> >> object definition like:
> >>
> >>       { "cookie": { "name": "foo", "value": "bar" } }
> >>
> >> ...rather than either of the two phrasings currently allowed [1]:
> >>
> >>       cookies["foo"] = "bar"
> >>
> >>       class Wrapper:
> >>         def __init__(self, value):
> >>           self.value = value
> >>
> >>       cookies["foo"] = Wrapper("bar")
> > If I were testing cookies, I would assert that the endpoint behaved
> > correctly, e.g. that it was returning the correct HTTP status code and
> > that the response body was parsable as JSON.  I would test that the
> > body contained the expected fields and that the fields’ values had the
> > correct types.  Then I would check invalid data input and types to see
> > if the expected errors are returned, and finally I would continue with
> > bounds checks of the values.
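> >
> > Concretely, that first layer might look something like this (a sketch;
> > the request helper and the response attributes are assumptions):
> >
> >     import json
> >
> >     body = {"cookie": {"name": "foo", "value": "bar"}}
> >     response = http.post("/session/%s/cookie" % session.session_id, body)
> >     assert response.status == 200
> >     parsed = json.loads(response.read())
> >     assert parsed["value"] is None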
> >
> > Only after testing all these things, would I consider using an
> > abstraction API to run a set of data tests to check that the API is
> > behaving as expected.
> >
> >> Beyond obscuring the behavior under test, that kind of convenience is
> >> itself susceptible to bugs […] and requires additional maintenance.
> > There is always a chance a bug would be hidden in the abstraction
> > unless you have a good set of protocol-level tests that assert the
> > endpoint, input arguments, and sanity of the response.
> >
> > As I showed in my last email, it is possible to test all these things
> > using the `http` fixture to send malformed data and the
> > `session.send_command` to send raw data structures to arbitrary
> > endpoints.
> >
> > It is true that we _are_ trading away some specificity for convenience
> > when abstracting away details, but the bigger question here is whether
> > this is an acceptable trade-off, as opposed to writing out every JSON
> > literal and its expected response.
> >
> >> In the current project structuring, maintenance is a bit of a burden
> >> because fixing a test bug involves submitting a patch to one project
> >> and then updating a submodule. In this way, the "web platform tests"
> >> project is not the "single source of truth" for the protocol--the
> >> expected behavior is spread across multiple locations.
> > We don’t have this problem at Mozilla since all the source code is
> > checked out in a single tree, letting us make all the changes at once.
> >
> > However, I’ve felt this pain before and I agree the situation is not
> > great.
> >
> >> - Move the library into the "Web Platform Tests" project
> > I don’t think there is an easy way to do this right now.  wdclient is a
> > submodule of wpt-tools, which in turn is a submodule of
> > web-platform-tests.
> >
> > You might have to make some changes to wptrunner for it to know where
> > to pick up wdclient from.
> >
> >> - Move the pytest "fixture" definitions from the "wptrunner" project
> >> to the "Web Platform Tests" project
> > The `session` fixture reuses certain resources that are only available
> > to wptrunner, so this cannot be moved.
> >
> > But it is possible to create a config.py file (or some other name I’ve
> > forgotten) that pytest will pick up automatically, so moving the
> > remaining fixtures into the /webdriver folder makes sense.
> >
> >> - Formally define a policy in `webdriver/README.md` which states that
> >> the command under test must not be issued by the library code
> >> (meaning library use is limited to streamlining extraneous details)
> > I don’t understand what problem you think you are solving with policy.
> >
> > Each command needs to be adequately tested, which means all the formal
> > aspects of a command’s input and output must be asserted.  This
> > involves type checks, status codes, bounds checks, data coherence, &c.
> > If a test does not conform to these standards, it should be rejected in
> > review.
> >
>
>

Received on Thursday, 26 January 2017 21:09:41 UTC