Re: Units of measurement and scrolling in actions from Simon Stewart on 2017-01-12 (public-browser-tools-testing@w3.org from January to March 2017)

From: Simon Stewart <simon.m.stewart@gmail.com>
Date: Thu, 12 Jan 2017 15:26:46 +0300
To: David Burns <dburns@mozilla.com>
Cc: public-browser-tools-testing <public-browser-tools-testing@w3.org>
Message-ID: <CAOrAhYHWrm37y6DkcdmECYz47h2ZA2MpkdpShD1AhTC8sUVyuw@mail.gmail.com>
Again inline :)

On Thu, Jan 12, 2017 at 1:41 PM, David Burns <dburns@mozilla.com> wrote:

> tl;dr; The only size we can use is CSS pixels as that is what browsers
> know. More answers inline.
>

I agree, and I agree that we should change the Action commands to use these
(rather than client coordinates) as well.


> David
>
> On 12 January 2017 at 09:26, Simon Stewart <simon.m.stewart@gmail.com>
> wrote:
>
>> Hi,
>>
>> TL;DR: should units for size and distance be consistent throughout the
>> spec? And if a scroll is needed when using Actions, how should that be
>> specified, or is it implicit?
>>
>> Long version:
>>
>> While reviewing James's PR for pointer events
>> <https://github.com/w3c/webdriver/pull/495>, I noticed three things.
>>
>> 1/ We have no way of knowing the size of the viewport, or where an
>> element is within the viewport.
>>
>> 2/ We have no way of knowing the size of an element within the viewport
>> in anything other than CSS reference pixels.
>>
>> 3/ We have no text on how to handle the case where an element is outside
>> of the viewport.
>>
>> In order to help give some context to the discussion, consider three
>> separate use-cases:
>>
>> A/ A user expects "get element rect, calculate half the width, perform
>> pointer move to element, perform second pointer move by that half width" to
>> be the same as "get element rect, calculate half the width, perform a
>> single pointer move with element and xoffset of the half width", with
>> the pointer ending in the same place in both cases.
>>
>> B/ A series of interactions begins at element A and ends at element B,
>> whose final x/y location is determined algorithmically and isn't known in
>> advance. Until the interactions begin, element B is not within the
>> viewport, and the size of the viewport is unknown --- on local test
>> runs, the display is 2880 x 1800, but when running on a "webdriver as a
>> service" provider, the screen size is 1024 x 768.
>>
>> C/ A user wants to start the pointer move in one frame, and end in
>> another, performing a drag of (for example) an email into (for example) a
>> folder of a web-based email app.
>>
>> Breaking these down, "a" and "2" show that we have a problem with the
>> units used for specifying distances and sizes in webdriver. Most of the
>> time, it's CSS reference pixels, but in Actions, we flip to using locations
>> within viewports. We don't provide a mechanism to translate between the
>> two. It would feel that consistently using CSS reference pixels throughout
>> would be simpler for an end-user to understand, though more complex to
>> implement at the remote end (since you now need to convert from reference
>> pixels to a clientX/Y).
>>
>> However, I'm not sure whether "c" would complicate using CSS reference
>> pixels: what if a user had changed the zoom level in one frame but not the
>> other? Should we even allow drag motions between frames?
>>
>
> At TPAC Shenzhen we decided that this was not a use case (C) we were going
> to support. Notes at
> https://www.w3.org/2013/11/11-testing-minutes.html#item12. There are
> possible security sandboxing issues. There is also the issue of doing an
> implicit switch_to_frame to the new frame, doing the relevant lookup for
> the element, and what to do if it's stale. When the Action Chain is
> finished, which frame do you end on? Since there is an implicit frame
> switch, people could be expecting either case and this could lead to a
> footgun.
>

Heh. This was easier with the APIs from the Selenium project where you
could use one of a set of coordinate systems. I'm okay with making this use
case impossible for level 1.


>
>> It also seems clear that we need some mechanism to cause a scroll to
>> happen mid-way through a series of (pointer) actions. We could do this
>> implicitly (which would make "b" possible), by asking someone to specify a
>> scroll action (from the null input device?), with a delta and an optional
>> target element (which also makes "b" possible), or by returning some kind
>> of error stating that scrolling would be necessary to complete the action
>> (which may make "b" impossible).
>>
>>
> In Shenzhen, we said we didn't need such an API (however the new Actions
> API wasn't on the table at that point). We did, however, in SF
> https://www.w3.org/2014/02/25-testing-minutes.html talk about scrolling
> to elements for different commands and how it would be good to turn this on
> and off. Perhaps this needs to be either an Actions "task" or it needs to
> be a property in the actions blob sent over the wire. I don't mind either
> way.
>

Why not both? Seriously. You're right that we've discussed being able to
turn this behaviour on and off, and that feels natural, but if it's off we
need a mechanism to force a scroll to occur.
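
To make "force a scroll" a bit more concrete, here is a minimal sketch in
Python of what a local end might compute and send. The "scroll" action type,
its deltaX/deltaY fields and the helper names are all hypothetical, invented
here for illustration; nothing like them is in the spec as it stands. It
assumes the element rect and scroll offsets are CSS pixels relative to the
document, and that the viewport size is somehow known.

```python
from typing import Dict, Tuple


def axis_delta(start: float, size: float, scroll: float, viewport: float) -> float:
    """Smallest scroll along one axis that brings [start, start + size) into view."""
    end = start + size
    if start < scroll or size > viewport:
        # Off the leading edge, or too big to fit: align the leading edges.
        return start - scroll
    if end > scroll + viewport:
        # Off the trailing edge: align the trailing edges.
        return end - (scroll + viewport)
    return 0.0


def scroll_delta(rect: Dict[str, float], scroll_x: float, scroll_y: float,
                 viewport_w: float, viewport_h: float) -> Tuple[float, float]:
    """Scroll distance (CSS pixels) needed to reveal an element rect."""
    dx = axis_delta(rect["x"], rect["width"], scroll_x, viewport_w)
    dy = axis_delta(rect["y"], rect["height"], scroll_y, viewport_h)
    return dx, dy


def scroll_action(dx: float, dy: float) -> Dict[str, object]:
    # Hypothetical wire format for an explicit scroll; not part of any spec.
    return {"type": "scroll", "deltaX": dx, "deltaY": dy}


# Element B from use case "b", placed far down the page (made-up numbers).
element_b = {"x": 100, "y": 2000, "width": 50, "height": 40}
small = scroll_action(*scroll_delta(element_b, 0, 0, 1024, 768))
large = scroll_action(*scroll_delta(element_b, 0, 0, 2880, 1800))
```

The same test needs a different delta on the two viewport sizes, which is
why a fixed delta in the test script isn't enough and a target element (or
implicit scrolling) is attractive.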


> My ideal outcome as a user would be:
>>
>> * All distances and sizes are always given in CSS reference pixels.
>> * Scrolling happens thanks to a "scroll action" added to the events, or
>> when a user specifies a target element in another action.
>>
>> A painful but possibly workable solution would be:
>>
>> * Provide a mechanism to get the current viewport size.
>>
>
> Within the Actions commands? Why can't we just use #executeScript for this?
>

Because there's no single API to call.
"document.documentElement.clientHeight/Width" doesn't always provide
completely accurate results, and you may need to use
window.innerHeight/Width, and disappearing scroll bars make values
inconsistent. We also want the size of the viewport in CSS reference pixels
as we've agreed.
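
As a rough illustration of the inconsistency, the Python sketch below holds
a script a local end might pass to Execute Script, plus a helper that
compares the two sets of values it returns. VIEWPORT_SCRIPT, viewport_report
and the sample numbers are made up for illustration. The browser properties
themselves are standard DOM, and the sketch assumes a standards-mode
document, where documentElement.clientWidth/Height excludes scroll bars and
window.innerWidth/Height includes them.

```python
from typing import Dict

# JavaScript a local end could send via Execute Script (illustrative only).
VIEWPORT_SCRIPT = """
return {
  innerWidth: window.innerWidth,
  innerHeight: window.innerHeight,
  clientWidth: document.documentElement.clientWidth,
  clientHeight: document.documentElement.clientHeight
};
"""


def viewport_report(m: Dict[str, float]) -> Dict[str, float]:
    """Summarise the usable viewport and how much space scroll bars take."""
    return {
        "width": m["clientWidth"],
        "height": m["clientHeight"],
        # A vertical scroll bar eats width; a horizontal one eats height.
        "vertical_scrollbar": m["innerWidth"] - m["clientWidth"],
        "horizontal_scrollbar": m["innerHeight"] - m["clientHeight"],
    }


# Made-up measurements: a 15px vertical scroll bar is currently showing.
sample = {"innerWidth": 1024, "innerHeight": 768,
          "clientWidth": 1009, "clientHeight": 768}
report = viewport_report(sample)
```

With overlay or auto-hiding scroll bars the two widths can agree at one
moment and differ at the next, so a user has no single value to trust.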

>
>
>> * Provide a mechanism to get the size of the currently active frame in
>> the viewport.
>>
>
> Again, where would we put this command and why can't we use #executeScript?
>

Same reason.


> * Add additional properties to "get element rect" to return the client
>> x/y/width/height of the element, assuming that it was scrolled into the
>> current viewport.
>>
>
> It already returns that information. It doesn't return viewport positions,
> unless you are using #executeScript and using the JS
> element#getBoundingClientRect()
>

It gets the size of the element rect in CSS reference pixels, which are
different from the coordinate system used in the existing Action commands.
If, as you agree, we switch to consistently using CSS reference pixels
throughout the spec, this problem isn't a problem.
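
For illustration, the translation a user has to do by hand today is small
but easy to get wrong. A minimal Python sketch, assuming "get element rect"
reports document-relative CSS pixels and that the current scroll offsets
are known; to_viewport and centre are hypothetical helper names, not part
of any API.

```python
from typing import Dict, Tuple


def to_viewport(rect: Dict[str, float], scroll_x: float,
                scroll_y: float) -> Dict[str, float]:
    """Shift a document-relative rect into viewport-relative coordinates."""
    return {
        "x": rect["x"] - scroll_x,
        "y": rect["y"] - scroll_y,
        "width": rect["width"],
        "height": rect["height"],
    }


def centre(rect: Dict[str, float]) -> Tuple[float, float]:
    """Centre point of a rect, in whatever coordinate space the rect uses."""
    return (rect["x"] + rect["width"] / 2, rect["y"] + rect["height"] / 2)


# Made-up numbers: the page has been scrolled down by 1272 CSS pixels.
page_rect = {"x": 100, "y": 2000, "width": 50, "height": 40}
view_rect = to_viewport(page_rect, 0, 1272)
target = centre(view_rect)
```

If the Action commands took CSS reference pixels directly, this step (and
the need to track scroll offsets on the local end) would go away.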


> * Provide a scaling factor for converting between CSS reference pixels and
>> client position
>>
>
> Historically, we haven't supported people changing the scaling in their app
> and told them they need to fix it. See IEDriver as an example.
>

I know, and it's a constant source of bugs and gripes from users. That's
because the maths has been horrible. If we're consistently using CSS
reference pixels, I think we should be fine no matter what the zoom level.


>
>
>> * Make local ends do the maths for users
>>
>
> Fine by me.
>

I think it's error prone, and means we need to know all sizes and positions
of all elements before we do anything, but, like I said, it's "painful but
workable".


> * Make scrolling explicit.
>>
>
> Fine by me
>

As I said above, I think there's a strong case for having both implicit and
explicit scrolling be available.


>
>> The former seems simpler from a local end PoV, but I'm unsure how much
>> work it would take at the remote end.
>>
>> I've come round to the idea that scrolling should not be implicit, since
>> it makes use case "c" a PITA to implement.
>>
>> Thoughts?
>>
>> Simon
>>
>
Simon
Received on Thursday, 12 January 2017 12:27:20 UTC