Re: Actions questions

On 09/05/16 17:40, Andreas Tolfsen wrote:
> On Fri, 29 Apr 2016, at 16:56, James Graham wrote:
>> For an input device like a pointer, the pointer can clearly only do one
>> thing at a time. For a keyboard that is less true. Should it be possible
>> to have multiple concurrent action chains corresponding to the "same"
>> keyboard?
>
> Because each input device needs to be uniquely identified with an ID,
> for example to maintain modifier key state, we shouldn’t allow multiple
> action chains for the same device.
>
> However, in practical terms this doesn’t matter as the desired output
> is just DOM events. We need to disallow it for race condition reasons
> in the WebDriver abstraction we’re creating, so that the same keyboard
> references in two concurrent actions do not perform a keyUp("shift") and
> keyDown("shift") operation in the same tick.

I don't think that's true. I imagine that the concurrency isn't "real", 
i.e. if you get three actions to perform in the same tick, they won't 
generate events in a random order, but in a well-defined order, e.g. 
top-to-bottom. So if you happened to send

{actions:
 [
  {id: 1, type: key, actions: [{type: keyDown, value: a}]},
  {id: 1, type: key, actions: [{type: keyUp, value: a}]}
 ]
}

you would always get the keyDown event before the keyUp event.

It may be that we don't care about being able to chord in this way, and 
that one-after-another in separate ticks is good enough, in which case 
disallowing this would be OK.
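
For concreteness, the kind of chording I mean would be something like 
the following (purely illustrative, using the same informal payload 
shape as above rather than any agreed syntax):

{actions:
 [
  {id: 1, type: key, actions: [{type: keyDown, value: shift}]},
  {id: 1, type: key, actions: [{type: keyDown, value: a}]}
 ]
}

i.e. shift and 'a' both going down in the same tick on the same 
keyboard, which a one-sequence-per-device rule would forbid.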

>>> The client bindings set the focus on the element and then the rest is
>>> assumed that you are working on the active element. See
>>> https://github.com/SeleniumHQ/selenium/blob/master/py/selenium/webdriver/common/action_chains.py#L156
>>> as an example
>>
>> OK, so there is no way to supply an element with the action, or change
>> the element mid action chain without dispatching actions that will do so.
>
> What David says here is not strictly true. The Java Actions API allows
> you to move the mouse to a given web element.

So what's the desired functionality here?
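
If the answer is "be able to move the pointer to a given element", one 
possible shape (hypothetical; the element field and its value are just 
for illustration, not agreed syntax) would be to let a pointer move 
carry an element reference:

{id: 1, type: pointer,
 actions: [{type: pointerMove, element: <element reference>, x: 0, y: 0}]}

with x and y interpreted relative to that element, and viewport-relative 
coordinates used when no element is supplied.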

>>>      Fundamentally I am unclear what key event model people want to
>>>      standardise. I have seen lots of conversations around specific
>>>      keyboard layouts and IMEs and so on. At the same time many platforms
>>>      now don't present physical keyboards, and the kind of interaction
>>>      you get from something like Swype doesn't seem possible to model in
>>>      the current specification. I think interoperability is possible
>>>      through a model in which key actions generate (well-specified) DOM
>>>      events, and above-browser parts of the system (compose key, soft
>>>      keyboard, IME, etc.) are abstracted away. Is there a strong reason
>>>      that this simple model is not good enough?
>>>
>>>
>>> I think if we pick something from
>>> https://www.w3.org/TR/uievents-code/#keyboard-common-layouts we can then
>>> get what we need. Seeing as we have
>>
>> [...]
>>
>> It's not yet clear to me exactly what effect the choice of keyboard
>> layout has if you can send any codepoint as the 'key' to press. Maybe
>> someone can fill me in?
>
> This is a grey area. Selenium assumes a US keyboard layout, in the sense
> that it only converts [a-z] to [A-Z] when shift is pressed.
>
> Perhaps we need to be clearer about this?

OK, so I read some more here, and the issue seems to be that we need to 
populate various event attributes related to the physical location on 
the keyboard and the key that was pressed (e.g. 'a' if the character is 
'A'). So a simple approach would be to start by specifying a US 
keyboard layout for the purposes of that translation and, in some 
future revision of the spec, make it possible to specify the keyboard 
layout somehow (either as a global or as part of the actions API). 
However, this still leaves several open questions, like how to 
translate an action with value: ☃ into a key event.
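
To make that concrete: under an assumed US layout, typing 'A' (i.e. 'a' 
with shift held, per the Selenium behaviour Andreas describes) would 
presumably need to produce a key event along the lines of

keydown {key: "A", code: "KeyA", shiftKey: true}

where code comes from the physical 'a' key. (The attribute names are 
from UI Events; the exact values are my guess, not anything we've 
agreed.) For value: ☃ there is no position on any common layout, so 
it's not obvious what code, or any of the other physical-location 
attributes, should contain.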

>>>      == Pointer Actions ==
>>>
>>>      It seems like pointer actions are always specified relative to an
>>>      element? Is this correct, or should it also be possible to specify
>>>      relative to the viewport?
>>>
>>>      There is an open issue about dispatching touch events and other
>>>      kinds of events. How will this be handled?
>>>
>>> I would love for someone to have some thoughts on this. The issue I can
>>> see is that with some devices, like a Surface, you can have touch events
>>> when using the screen but you can also have a mouse; which ones should
>>> we send if we could detect both? Since Touch/Pointer is a minefield it
>>> would be great to get this nailed down.
>>
>> Should they simply be different action types? Possibly there would have
>> to be a way to signal that a particular browser / device didn't support
>> a particular class of actions.
>
> I’d like us to have a discussion about this at the F2F, along with the
> above question about keyboard layouts, or in general how to handle
> locality of a device.
>
> That said, I do remember we discussed the example of a mouse with more
> than two or three buttons. Then we decided that even if the attached
> mouse device only has three buttons, requesting button 7 or 8 to be
> pressed would not be a problem, as the driver shouldn’t have to know the
> locality of the physical devices available.
>
> After all, the whole point here is that they are being emulated. This
> also applies to the key input case you mentioned earlier, that you can
> in fact give it _any_ unicode codepoint.

I think the physical devices attached to the system should be 
irrelevant; we should be limited only by what can be expressed in DOM 
APIs. But some of those APIs make sense in the context of a specific 
physical device (e.g. the example with keys mentioned earlier).
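
If we did go with different action types, the payload could say which 
kind of pointer is being emulated, something like (again just a sketch, 
with the pointerType field invented for illustration):

{id: 1, type: pointer, pointerType: touch,
 actions: [{type: pointerDown, x: 10, y: 10}]}

and a remote end that can't emulate that pointer type would return an 
error rather than guessing, which also gives us the "browser / device 
doesn't support this class of actions" signal mentioned above.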

Received on Monday, 9 May 2016 18:08:30 UTC