Semantics of compositionstart/end

Hello,

I'm David Chan. I work for the Wikimedia Foundation, where I'm the primary
person responsible for IME support in VisualEditor, the contenteditable
editing surface used in Wikipedia. I've been reading the list archive to
get up to speed with the current situation.

I want to mention one issue we've had to design around: even outside a
compositionstart/compositionend event pair, it is *still* possible that the
IME is in the process of uncancellably modifying the DOM. (See below for
specific IME examples.)

That's a shame, because selection/content changes can interrupt the IME's
action and cause unwanted "corrupt" commits of candidate text. Before
performing a fix-up, an event listener first needs to know it is not
running "during a dead key sequence or while IME is active" (in the words
of the Input Event spec), but in fact being outside compositionstart/end
implies nothing, because:

- The IME may be active
- It may be uncancellably modifying the DOM
- There may be uncommitted text
- Changing the selection/the node text may interrupt the user's typing
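
To make the problem concrete, here is a minimal sketch of the kind of
conservative guard a listener might fall back on. It is only a heuristic and
not what VisualEditor or any spec does: the class name ImeActivityTracker,
its methods, and the 100 ms quiet window are all assumptions made up for
illustration. It treats composition events and keyCode 229 key events as
signs of recent IME activity and only reports "likely idle" after a quiet
period. It cannot prove the IME is idle, which is exactly the issue.

```typescript
// Hypothetical helper for illustration only; names and the default
// quiet window are assumptions, not part of any browser API or spec.
type ImeSignal =
  | "compositionstart"
  | "compositionend"
  | "keydown"
  | "keyup"
  | "input";

class ImeActivityTracker {
  private composing = false;
  private lastImeActivity = Number.NEGATIVE_INFINITY;
  private quietMs: number;

  constructor(quietMs: number = 100) {
    // Assumed tuning value: how long to wait after the last IME signal.
    this.quietMs = quietMs;
  }

  // Feed every relevant event in; timeMs is any monotonic timestamp.
  record(type: ImeSignal, timeMs: number, keyCode?: number): void {
    if (type === "compositionstart") {
      this.composing = true;
      this.lastImeActivity = timeMs;
    } else if (type === "compositionend") {
      this.composing = false;
      this.lastImeActivity = timeMs;
    } else if (keyCode === 229) {
      // Fake key events with keyCode 229 suggest the IME is still working.
      this.lastImeActivity = timeMs;
    }
  }

  // True only when we are outside a composition pair AND no IME signal
  // has been seen for at least quietMs. This is a guess, not a guarantee.
  isLikelyIdle(nowMs: number): boolean {
    return !this.composing && nowMs - this.lastImeActivity >= this.quietMs;
  }
}
```

A listener would call record() from its event handlers and postpone any
fix-up while isLikelyIdle() returns false. The weakness is plain: the quiet
window is a guess, so this trades corrupt commits for delayed fix-ups rather
than solving the underlying ambiguity.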

I'm unsure how this affects the Input Event spec, because I don't know what
the browser vendors have agreed to (sorry I'm so late coming into this
process), but I think it might have implications for beforeinput
cancellability.

Apologies if all this has been discussed already!

DETAILED IME EXAMPLES

1. When you commit text in ibus-table IMEs on Chromium, you see a
compositionend but the IME is still uncancellably modifying the DOM. The
exact event sequence (all in a single tick) is:

- send compositionend
- remove the candidate text
- send input
- send a fake keyup (keycode 229)
- send a fake keydown (keycode 229)
- insert the committed text
- send input
- send a fake keyup

This clear-then-commit behaviour seems to come from two separate events
emitted by the IME itself inside a single function call:
https://github.com/acevery/ibus-table/blob/2787988857cf551af53ca442c52da3e051825cb4/engine/table.py#L1482

Therefore I guess it would be infeasible for the browser to wrap both
changes in a single compositionstart/end pair - the browser cannot predict
the second change will come. But it could ensure that both changes are
wrapped in (different) compositionstart/end pairs.
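
The sequence in example 1 can be written down as data and checked
mechanically. The sketch below is illustrative only: the Step type and the
function domChangesOutsideComposition are hypothetical names, and it assumes
a composition was already open (a compositionstart was seen earlier) when
the commit begins. It walks the log, tracks the naive "am I inside a
compositionstart/end pair?" flag, and collects the DOM changes that happen
while that flag is false.

```typescript
// Hypothetical log format for illustration; not the VisualEditor test format.
type Step =
  | { kind: "event"; name: string }
  | { kind: "dom"; change: string };

// The ibus-table/Chromium commit sequence from example 1, in order.
const ibusTableCommit: Step[] = [
  { kind: "event", name: "compositionend" },
  { kind: "dom", change: "remove the candidate text" },
  { kind: "event", name: "input" },
  { kind: "event", name: "keyup" },   // fake, keycode 229
  { kind: "event", name: "keydown" }, // fake, keycode 229
  { kind: "dom", change: "insert the committed text" },
  { kind: "event", name: "input" },
  { kind: "event", name: "keyup" },   // fake
];

// Returns the DOM changes that occur outside any compositionstart/end pair.
// composingAtStart says whether a composition was open before the log begins
// (assumed true for the commit sequence above).
function domChangesOutsideComposition(
  log: Step[],
  composingAtStart: boolean
): string[] {
  let composing = composingAtStart;
  const outside: string[] = [];
  for (const step of log) {
    if (step.kind === "event") {
      if (step.name === "compositionstart") composing = true;
      else if (step.name === "compositionend") composing = false;
    } else if (!composing) {
      outside.push(step.change);
    }
  }
  return outside;
}

const outsideChanges = domChangesOutsideComposition(ibusTableCommit, true);
console.log(outsideChanges);
```

Under that assumption, both the removal of the candidate text and the
insertion of the committed text are reported as falling outside the pair,
which is the situation a listener relying only on compositionstart/end
cannot detect.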

2. Windows 8 Korean on IE11 gives you a compositionend when there is still
uncommitted candidate text. This is because several keystrokes form a
single JavaScript character (Korean syllable), and a character gets
committed once the IME knows you have started forming the *next* character.
The exact event sequence when you press the key that starts the next
character is:

- send a fake keydown (keycode 229)
- commit previous character; insert next character's uncommitted candidate
- send compositionend
- send compositionstart
- select the uncommitted candidate

There are many more examples like these in the VisualEditor IME unit tests,
which contain logged event sequences emitted by various IME/browser/OS
combinations:

https://github.com/wikimedia/VisualEditor/tree/master/tests/ce/imetests/
https://github.com/wikimedia/VisualEditor/blob/master/demos/ve/eventLogger.html

Whether you're inside a pair of compositionstart/end events reveals very
little about the IME state - it depends on the workings of that particular
IME/browser combination. It is fairly common that:

(1) One keystroke triggers several compositionstart/end pairs
(2) Uncommitted candidate text exists outside a compositionstart/end pair
(3) Uncancellable DOM changes occur outside a compositionstart/end pair

Many thanks,
-- 
David

Received on Thursday, 5 November 2015 09:31:23 UTC