Re: 2 Proposals for Minimum Viable InputEvent from Alexandre Elias on 2017-02-14 (public-editing-tf@w3.org from February 2017)

From: Alexandre Elias <aelias@google.com>
Date: Tue, 14 Feb 2017 06:17:04 +0000
To: Piotr Koszuliński <p.koszulinski@cksource.com>
Cc: Johannes Wilm <johannes@fiduswriter.org>, Chong Zhang <chongz@chromium.org>, Dave Tapuska <dtapuska@chromium.org>, "public-editing-tf@w3.org" <public-editing-tf@w3.org>
Message-ID: <CADeTeo54vdBYn1Wy_KW5SN7+2eLLTSx97ks8oY66pYrO498Zvw@mail.gmail.com>
OK, we replaced the original document we sent out with a much more detailed
"merged proposal":
https://docs.google.com/document/d/1yPZEkHl_WOPjVeilZjE1XmlyjLCe-uS7THI0SboyrMM/edit#.
It's essentially
the same thing that we proposed originally, but with a few tunings and a
lot of clarification of everything that might be ambiguous.  I hope it sums
up the discussion so far and will help answer everyone's remaining
questions.

> Basically it just seems to come down to 3 proposals
[...]
> If 2 or 3 are fine for everyone in each case, we can probably decide on
> something tomorrow. If 1 is still on the table for any of these, we will
> probably need more time to think this through and we'll probably also need
> more meetings about these proposals.

At this time, Chromium is only willing to go with 1, 1, 1.  None of what
you mentioned in 2s and 3s amounts to things that we're willing to ship.

Also, personally, although I'll attend tomorrow's 9AM meeting, I don't have
time for a long battery of in-person meetings on this topic.  That doesn't
seem productive: the relevant questions about our proposal have mostly been
brought up and answered in writing already, and the other browser vendors
and editor authors haven't demanded extensive in-person discussion.  I'd
prefer to continue to communicate mostly by writing on this subject, and
devote in-person time to more forward-looking topics like "invisible
textbox" support.

And keep in mind that what we're proposing is a clean subset of stuff that
was already in the spec -- shipping it doesn't rule out implementing the
original spec later, and code written to our subset would even likely work
on another browser if one were to implement the original spec.  Our
proposal, by design, strips out all controversial behaviors and has almost
no serious risks in it -- so I don't see why it might need to be blocked on
that much discussion.  Most of your objections have been of the form
"unlike the original spec, this doesn't solve X use case" or "this
admittedly solves Y use case, but makes it very hacky/inconvenient to do
so", but objections like that don't need to be blockers -- the goal of
shipping this is to make life incrementally better for JS authors, not to
achieve a utopia that makes it trivial to write a rich text editor.



On Mon, Feb 13, 2017 at 12:46 PM, Piotr Koszuliński <
p.koszulinski@cksource.com> wrote:

>
>> I believe this use case can only be solved by the long-term "full
>> model/view/controller" IME solution yosin@ and I ultimately prefer,
>> whereby IME talks to an invisible textbox "model" and JS maps it to any
>> DOM/canvas/WebGL.  You need 100% separation between plaintext distillation
>> and user-visible representation to do things like this, and none of the
>> proposals on the table address it.
>>
>
> Just a quick question – I was worried about that "full mvc" IME solution
> mentioned earlier. Hopefully, you don't mean to force normal RTE editors
> (which render and read from the same cE element) to use that invisible
> textbox? :) AFAIR Google Docs is using such a solution, but this sounds
> really bad. It may solve some issues, but triggers an entire spectrum of
> issues with which we don't have to deal currently like positioning of that
> textbox, rendering IME's controls and text styling.
>

We won't "force" anyone to do anything -- in particular, we won't break any
JS that's already widely deployed in the wild.  It's more a question of
where we would prefer to devote our effort in new specwork.  I agree there
is "an entire spectrum of issues" about invisible textbox, and we could be
having future meetings about solving those issues instead of being stuck on
cancelability.  I'm certain that the invisible textbox approach is feasible
and theoretically clean (if UAs were to add a few new features around focus
and selection in particular), whereas cancelability is a bottomless well of
uncertainty and chaos that doesn't seem as productive to spend time to
discuss.

I'm also willing for us to spend some effort supporting the "DOM diffing"
category of editors, but would again prefer to focus on simple and clean
improvements with minimal risk and a high benefit/cost ratio (such as
adding getTargetRanges() to the "input" event).
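
To illustrate what that could look like from an editor's side, here is a
minimal TypeScript sketch. It assumes a getTargetRanges()-style call on
"input" reports the just-inserted range as plain offsets. The OffsetRange and
ModelUpdate types and both function names are hypothetical, and the content is
modeled as a flat string rather than real DOM.

```typescript
// Hypothetical stand-in for the range an "input" event would report.
interface OffsetRange { start: number; end: number; }

// What the editor needs to learn from the event: which text arrived, and where.
interface ModelUpdate { insertedText: string; at: number; }

// Read the inserted text straight from the reported range instead of
// diffing the whole document against the previous state.
function readInsertion(afterText: string, inserted: OffsetRange): ModelUpdate {
  const valid =
    inserted.start >= 0 &&
    inserted.start <= inserted.end &&
    inserted.end <= afterText.length;
  if (!valid) {
    throw new RangeError("inserted range is outside the text");
  }
  return {
    insertedText: afterText.slice(inserted.start, inserted.end),
    at: inserted.start,
  };
}

// The editor applies its own deletion of the old selection to its model,
// then reinserts the text the browser reported.
function applyToModel(
  model: string,
  oldSelection: OffsetRange,
  update: ModelUpdate
): string {
  return (
    model.slice(0, oldSelection.start) +
    update.insertedText +
    model.slice(oldSelection.end)
  );
}
```

With a model of "foobar", an old selection covering offsets 2 to 4, and a
browser result of "foxar" with reported range 2 to 3, the editor recovers "x"
and its own model also ends up as "foxar".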


>
>>
>> > However, there's one nasty case which may still require hacking if the
>> > event isn't cancelled. It's how the editor can delete the content itself
>> > when the user is typing over a non-empty selection. We can't let the
>> > browser do that because we wouldn't know how to convert those changes to
>> > the model (and sometimes it might simply be impossible).
>>
>> Understood.  I'm not a fan of the solutions to this issue proposed so
>> far, but how about this: what if we extend "getTargetRanges()" to work on
>> the "input" event (not only "beforeinput") to indicate what range of text
>> has just been inserted?  That should provide the same information as the
>> "deletion/insert" split option, without adding one of those "phantom"
>> events I dislike.  Knowing precisely what range has just been inserted,
>> your JS could then remove it from DOM, apply your deletion reconciliation
>> algorithm, and then reinsert it.
>>
>>
>>
> I think that fixing the DOM on the input event is too late. Taken that we
> must not touch the DOM if composition takes place (because this breaks
> composition)
>

I actually don't share that "taken" assumption.  I consider it feasible for
Chromium to preserve ongoing composition across an arbitrary JS DOM change
provided that the "plaintext distillation" perceived by IME is not altered
by that DOM change.  I personally would be willing to advocate internally
that we commit Chromium to such a behavior (this is just my off-the-cuff
opinion here though, and I can't speak for the other browser vendors
either) and I would prefer that to what you're proposing.

The key thing I want is -- at least at one point in time -- a "known
correct" state of the textbox that matches the IME's expectations.   That's
because Chromium needs to notify the Android IME appropriately whenever the
textbox differs from its expectation.  I am very wary of proposals which
lack a strong guarantee of ever reaching the "known correct" state,
yet which also *intend* to reach that state, just without providing proof
to the UA.  IME cancelability and your current proposal have that problem
in common.

On the other hand, if we always allow the UA to execute the "input" event
without interference, then we have obtained a "known correct" version of
the textbox (as mutated by the latest IME command) -- and then, if JS
interferes with the DOM after that point in time, the UA merely needs to
verify that this doesn't meaningfully change the plaintext
distillation from the "known correct", which is relatively simple and easy
to reason about.
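
That comparison could be sketched roughly as below. This is an illustration
under assumptions: "plaintext distillation" is not defined precisely here, so
distill() simply concatenates text in document order (ignoring block
boundaries), and DomNode, distill and compositionSurvives are hypothetical
names over a toy tree rather than real DOM or Chromium code.

```typescript
// Toy DOM: a node is either a text string or an element with children.
type DomNode = string | { tag: string; children: DomNode[] };

// Assumed distillation: concatenate all text in document order.
function distill(node: DomNode): string {
  if (typeof node === "string") {
    return node;
  }
  return node.children.map(distill).join("");
}

// After the UA has executed "input" it holds a known-correct tree.
// A later JS mutation is harmless to the IME if the distilled text
// is unchanged, however much the markup around it was rewritten.
function compositionSurvives(knownCorrect: DomNode, afterJs: DomNode): boolean {
  return distill(knownCorrect) === distill(afterJs);
}
```

For example, rewrapping the "x" of <h1>fox</h1> as <h1>fo<b>x</b></h1> leaves
the distilled text at "foxar" (with <p>ar</p> following), so the check passes;
changing "fox" to "FOX" does not.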

> You said that you don't want (and/or can't) to split the beforeinput
> beginning the composition into "delete" and "insert" parts. So, how about
> the alternative option in which the JS editor will, based on the
> beforeinput event, prepare the DOM for "x" insertion.
>
> In this case, the above scenario would look like this:
>
> 1. Initial content: <h1>fo[o</h1><p>b]ar</p>.
> 2. User presses "x". Let's say this should trigger composition start.
> 3. Browser fires beforeinput.
> 4. The JS editor checks that the selection isn't empty and triggers its
> internal deleteContent( selection ) algorithm which changes its internal
> model
>     * from: <heading level=1>fo[o</heading><paragraph>b]ar</paragraph>
>     * to: <heading level=1>fo[]</heading><paragraph>ar</paragraph>
> 5. The change to the model is rendered to the virtual DOM and then
> straight to the real DOM resulting in: <h1>fo[]</h1><p>ar</p>
> 6. The beforeinput wasn't cancelled (because it couldn't be), so the
> browser continues its work. It calls its internal delete content mechanism
> (but the selection is empty so nothing happens) and then it inserts "x".
> This results in: <h1>fo[x]</h1><p>ar</p>. Composition is started.
> 7. The input and compositionstart events are fired.
> 8. On one of them, the JS editor inserts "x" to its internal model, which
> results in: <heading level=1>fo[x]</heading><paragraph>ar</paragraph>.
> This renders to the virtual DOM, but the real DOM doesn't have to be
> touched because it's identical.
>
> The above algorithm would be acceptable from our POV. We just need to be
> able to act on beforeinput and to somehow learn that we need to insert "x"
> to our model.
>
> Do you think that the browser engine could allow us to do that?
>

No, I don't think so.  The problem is that if you do that, the UA has no
reasonable way to prove that the combination of the JS mutation + the UA's
default behavior winds up with the same plaintext distillation result as
the UA's default behavior alone.  *You* know it will, but the UA doesn't --
the only way to make the comparison would be some crazy mechanism like
speculatively doing and undoing the IME action before every beforeinput
handler, just to generate a "known good" baseline which would remain unused
in the majority of cases.  Lacking such proof, the UA would be forced to
conservatively assume that the textbox's plaintext distillation may have
changed, and tell the IME to cancel the active composition.

On the other hand, if the order of actions is inverted, such that the UA's
default behavior happens first, then the UA *does* have the information
available to prove it.
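
A toy model of the two orderings, under loud assumptions: content is a flat
string with a selection, the UA default is "replace the selection with the
typed text", and EditState, defaultInsert, jsThenUa, uaThenJs and
mustCancelComposition are all hypothetical names, not Chromium internals.

```typescript
interface EditState { text: string; selStart: number; selEnd: number; }
type Handler = (s: EditState) => EditState;

// What the UA holds at the end: the final state, plus the known-correct
// baseline text if it ever computed one.
interface Outcome { final: EditState; baseline: string | null; }

// Assumed UA default: replace the selection with the typed text.
function defaultInsert(s: EditState, typed: string): EditState {
  const text = s.text.slice(0, s.selStart) + typed + s.text.slice(s.selEnd);
  const caret = s.selStart + typed.length;
  return { text, selStart: caret, selEnd: caret };
}

// JS mutates on beforeinput, then the default runs on the mutated state.
// The UA never saw the result of its default alone, so it has no baseline
// unless it speculatively runs and undoes the action first.
function jsThenUa(before: EditState, typed: string, onBeforeInput: Handler): Outcome {
  return { final: defaultInsert(onBeforeInput(before), typed), baseline: null };
}

// The default runs first, which yields the baseline for free; JS runs after.
function uaThenJs(before: EditState, typed: string, onInput: Handler): Outcome {
  const baseline = defaultInsert(before, typed);
  return { final: onInput(baseline), baseline: baseline.text };
}

// Without proof of equality the UA has to be conservative.
function mustCancelComposition(o: Outcome): boolean {
  return o.baseline === null || o.baseline !== o.final.text;
}
```

Starting from "foobar" with offsets 2 to 4 selected, both orderings can end at
"foxar", but only the second leaves the UA holding a baseline to compare
against, so only the second can keep the composition alive in this model.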
Received on Friday, 17 February 2017 02:29:28 UTC