Re: Composition, IME, etc. from Robin Berjon on 2014-06-23 (public-webapps@w3.org from April to June 2014)

From: Robin Berjon <robin@w3.org>
Date: Mon, 23 Jun 2014 17:45:35 +0200
To: Ryosuke Niwa <rniwa@apple.com>
CC: "public-webapps@w3.org" <public-webapps@w3.org>, public-editing-tf@w3.org
Message-ID: <53A84B9F.5040808@w3.org>

On 06/06/2014 19:13 , Ryosuke Niwa wrote:
> On Jun 6, 2014, at 7:24 AM, Robin Berjon <robin@w3.org> wrote:
>> In order to handle them you have two basic options:
>>
>> a) Let the browser handle them for you (possibly calling up some
>> platform functionality). This works as closely to user expectations
>> as a Web app can hope to get but how do you render it? If it
>> touches your DOM then you lose the indirection you need for
>> sensible editing; if it doesn't I don't know how you show it.
>>
>> b) Provide the app with enough information to do the right thing.
>> This gives you the indirection, but "doing the right thing" can be
>> pretty hard.
>>
>> I am still leaning towards (b) being the approach to follow, but
>> I'll admit that that's mostly because I can't see how to make (a)
>> actually work. If (b) is the way, then we need to make sure that
>> it's not so hard that everyone gets it wrong as soon as the input
>> is anything other than basic English.
>
> I'm not convinced b is the right approach.

As I said though, it's better than (a), which is largely unusable.

That said, I have a proposal that improves on (b) and that I believe
addresses your concerns (essentially by merging both approaches into a
single one).

>> If the browser doesn't know because the platform can't tell the
>> difference between Korean and Japanese (a problem with which
>> Unicode doesn't help) then there really isn't much that we can do
>> to help the Web app.
>
> This predicates on using approach b.  I'm not convinced that that's
> the right thing to do here.

No, it doesn't. If the browser has no clue whatsoever how to present
composition then it can't offer the right UI itself any more than it can
help the application do things well. I am merely ruling out that
situation, which you mentioned, as unsolvable (by us).

>> However if the browser knows, it can provide the app with
>> information. I don't have enough expertise to know how much
>> information it needs to convey — if it's mostly style that can be
>> done (it might be unwieldy to handle but we can look at it).
>
> The problem here is that we don't know if underlining is the only
> difference input methods ever need.  We could imagine future new UI
> paradigms would require other styling such as bolding text, enlarging
> the text for easier readability while typing, etc...

I never said that the browser would only provide underlining 
information. I said it can convey *style*. If it knows that the specific 
composition being carried out requires bolding, then it could provide 
the matching CSS declaration. If there is an alien composition method 
that requires red blinking with a green top border, it could convey that.

Having said that, having the browser convey style information to the 
script with the expectation that the script would create the correct 
Range for the composition in progress and apply that style to it, even 
though possible, seems like a lot of hoops to jump through that are 
essentially guaranteed to be exactly the same in every single instance.
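
To make those hoops concrete, here is a minimal sketch of the script-side
work under approach (b). Everything in it is an assumption made for
illustration: no browser exposes a CompositionStyleHint, and the names
applyHints and StyledRun are hypothetical. It works on a plain string
rather than real DOM Ranges so that it stays self-contained, and it
assumes the hints arrive sorted and non-overlapping.

```typescript
// Hypothetical: style information the browser might hand to the script
// for the composition in progress. No real CompositionEvent carries this.
interface CompositionStyleHint {
  start: number; // offset into the composing text (inclusive)
  end: number;   // offset into the composing text (exclusive)
  css: string;   // e.g. "text-decoration: underline"
}

// A piece of the composing text plus the style the script should apply.
interface StyledRun {
  text: string;
  css: string | null; // null means "no composition styling"
}

// Split the composing text into runs according to the hints.
// Assumes hints are sorted by start and do not overlap.
function applyHints(text: string, hints: CompositionStyleHint[]): StyledRun[] {
  const runs: StyledRun[] = [];
  let pos = 0;
  for (const hint of hints) {
    if (hint.start > pos) {
      runs.push({ text: text.slice(pos, hint.start), css: null });
    }
    runs.push({ text: text.slice(hint.start, hint.end), css: hint.css });
    pos = hint.end;
  }
  if (pos < text.length) {
    runs.push({ text: text.slice(pos), css: null });
  }
  return runs;
}
```

Every editor would have to write some variant of this, and then map the
runs back onto its own DOM, which is exactly the repetition I would like
to avoid.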

I think we can do better. It's a complicated-sounding solution but the 
problem is itself complex, and I *think* that it is doable and the best 
of all options I can think of.

To restate the problem:

   • We don't want the browser editing the DOM directly because that
     just creates madness.
   • We want to enable any manner of text composition, from a broad
     array of options, while showing the best UI for the user.

These two requirements are at odds because rich, powerful composition 
that is great for the user *has* to rely on the browser, but the logical 
way for the browser to expose that is to use the DOM.

The idea to reconcile both is to use a "shadow text insertion point". 
Basically, it is a small DOM tree injected as a shadow at the insertion 
point (with author styles applied to it). The browser can do *anything* 
it wants in there in order to create a correct editing UI. While 
composition is ongoing, the script still receives composition events but 
can safely just ignore them for the vast majority of cases (since you 
can't generally usefully validate composition in progress anyway). When 
the composition terminates, the input event contains the *text* content 
of the shadow DOM, which is reclaimed.
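
A toy model may help show the intended lifecycle. This is only a sketch
under stated assumptions: ShadowInsertionPoint and PlainDocument are
hypothetical names, not part of any spec or browser API, and a string
stands in for both the shadow tree and the document. The point it tries
to illustrate is that the document is untouched while composition is in
progress and receives only plain text when it ends.

```typescript
// Hypothetical stand-in for the author's document: plain text plus a caret.
class PlainDocument {
  text: string;
  caret: number;

  constructor(text: string, caret: number) {
    this.text = text;
    this.caret = caret;
  }

  // The only mutation the script performs, driven by the final input event.
  insertAtCaret(inserted: string): void {
    this.text =
      this.text.slice(0, this.caret) + inserted + this.text.slice(this.caret);
    this.caret += inserted.length;
  }
}

// Hypothetical browser-owned scratch area living at the insertion point.
class ShadowInsertionPoint {
  private composing: string | null = null; // null: no composition active
  private onInput: (committed: string) => void;

  constructor(onInput: (committed: string) => void) {
    this.onInput = onInput;
  }

  start(): void {
    this.composing = "";
  }

  // The browser may rewrite this freely; the script can ignore updates.
  update(text: string): void {
    if (this.composing === null) throw new Error("no composition in progress");
    this.composing = text;
  }

  // What the user sees at the insertion point while composing.
  rendered(): string {
    return this.composing ?? "";
  }

  // Composition ends: the shadow content is reclaimed and only its
  // text content is delivered to the script.
  end(): void {
    if (this.composing === null) throw new Error("no composition in progress");
    const committed = this.composing;
    this.composing = null;
    this.onInput(committed);
  }
}

// Usage: the document only changes once, when composition terminates.
function demo(): string {
  const doc = new PlainDocument("Hello ", 6);
  const sip = new ShadowInsertionPoint((t) => doc.insertAtCaret(t));
  sip.start();
  sip.update("k");
  sip.update("ka");
  sip.update("か");
  sip.end();
  return doc.text;
}
```

The model deliberately says nothing about styling or candidate windows:
those would live entirely inside the shadow content, which is the part
the browser controls.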

I guess that the shadow text insertion point would participate in the 
tree in the same way that a pseudo-element does. (Yes, I realise this 
basically means "magic".)

I believe this works well for the insertion of new text; I need to mull 
it over further to think about editing existing content (notably the 
case that happens in autocorrect, predictive, and I believe Kotoeri 
where you place a cursor mid-word and it will take into account what's 
before it but not after). But I think it's worth giving it some thought; 
particularly because I don't see how we can solve this problem properly 
otherwise.

This has the advantage that it is also a lot simpler to handle for authors.

-- 
Robin Berjon - http://berjon.com/ - @robinberjon
Received on Monday, 23 June 2014 15:45:45 UTC