Re: Composition, IME, etc. (was: contentEditable=minimal) from Ryosuke Niwa on 2014-06-06 (public-webapps@w3.org from April to June 2014)

From: Ryosuke Niwa <rniwa@apple.com>
Date: Fri, 06 Jun 2014 10:37:12 -0700
To: Robin Berjon <robin@w3.org>
Cc: "public-webapps@w3.org" <public-webapps@w3.org>
Message-id: <37F479BD-730A-4BAC-A4E4-0A3EE34A7C3D@apple.com>
On Jun 6, 2014, at 10:13 AM, Ryosuke Niwa <rniwa@apple.com> wrote:

> 
> On Jun 6, 2014, at 7:24 AM, Robin Berjon <robin@w3.org> wrote:
> 
>> On 05/06/2014 09:09 , Ryosuke Niwa wrote:
>>> On May 23, 2014, at 1:37 PM, Robin Berjon <robin@w3.org> wrote:
>>>> Semantically, autocorrect and compositing really are the same
>>>> thing.
>>> 
>>> They are not.  Word substitutions and input method compositions are
>>> semantically different operations.
>> 
>> Ok, I'll accept that depending on the level of abstraction at which you're looking at the problem they may or may not be the same thing.
>> 
>> The core of the problem is this: there is a wide array of situations in which some form of "indirect text input" (deliberately going for a new term with no baggage) takes place. This includes (but is not limited to):
>> 
>> • dead key composition (Alt-N, N -> ñ)
>> • assumed international composition (',e -> é, if you just want an apostrophe you have to compose ',space)
>> • inline composition for pretty much everything
>> • popup composition
>> • autocorrect
>> • speed-typing input (T9, swiping inputs)
>> 
>> In order to handle them you have two basic options:
>> 
>> a) Let the browser handle them for you (possibly calling up some platform functionality). This works as closely to user expectations as a Web app can hope to get but how do you render it? If it touches your DOM then you lose the indirection you need for sensible editing; if it doesn't I don't know how you show it.
>> 
>> b) Provide the app with enough information to do the right thing. This gives you the indirection, but "doing the right thing" can be pretty hard.
>> 
>> I am still leaning towards (b) being the approach to follow, but I'll admit that that's mostly because I can't see how to make (a) actually work. If (b) is the way, then we need to make sure that it's not so hard that everyone gets it wrong as soon as the input is anything other than basic English.
> 
> I'm not convinced b is the right approach.
> 
>>>> Note that if there is a degree of refinement such that we may want
>>>> to make it possible for authors to style compositing-for-characters
>>>> and compositing-for-autocorrect, then that ought to go into the
>>>> styling system.
>>> 
>>> In older versions of Windows, for example, the browser itself can't
>>> figure out what kind of style is used by IME.  Korean and Japanese
>>> IME on Windows, for example, use bolded lines and dotted lines for
>>> opposite purposes.  And we get bug reports saying that WebKit's
>>> rendering for Korean IME is incorrect because we decided to follow
>>> Japanese IME's convention.
>> 
>> Right. In this case we need to distinguish between the browser not knowing and the Web app not knowing.
>> 
>> If the browser doesn't know because the platform can't tell the difference between Korean and Japanese (a problem with which Unicode doesn't help) then there really isn't much that we can do to help the Web app.
> 
> This is predicated on using approach (b).  I'm not convinced that that's the right thing to do here.
> 
>> However if the browser knows, it can provide the app with information. I don't have enough expertise to know how much information it needs to convey — if it's mostly style that can be done (it might be unwieldy to handle but we can look at it).
> 
> The problem here is that we don't know whether underlining is the only difference input methods will ever need.  We could imagine that future UI paradigms will require other styling, such as bolding the text or enlarging it for easier readability while typing.
> 
>>>> We /could/ consider adding a field to compositing events that would
>>>> capture some form of ontology of input systems. But I think that's
>>>> sort of far-fetched and we can get by with the above. (And yes, I'm
>>>> using "ontology" on purpose. It wouldn't look good :)
>>> 
>>> In my opinion, it's a requirement that input methods work and look
>>> native on editors that use this new API.  IME is not a nice-to-have
>>> feature.  It's a feature required for billions of people to type any
>>> text.
>> 
>> That is *exactly* my point. At this point I believe that if we just added something like a compositionType = deadkey | kr | jp | t9 | autocorrect | ... field and left it at that we wouldn't be helping anyone. The script would need to know not just how to render all of these but how they are supposed to look on each platform. That's why I am arguing for primitives that enable the script to do the right thing *without* having to know everything about all the possible IMEs.
> 
> Right.  We need a primitive to support all without having to explicitly support each.
> 
>> Having said that, I was initially hoping that a mixture of composition events plus IME API would cover a lot of ground already. Thinking about it some more, it's not enough.
>> 
>> Can you help me come up with a list of aspects that need to be captured in order to enable the app to render the right UI? Or do you have another proposal?
> 
> The biggest difference between European alphabet substitution (e.g. e -> é) and CJK input methods (e.g. jintian -> 今天) is that the former typically works on a single character at a time (i.e. composes a single letter) while the latter typically works on a few dozen or hundreds of characters at a time.  Furthermore, CJK input methods require more context than just what's being typed for heuristics and dictionary lookup; this is typically provided by the editor itself (usually implicitly) in native applications.  Without that context, input methods wouldn't work.  On top of these semantic differences, CJK input methods typically need extra space for candidate windows [1] of various sizes on almost all desktop platforms.  An editor, therefore, needs to avoid showing any "essential" UI beneath the candidate window.  Alternatively, it needs to inform the UA where any essential UI resides and let the UA avoid showing the candidate window and other native UI on top of it.
> 
> Autocorrection is a completely different beast.  Not only can autocorrection happen during typing, it can also happen at any other time.  In fact, the user can tap on any arbitrary text in the editor to bring up an autocorrection UI on Mac & iOS.  A similar feature is available on Android as far as I recall.  To allow this to work, the UA needs to be aware of "editable" text and then let the editor script know that the user is about to do an autocorrection.

Thinking about it more carefully, the requirements for CJK input methods and autocorrection are quite similar: many CJK input methods allow "reconverting" already composed text anywhere in the editor, and autocorrection also requires the contextual text.
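
As a rough illustration only, the "indirection" discussed under option (b) and the need for contextual text could be sketched as below. Every name here (EditorState, CompositionInput, applyComposition, contextAround, render) is hypothetical and does not come from this thread or from any specification; the phases merely mirror the existing compositionstart / compositionupdate / compositionend DOM events. The idea shown is that in-progress composition text is held apart from the committed text until the input method finishes, and that the editor can hand the surrounding committed text to the UA on request.

```typescript
// Hypothetical editor-side model; none of these names come from a spec.
type CompositionPhase = "start" | "update" | "end";

interface CompositionInput {
  phase: CompositionPhase;
  data: string; // text currently proposed by the input method
}

interface EditorState {
  committed: string;      // text the editor treats as final
  caret: number;          // insertion offset into `committed` (UTF-16 units)
  pending: string | null; // in-progress composition text, null when idle
}

// Returns a new state; the committed text only changes on "end".
function applyComposition(state: EditorState, input: CompositionInput): EditorState {
  switch (input.phase) {
    case "start":
      return { ...state, pending: input.data };
    case "update":
      // Ignore stray updates that arrive without a preceding start.
      return state.pending === null ? state : { ...state, pending: input.data };
    case "end": {
      if (state.pending === null) return state;
      const committed =
        state.committed.slice(0, state.caret) +
        input.data +
        state.committed.slice(state.caret);
      return { committed, caret: state.caret + input.data.length, pending: null };
    }
  }
}

// Contextual text around the caret, as an input method or autocorrection
// engine might request it. `radius` is counted in UTF-16 code units.
function contextAround(state: EditorState, radius: number): { before: string; after: string } {
  return {
    before: state.committed.slice(Math.max(0, state.caret - radius), state.caret),
    after: state.committed.slice(state.caret, state.caret + radius),
  };
}

// Plain-text rendering that marks the pending composition with brackets;
// a real editor would style it however the platform expects.
function render(state: EditorState): string {
  if (state.pending === null) return state.committed;
  return (
    state.committed.slice(0, state.caret) +
    "[" + state.pending + "]" +
    state.committed.slice(state.caret)
  );
}
```

This sketch deliberately leaves out what the thread identifies as the hard parts: platform-specific styling of the pending text, candidate window placement, and reconversion of already committed text.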

- R. Niwa
Received on Friday, 6 June 2014 17:38:04 UTC