Re: [whatwg] inputmode feedback

From: Ian Hickson <ian@hixie.ch>
Date: Tue, 28 Jan 2014 00:22:53 +0000 (UTC)
To: Takayoshi Kochi (河内 隆仁) <kochi@google.com>
Message-ID: <alpine.DEB.2.00.1401280004470.26647@ps20323.dreamhostps.com>
Cc: WHAT Working Group <whatwg@lists.whatwg.org>, Yoichi Osato <yoichio@google.com>, "Michael[tm] Smith" <mike@w3.org>, Ryosuke Niwa <rniwa@apple.com>
On Mon, 27 Jan 2014, Takayoshi Kochi (河内 隆仁) wrote:
> On Wed, Jan 22, 2014 at 9:54 AM, Ian Hickson <ian@hixie.ch> wrote:
> 
> A Japanese IME (e.g. Microsoft IME) has several modes for typing
> characters, divided by character set (kana, half-width-kana,
> alphanumeric, full-width-latin, etc.). Each has a different purpose
> (e.g. imported words and computer programs are written in the Latin
> alphabet, while other Japanese text is written in kana and kanji,
> converted from kana using the IME). Switching between these submodes is
> a critical part of (at least desktop) IMEs, needed to efficiently type
> text that mixes several of these character sets on an alphabet-only
> physical keyboard. These modes are actually submodes of a specific
> language's (Japanese) IME.

This sounds similar to how it works for Latin scripts, where for instance 
you might have one mode for entering numbers and one mode for entering 
text and one mode for entering URLs.


> Take a look at a Japanese blog post e.g. 
> http://googlejapan.blogspot.jp/2014/01/google-chrome.html You can see 
> alphabets, numbers and other Japanese characters (hiragana, katakana, 
> kanji) are all used in one entry.

This again is similar to how in Latin script prose you might find segments 
with URLs, others with tables of numbers, and others with text, all 
combined into one document.


> So to reach feature parity with native applications for Japanese web
> users, these modes are important, but we think inputmode is not the
> appropriate place for them.

Given how closely they match the Latin script features exposed by
inputmode="", I don't understand why inputmode="" would be the wrong place
for this. It sounds eminently applicable.


> For one thing, such Japanese IME submodes are dynamic by nature (the
> user can move from one to another with a key combination or via the IME
> menu). When the user moves focus from one field to another, the submode
> persists.

This is an implementation detail, right? I mean, if a user wanted to 
create a platform where the IME defaulted back to kana whenever a new 
control was focused, that wouldn't be wrong, it would just be a choice 
they would have to implement, no?


> E.g. You have a form:
> 
> Zip: []   <- alphanumeric
> Address: [] <- kanji (kana)
> Building: [] <- kanji (kana), and maybe numeric for building/room number
> etc.
> Name: [] <- kanji (kana)
> Tel: [] <- number
> 
> Without inputmode or anything like it, you start from the zip code: you
> turn off the IME (if it's on) and type in the zip code, then press tab
> to reach the next field. The IME is still off, so you turn it on to fill
> in the address, then type in the building and name. At the Tel field,
> you have to turn off the IME (or change the IME submode to
> "half-width-latin") manually, because the IME submode (at this point,
> most probably "kana") persists.

Right. This is the kind of thing we're trying to solve.

It's the same problem as in Latin script pages: you'd start from the ZIP
code, switch to "digit" mode, type in the ZIP code, tab to the Address
field, switch back to text mode, type that in, etc. At the Tel field,
you'd have to switch back to "digit" mode.

With inputmode="", we avoid this; the ZIP field starts in numeric mode, 
the Tel field starts in a dedicated Telephone mode, the Name field 
starts in the text mode but with automatic capitalisation since names in 
Latin text are usually capitalised, etc.
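
As a rough illustration of the paragraph above, the following sketch builds the markup for such a form with one hint per field. The field names and the helper functions are hypothetical, and the keyword values ("numeric", "latin", "latin-name", "tel") are assumed from the inputmode draft under discussion rather than quoted from this thread.

```typescript
// Hypothetical field list. The inputmode keywords are an assumption taken
// from the draft being discussed, not something this thread spells out.
type Field = { label: string; name: string; inputmode: string };

const fields: Field[] = [
  { label: "Zip", name: "zip", inputmode: "numeric" },
  { label: "Address", name: "address", inputmode: "latin" },
  { label: "Name", name: "name", inputmode: "latin-name" },
  { label: "Tel", name: "tel", inputmode: "tel" },
];

// Render one labelled text control carrying its input mode hint.
function renderField(f: Field): string {
  return `<label>${f.label}: <input name="${f.name}" inputmode="${f.inputmode}"></label>`;
}

// Wrap all controls in a form element, one control per line.
function renderForm(fs: Field[]): string {
  return ["<form>", ...fs.map(renderField), "</form>"].join("\n");
}
```

Calling renderForm(fields) yields plain markup with no script attached: each control only states what kind of input it expects, and the user agent decides which mode to start in.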


> If "inputmode" has these Japanese IME submodes (it does), users can avoid
> switching modes manually when hopping from one field to another.
> Japanese users are usually accustomed to changing these modes manually,
> so an automatic mode change may come as a surprise, but in use cases
> such as repetitively entering many address book entries, it would save
> a lot of mode switching.

Right, that's the idea.


> That said, though inputmode is useful for declaring the "default"
> submode a field should initially be in, once the user manually changes
> the mode from that initial submode, it is not apparent how the browser
> should behave when the user re-focuses that field. Browser implementers
> have several choices here:
> 
> 1. Change the IME mode to what is specified as "inputmode".
> 2. Remember the last mode when the focus was there and restore the mode.
> 3. Do nothing, if the mode of the field was manually changed to something else.
> 4. none of the above(?)

Same as with Latin script input modes, right.

This is an implementation choice. Personally I would recommend restoring 
whatever input mode was used when the user last had that field focused, 
resetting when the page is reloaded. But you can do things more or less 
clever here, or have it configurable, or whatever you want.
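
The recommendation above (restore the last mode used in a field, reset on reload) could be modelled roughly as follows. This is only a sketch of one possible user agent policy; the class, its method names, and the field ids are hypothetical and do not correspond to any real browser or IME API.

```typescript
type InputMode = string;

// Hypothetical model of the input mode state a user agent keeps for one
// page load. Creating a fresh instance stands in for reloading the page.
class PageInputModes {
  // Modes the user picked manually, keyed by a hypothetical field id.
  private lastUsed: Record<string, InputMode> = {};

  // `declared` holds the author's inputmode hint for each field id.
  constructor(private declared: Record<string, InputMode>) {}

  // Mode to activate when a field gains focus: the mode the user last
  // used there if any, otherwise the author's hint, otherwise undefined
  // (meaning: leave the current input mode alone).
  modeOnFocus(fieldId: string): InputMode | undefined {
    const remembered = this.lastUsed[fieldId];
    return remembered !== undefined ? remembered : this.declared[fieldId];
  }

  // Record a manual mode switch made while the field is focused.
  userSwitched(fieldId: string, mode: InputMode): void {
    this.lastUsed[fieldId] = mode;
  }
}
```

Option 1 from the quoted list would correspond to ignoring lastUsed entirely; making that a configurable flag would be an equally reasonable design.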

The idea, in HTML, is to provide hints to the user agent so that the user 
agent can use the information about the page ("this is a numeric field", 
"this is for a Latin name", "this is for kanji text written using kana") 
to provide the most helpful interface to the user.


> The current inputmode spec doesn't say anything about this detail, and 
> what is the best choice can be different case-by-case.

If there are hints that could be provided that would help user agents 
implement even better interfaces, that's certainly something we can add. 
In what cases would you want each refocus behaviour you list above?


> Also, inputmode is modifiable from JavaScript code, e.g.
> element.inputMode = "kana" while the user is typing something with the
> IME - which can be a disruptive operation.

Yes. In fact the whole control can change size, be removed, be replaced by 
a different control, the script can move the focus around, the page can be 
closed, a video of a flying cat can be displayed instead of the form, any 
number of disruptive things can be done.

The solution there is just for pages not to do these things. And by and 
large they don't, because these things cause users to leave their site.


> This is also another reason that such Japanese IME submodes have better 
> affinity to IME API than HTML inputmode.

I'm not sure I follow this.

I wouldn't want to tell authors to write script to manage an IME API any 
time they have a form on their page. That's far more effort than ideal.

I'm not saying there shouldn't be an IME API for the rare authors who want 
to do something even cleverer, but it seems like the common case should be 
significantly easier.

> > > How about using this bug as a starting point of the discussion 
> > > (although it's on w3c bugzilla)? 
> > > https://www.w3.org/Bugs/Public/show_bug.cgi?id=23961
> >
> > That's a bug on the W3C HTML5 spec, so it isn't one that I'm tracking. 
> > I encourage you to post on this list or to file a new bug if you 
> > prefer to discuss things on a bug system: http://whatwg.org/newbug (it 
> > also uses the W3C Bugzilla, but a different component)
>
> Hmm, then does changing the "Product" from "HTML WG" to "WHATWG" work? 
> (I'm not changing it myself anyway - would like to defer this to Ben 
> Bucksh.)

Changing the product is one option, but it's best if only the person who 
filed the bug does it. I would recommend just filing a new bug if you want 
to discuss it on a bug rather than here.


> > The inputmode="" attribute only has one aspect: what does the user 
> > want to enter. This can impact many things, including the script, the 
> > language, the kinds of keys that are visible, the kinds of typing aids 
> > enabled, the source(s) of autocomplete data, and so on. There are many 
> > different platforms that can use inputmode="": a mobile visual device 
> > might use an on-screen keyboard or on-screen handwriting recognition 
> > widget, a desktop computer might use a fully-fledged IME system, a 
> > speech-based system might do everything using speech recognition and 
> > use the input mode to decide what dictionaries to use for recognition 
> > and what scripts to use for transcribing the results.
> >
> > Exposing all these aspects to the author is a losing proposition: 
> > authors would frequently make mistakes, forget certain classes of 
> > users or devices, fail to test on all possible platforms, etc. The 
> > solution is to use the high-level semantic approach used elsewhere by 
> > HTML, and thus just provide a high-level description of the kind of 
> > input that is expected, letting the user agent translate this into the 
> > appropriate settings for the OS-level input system.
>
> Well, then you seem to support that such Japanese IME submodes are 
> low-level and should not belong to inputmode :)

I don't think what I said leads to that conclusion at all, no. Can you 
elaborate on why you think it does?

I think it leads to the opposite conclusion: that we should provide a
simple markup-level way of labeling the most helpful input modality for a
control, with the browser making the best choices of UI for the user based
on this.
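
To make the division of labour concrete, here is a minimal sketch of how a single high-level hint might be translated differently per platform. The platform names, the describeUi function, and the returned descriptions are all hypothetical; real user agents would talk to the OS-level input system instead of returning strings.

```typescript
// Hypothetical platform categories, loosely following the examples quoted
// above (on-screen keyboard, desktop IME, speech recognition).
type Platform = "on-screen-keyboard" | "desktop-ime" | "speech";

// Translate one author-supplied hint into a platform-specific choice.
// The author only ever writes the hint; this mapping belongs to the
// user agent.
function describeUi(inputmode: string, platform: Platform): string {
  switch (platform) {
    case "on-screen-keyboard":
      return inputmode === "numeric" || inputmode === "tel"
        ? "show digit keypad"
        : `show ${inputmode} key layout`;
    case "desktop-ime":
      return `switch IME submode to ${inputmode}`;
    case "speech":
      return `load ${inputmode} recognition dictionary`;
  }
}
```

The same "kana" hint thus ends up as an IME submode switch on a desktop and a dictionary choice on a speech system, without the page author having to know about either.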

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Tuesday, 28 January 2014 00:23:18 UTC
