Re: FW: proposal: add input/keyboard locale to text and keyboard events from Martin J. Dürst on 2010-07-15 (public-i18n-core@w3.org from July to September 2010)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Thu, 15 Jul 2010 17:08:55 +0900
To: Richard Ishida <ishida@w3.org>
CC: public-i18n-core@w3.org
Message-ID: <4C3EC217.9050107@it.aoyama.ac.jp>
I'm rather skeptical about this proposal. See below.

On 2010/07/15 1:38, Richard Ishida wrote:
> For your consideration. Forwarded with permission.
>
> RI

> From: Aharon (Vladimir) Lanin [mailto:aharon@google.com]
> Sent: 04 July 2010 11:10
> To: www-dom@w3.org; Doug Schepers
> Subject: proposal: add input/keyboard locale to text and keyboard events
>
>
>
> Pierre Cadieux, Hirinori Bono and I would like to propose the following i18n addition to the TextEvent and KeyboardEvent DOM3 interfaces:
>
> A new property indicating the locale of the keyboard or other input device using which the input was generated.
>
> In both event types, this should be a string (e.g. "en-US"), and can be null when unknown (e.g. when the input method is paste). In the TextEvent interface, the new property could be called inputLocale. In the KeyboardEvent interface, it could be called keyboardLocale or perhaps just locale.
>
> Use case 1: smart quotes in script-based online editor / word processor.
>
> Different languages use different opening and closing quotation mark characters (e.g. U+201C (“) and U+201D (”) in English, U+00AB («) and U+00BB (») in French, and U+201E („) and U+201C (“) in German), but their standard keyboards rarely allocate keys for them. Thus, word processors typically implement a "smart quote" feature that replaces ASCII quote characters typed by the user with the opening and closing quotation marks appropriate for the document language. Since users rarely want to bother setting a document language, and can use different languages in a single document, word processor programs often use the keyboard language as the basis for deciding which quotation marks to use. This is currently not possible for online editors, since they do not know which keyboard is being used to generate input.

Based on personal experience, I highly doubt that many users switch 
keyboards between languages that use the same script. Keyboards for 
different languages not only have some base letters swapped (e.g. Y and 
Z for German, QW and AZ for French, ...); much worse, they have all kinds 
of symbols in different locations, which is difficult to remember and 
even more difficult to use efficiently at typing speed. Users who stay 
on one keyboard will be typing a language that does not match the 
keyboard locale, so the risk of getting the smart quotes wrong in 
multilingual documents is very real.
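
To make the concern concrete, here is a minimal sketch of a smart-quote
lookup keyed on the proposed locale string. The names QUOTES and
smart_quote, and the rule for choosing between opening and closing marks,
are assumptions made for illustration; only the quotation marks themselves
come from the proposal quoted above.

```python
# Quotation marks as listed in the quoted proposal, keyed by primary
# language subtag. The table and function names are hypothetical.
QUOTES = {
    "en": ("\u201c", "\u201d"),  # English: U+201C, U+201D
    "fr": ("\u00ab", "\u00bb"),  # French:  U+00AB, U+00BB
    "de": ("\u201e", "\u201c"),  # German:  U+201E, U+201C
}


def smart_quote(preceding_text, locale):
    """Pick a replacement for an ASCII '"' typed after preceding_text.

    locale is a string such as "en-US", or None when unknown.
    """
    if locale is None:
        return '"'
    language = locale.split("-")[0].lower()
    if language not in QUOTES:
        return '"'
    opening, closing = QUOTES[language]
    # Naive assumption: a quote at the start of the text, after whitespace,
    # or after an opening bracket opens a quotation; anything else closes one.
    if not preceding_text or preceding_text[-1].isspace() or preceding_text[-1] in "([{":
        return opening
    return closing
```

With such a lookup, smart_quote("Er sagte ", "en-US") yields the English 
opening mark U+201C, although German text calls for U+201E. This is the 
kind of wrong result to expect when the text language and the keyboard 
locale differ.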

[And I really don't want to have to switch to a "programming language" 
keyboard (never yet heard about such a thing) just in order to avoid 
"smart quotes".]


> Use case 2: text direction in script-based online editor / word processor.
>
> Hebrew, Arabic, Farsi, and other languages are written right-to-left. However, it is common for right-to-left documents to contain some left-to-right words (e.g. acronyms like "HTML"), phrases (like the title of a foreign-language article), and paragraphs (like an extended quote from a foreign-language article). Although the Unicode Bidi Algorithm provides a default way to display a mixture of left-to-right and right-to-left text without explicit indication of a paragraph's direction, or where the text direction inside a paragraph changes or reverts, this is heuristic and often insufficient to display bidi text as intended. Full-featured word processors (online and otherwise) therefore typically allow the user to indicate the paragraph direction using a UI control. However, the user experience of having to both click on the direction control and change the keyboard language when needing to switch between different-direction languages is problematic, since users very often forget to do one or the other. It is therefore desirable for a word processor to provide a reasonable default for paragraph direction based on the direction of the first character entered by the user. To allow the same approach for paragraphs that begin with numbers, neutral characters, and whitespace, however, what one really wants is an indication of the language of the keyboard (or other input method) used to generate the first character. The same technique is even more useful for direction changes inside a paragraph, since word processors rarely provide an explicit means of indicating them, and they often need to begin with a number or end in punctuation (e.g. closing a parenthetical expression begun in the middle of the opposite-direction phrase) - examples of cases where the Unicode Bidi Algorithm does not do a good enough job.

What about using the first strong letter as an indicator of paragraph 
direction? Or are there really cases where the user, on purpose, is 
typing a number with a Hebrew or Arabic keyboard, and then switches to a 
Latin keyboard, and expects the overall direction to be RTL?
(I very much understand that there are cases where the user expects the 
overall direction to be RTL even if the first strong letter is LTR, but 
I would claim that in these cases, the keyboard selection for the number 
before the strong letter is probably mainly random, and therefore not 
useful in the above scenario.)
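
The first-strong-letter idea needs no keyboard information at all. A 
minimal sketch, assuming Python's standard unicodedata module for the 
bidi classes (the function name first_strong_direction is made up for 
illustration):

```python
import unicodedata


def first_strong_direction(text):
    """Return "ltr" or "rtl" for the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            # Strong left-to-right character (e.g. Latin letters).
            return "ltr"
        if bidi_class in ("R", "AL"):
            # Strong right-to-left character (Hebrew, Arabic letters).
            return "rtl"
    # Only digits, whitespace, or neutral characters so far.
    return None
```

Leading digits, whitespace, and punctuation are skipped, so a paragraph 
such as "123 HTML" comes out as "ltr" no matter which keyboard produced 
the digits. A result of None means no strong letter has been typed yet, 
and an editor would have to fall back on some other default.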

Given that the above scenarios are rather marginal and may lead to 
mistakes and misunderstandings as often as (or more often than) to a 
correct result, I don't think this proposal makes much sense.

Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Thursday, 15 July 2010 08:09:50 UTC