FW: proposal: add input/keyboard locale to text and keyboard events

[[
To allow others to participate in this discussion, I'm forwarding this to the www-international list.  Further discussion should take place on this list. Please copy www-international and do not copy public-i18n-core on further mail on this thread.

If you want to read the previous discussion on the public-i18n-core list, you can do so in the archive, starting at http://lists.w3.org/Archives/Public/public-i18n-core/2010JulSep/0018.html 
]]



From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of Aharon (Vladimir) Lanin
Sent: 17 August 2010 01:04
To: www-dom@w3.org; public-i18n-core@w3.org; Doug Schepers
Subject: Re: proposal: add input/keyboard locale to text and keyboard events

After discussions and modifications at the i18n WG teleconferences and in public-i18n-core@, the i18n WG has approved the following updated proposal and is now turning it back over to www-dom@.

Proposal: An additional property in the TextEvent and KeyboardEvent DOM3 interfaces indicating the locale of the keyboard or other input device that generated the input. When this is unknown (e.g. when the input method is paste, or when the implementation cannot obtain this information from the underlying platform), the property should be null (or perhaps undefined, whichever better conforms to DOM3 conventions).

Here is a draft for the documentation of the new event attributes, striving to conform in style to http://dev.w3.org/2006/webapi/DOM-Level-3-Events/html/DOM3-Events.html#events-textevents

In TextEvent:

inputLocale of type DOMString, readonly
    A BCP-47 tag indicating the locale for which the origin of the event (whether keyboard, IME, handwriting recognition software, or other input mode) is configured, e.g. "en-US". May be null [undefined?] when inapplicable or unknown, e.g. for pasted text or when this information is not exposed by the underlying platform.

    Note: inputLocale does not necessarily indicate the locale of the data or the context in which it is being entered. For example, a French user often may not switch to an English keyboard when typing English, in which case the inputLocale will still indicate French, even though the data is actually English.

In KeyboardEvent:

inputLocale of type DOMString, readonly
    A BCP-47 tag indicating the locale for which the keyboard used to generate the event is configured, e.g. "en-US". May be null [undefined?] when unknown, e.g. when this information is not exposed by the underlying platform.

    Note: inputLocale does not necessarily indicate the locale of the text that the user may be keying in. For example, a French user often may not switch to an English keyboard when typing English, in which case the inputLocale will still indicate French. Nor can it be used to definitively calculate the "physical" or "virtual" key associated with the event, or the character printed on that key.
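As a rough illustration of how a script might consume the proposed attribute, here is a minimal TypeScript sketch. It assumes the attribute ends up being called inputLocale and carries a BCP-47 tag, null, or undefined, as drafted above. The InputLocaleCarrier interface and the primaryInputLanguage helper are hypothetical names made up for this example; nothing here is part of any shipped DOM interface.

```typescript
// Hypothetical shape of an event carrying the proposed attribute.
// Only the attribute name `inputLocale` comes from the draft above.
interface InputLocaleCarrier {
  readonly inputLocale?: string | null;
}

// Returns the lowercased primary language subtag of the input locale,
// or null when the locale is unknown (null, undefined, or empty).
function primaryInputLanguage(event: InputLocaleCarrier): string | null {
  const tag = event.inputLocale;
  if (tag === null || tag === undefined || tag === "") {
    return null;
  }
  // BCP-47 subtags are separated by "-"; the first one is the primary language.
  const [primary] = tag.split("-");
  return primary ? primary.toLowerCase() : null;
}
```

Treating null and undefined alike keeps the caller independent of whichever convention DOM3 settles on for the unknown case.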






On Sun, Jul 4, 2010 at 1:09 PM, Aharon (Vladimir) Lanin <aharon@google.com> wrote:
Pierre Cadieux, Hironori Bono and I would like to propose the following i18n addition to the TextEvent and KeyboardEvent DOM3 interfaces:

A new property indicating the locale of the keyboard or other input device that generated the input.

In both event types, this should be a string (e.g. "en-US"), and can be null when unknown (e.g. when the input method is paste). In the TextEvent interface, the new property could be called inputLocale. In the KeyboardEvent interface, it could be called keyboardLocale or perhaps just locale.

Use case 1: smart quotes in script-based online editor / word processor.

Different languages use different opening and closing quotation mark characters (e.g. U+201C (“) and U+201D (”) in English, U+00AB («) and U+00BB (») in French, and U+201E („) and U+201C (“) in German), but their standard keyboards rarely allocate keys for them. Thus, word processors typically implement a "smart quote" feature that replaces ASCII quote characters typed by the user with the opening and closing quotation marks appropriate for the document language. Since users rarely want to bother setting a document language, and can use different languages in a single document, word processor programs often use the keyboard language as the basis for deciding which quotation marks to use. This is currently not possible for online editors, since they do not know which keyboard is being used to generate input.
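To make use case 1 concrete, here is a minimal TypeScript sketch of a smart-quote lookup keyed on the input locale. It assumes the locale arrives as a BCP-47 string or null, as proposed. The table covers only the three languages mentioned above, and the rule for telling an opening quote from a closing one (start of text, or preceded by whitespace or an opening bracket) is a simplification chosen for illustration. The names QUOTES_BY_LANGUAGE and smartQuote are hypothetical.

```typescript
type QuotePair = { open: string; close: string };

// Quotation marks for the three languages named in the use case.
const QUOTES_BY_LANGUAGE = new Map<string, QuotePair>([
  ["en", { open: "\u201C", close: "\u201D" }],
  ["fr", { open: "\u00AB", close: "\u00BB" }],
  ["de", { open: "\u201E", close: "\u201C" }],
]);

// Returns the character that should replace a typed ASCII double quote.
// Falls back to the ASCII quote when the locale is unknown or not in the table.
function smartQuote(
  inputLocale: string | null | undefined,
  precedingText: string
): string {
  if (!inputLocale) {
    return '"';
  }
  const language = inputLocale.split("-")[0]?.toLowerCase() ?? "";
  const pair = QUOTES_BY_LANGUAGE.get(language);
  if (pair === undefined) {
    return '"';
  }
  // Simplified heuristic: a quote is "opening" at the start of the text
  // or right after whitespace or an opening bracket.
  const isOpening = precedingText === "" || /[\s(\[{]$/.test(precedingText);
  return isOpening ? pair.open : pair.close;
}
```

An editor would call something like smartQuote(event.inputLocale, textBeforeCaret) whenever the user types an ASCII quote, leaving the character untouched when no locale is available.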

Use case 2: text direction in script-based online editor / word processor.

Hebrew, Arabic, Farsi, and other languages are written right-to-left. However, it is common for right-to-left documents to contain some left-to-right words (e.g. acronyms like "HTML"), phrases (like the title of a foreign-language article), and paragraphs (like an extended quote from a foreign-language article). Although the Unicode Bidi Algorithm provides a default way to display a mixture of left-to-right and right-to-left text without explicit indication of a paragraph's direction, or where the text direction inside a paragraph changes or reverts, this is heuristic and often insufficient to display bidi text as intended.

Full-featured word processors (online and otherwise) therefore typically allow the user to indicate the paragraph direction using a UI control. However, the user experience of having to both click on the direction control and change the keyboard language when needing to switch between different-direction languages is problematic, since users very often forget to do one or the other.

It is therefore desirable for a word processor to provide a reasonable default for paragraph direction based on the direction of the first character entered by the user. To allow the same approach for paragraphs that begin with numbers, neutral characters, and whitespace, however, what one really wants is an indication of the language of the keyboard (or other input method) used to generate the first character. The same technique is even more useful for direction changes inside a paragraph, since word processors rarely provide an explicit means of indicating them, and they often need to begin with a number or end in punctuation (e.g. closing a parenthetical expression begun in the middle of the opposite-direction phrase) - examples of cases where the Unicode Bidi Algorithm does not do a good enough job.
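Use case 2 can be sketched along the same lines. The TypeScript below derives a default direction from the input locale, assuming it is a BCP-47 string or null as proposed. The list of right-to-left languages is a short, incomplete set picked for illustration, the check looks only at the primary language subtag (script subtags are ignored), and the names RTL_LANGUAGES, directionFromInputLocale and defaultParagraphDirection are hypothetical.

```typescript
type Direction = "ltr" | "rtl";

// Illustrative, incomplete set of primary language subtags written right-to-left.
const RTL_LANGUAGES = new Set<string>(["ar", "fa", "he", "iw", "ur", "yi"]);

// Returns the direction implied by the input locale, or null when the
// locale is unknown and no guess should be made.
function directionFromInputLocale(
  inputLocale: string | null | undefined
): Direction | null {
  if (!inputLocale) {
    return null;
  }
  const language = inputLocale.split("-")[0]?.toLowerCase() ?? "";
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

// Picks a default paragraph direction from the locale of the first input
// event, using the caller's fallback (e.g. the document direction) when
// the locale is unavailable.
function defaultParagraphDirection(
  inputLocale: string | null | undefined,
  fallback: Direction
): Direction {
  return directionFromInputLocale(inputLocale) ?? fallback;
}
```

Because the decision depends on the keyboard rather than on the character typed, a paragraph that begins with a digit or an opening parenthesis gets the same default as one that begins with a letter.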

Aharon Lanin


Received on Tuesday, 17 August 2010 16:49:20 UTC