Re: FW: proposal: add input/keyboard locale to text and keyboard events

Before I answer specific objections, I would like to preface with a few
notes:

1. In some contexts (e.g. writing code), smart quotes are indeed an unwanted
nuisance. However, they are wanted by most users in most contexts, and word
processors use them. The idea is to make it easier for the word processor to
get them right as often as possible. Both smart quotes and
direction-guessing algorithms are by their nature heuristic. Sometimes they
do something beneficial, sometimes they have no effect, and sometimes they
do something unwanted. As long as the first category is a lot bigger than
the last, and the user can easily undo an unwanted result, or better yet
turn off the whole mechanism generally when it is unwanted, or tune it, they
are beneficial overall. They do not have to win in every case to be useful.

2. Microsoft Word, the most widespread word processor, uses keyboard
language in both of the mechanisms described. (Well, for text direction, it
uses keyboard language for inline direction control. It defaults the
paragraph direction to that of the preceding paragraph, without using the
keyboard language or the direction of the first characters in the
paragraph.) I do not know the exact algorithm that Word uses, but it most
definitely includes checking the keyboard language. Try it.

3. Most importantly, *the proposal here is not for smart quotes or text
direction guessing*, whether using the precise logic given or some other
mechanism. These were just possible use cases - perhaps not even the best
ones. The proposal is for making available some very well-defined
information, which applications can use for whatever purpose they want. It
is a fact that the information can be very useful. Currently, it is simply
unavailable to the web-based application.

> Based on personal experience, I highly doubt that many users switch
> keyboards between languages using the same script.
> The chance to get the smart quotes wrong in multilingual documents is
> therefore very serious.

True. However:

- There are plenty of bilingual users whose languages use different scripts
(e.g. Latin and Cyrillic/Arabic/Hebrew/Greek/...). For them, the algorithm
as given is a big win.

- For the users who are monolingual, or 1.5-lingual (write only in one
language), the given algorithm is no worse - and probably significantly
better - than looking at the system language or user interface language. (If
a monolingual user has the wrong keyboard in effect, he or she has big
problems simply typing text - smart quotes are not the top concern.) So, for
them - and they are probably the vast majority - the algorithm is a win too.

- The bilingual users whose languages do use the same script (e.g. French
and English) will indeed have a problem with the language they use less
frequently. They may have to go ahead and tell the word processor what
language their document is in - I don't have a problem with that. The word
processor can also implement much more advanced heuristics, e.g. analyze the
text already entered and try to guess the document language that way - but
knowing the keyboard language would still be useful even in this case when
the document is still empty or nearly empty. Do you have a better proposal
for them?
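
To make the fallback order in the three cases above concrete, here is a
minimal sketch. Everything in it is an assumption for illustration:
pickQuotes, QUOTES_BY_LANGUAGE and the parameter names are hypothetical, and
keyboardLocale stands for the value of the proposed property (null when
unknown). An explicitly set document language wins, then the keyboard
locale, then the UI language.

```typescript
// Hypothetical sketch; none of these names exist in any DOM specification.
type QuotePair = { open: string; close: string };

// English-style quotes are used when nothing better is known.
const DEFAULT_QUOTES: QuotePair = { open: "\u201C", close: "\u201D" };

// Only the three languages mentioned in the proposal; a real table is larger.
const QUOTES_BY_LANGUAGE: Record<string, QuotePair> = {
  en: DEFAULT_QUOTES,
  fr: { open: "\u00AB", close: "\u00BB" },
  de: { open: "\u201E", close: "\u201C" },
};

// Primary language subtag of a locale string such as "en-US".
function primaryLanguage(locale: string): string {
  return locale.toLowerCase().split("-")[0] ?? "";
}

// Try each source of language information in order of reliability.
function pickQuotes(
  documentLanguage: string | null, // set explicitly by the user, if at all
  keyboardLocale: string | null,   // the proposed property; null if unknown
  uiLanguage: string               // what a web application can see today
): QuotePair {
  const candidates = [documentLanguage, keyboardLocale, uiLanguage];
  for (const candidate of candidates) {
    if (candidate === null) continue;
    const pair = QUOTES_BY_LANGUAGE[primaryLanguage(candidate)];
    if (pair !== undefined) return pair;
  }
  return DEFAULT_QUOTES;
}
```

With this sketch, pickQuotes(null, "fr-FR", "en-US") returns the French
pair, while a user who has set the document language to "de" gets the German
pair regardless of the keyboard in effect.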

> [And I really don't want to have to switch to a "programming language"
> keyboard (never yet heard about such a thing) just in order to avoid "smart
> quotes".]

As indicated above, it is highly recommended for the word processor to offer
an option to turn off smart quotes.

> What about using the first strong letter as an indicator of paragraph
> direction?

That is also a possibility, but when the paragraph has to start with a
number or with punctuation (e.g. quotation mark, parenthesis), it means that
the paragraph would flip after those have been typed - not a good user
experience if it can be avoided.
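
A rough sketch of the difference, under stated assumptions: the function
names are hypothetical, the character ranges are a crude stand-in for the
real Unicode bidi classes, the list of right-to-left languages is
incomplete, and firstKeyLocale stands for the proposed property on the event
that produced the paragraph's first character (null when unknown, e.g. on
paste).

```typescript
// Hypothetical sketch; not a complete implementation of bidi class lookup.
type Direction = "ltr" | "rtl";

// Assumption: illustrative, incomplete list of right-to-left languages.
const RTL_LANGUAGES: string[] = ["ar", "fa", "he", "ur", "yi"];

function directionOfLocale(locale: string): Direction {
  const primary = locale.toLowerCase().split("-")[0] ?? "";
  return RTL_LANGUAGES.indexOf(primary) >= 0 ? "rtl" : "ltr";
}

// Crude strong-character test: basic Hebrew and Arabic letters count as RTL;
// basic Latin, Greek and Cyrillic letters count as LTR; all else is neutral.
function strongDirection(ch: string): Direction | null {
  if (/[\u05D0-\u05EA\u0620-\u064A\u0671-\u06D3]/.test(ch)) return "rtl";
  if (/[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u0391-\u03A9\u03B1-\u03C9\u0410-\u044F]/.test(ch)) return "ltr";
  return null;
}

// Direction of the first strong character, or null if there is none yet.
function firstStrongDirection(text: string): Direction | null {
  for (let i = 0; i < text.length; i++) {
    const direction = strongDirection(text.charAt(i));
    if (direction !== null) return direction;
  }
  return null;
}

// Prefer the keyboard locale of the first keystroke, so that the direction is
// known before any strong character is typed; otherwise use first-strong.
function guessParagraphDirection(
  text: string,
  firstKeyLocale: string | null
): Direction | null {
  if (firstKeyLocale !== null) return directionOfLocale(firstKeyLocale);
  return firstStrongDirection(text);
}
```

For a paragraph that so far contains only "(1) ", first-strong alone gives
no answer, while guessParagraphDirection("(1) ", "he-IL") already gives
"rtl", so nothing has to flip later.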

Aharon

On Thu, Jul 15, 2010 at 11:08 AM, "Martin J. Dürst"
<duerst@it.aoyama.ac.jp> wrote:

> I'm rather skeptical about this proposal. See below.
>
>
> On 2010/07/15 1:38, Richard Ishida wrote:
>
>> For your consideration. Forwarded with permission.
>>
>> RI
>>
>
>> From: Aharon (Vladimir) Lanin [mailto:aharon@google.com]
>> Sent: 04 July 2010 11:10
>> To: www-dom@w3.org; Doug Schepers
>> Subject: proposal: add input/keyboard locale to text and keyboard events
>>
>>
>>
>> Pierre Cadieux, Hirinori Bono and I would like to propose the following
>> i18n addition to the TextEvent and KeyboardEvent DOM3 interfaces:
>>
>> A new property indicating the locale of the keyboard or other input device
>> using which the input was generated.
>>
>> In both event types, this should be a string (e.g. "en-US"), and can be
>> null when unknown (e.g. when the input method is paste). In the TextEvent
>> interface, the new property could be called inputLocale. In the
>> KeyboardEvent interface, it could be called keyboardLocale or perhaps just
>> locale.
>>
>> Use case 1: smart quotes in script-based online editor / word processor.
>>
>> Different languages use different opening and closing quotation mark
>> characters (e.g. U+201C (“) and U+201D (”) in English, U+00AB («) and U+00BB
>> (») in French, and U+201E („) and U+201C (“) in German), but their standard
>> keyboards rarely allocate keys for them. Thus, word processors typically
>> implement a "smart quote" feature that replaces ASCII quote characters typed
>> by the user with the opening and closing quotation marks appropriate for the
>> document language. Since users rarely want to bother setting a document
>> language, and can use different languages in a single document, word
>> processor programs often the keyboard language as the basis for deciding
>> which quotation marks to use. This is currently not possible for online
>> editors, since they do not know which keyboard is being used to generate
>> input.
>>
>
> Based on personal experience, I highly doubt that many users switch
> keyboards between languages using the same script. Keyboards for different
> languages not only have some base letters switched (e.g. Y and Z for German,
> QW and AZ for French,...), much worse, they have all kinds of symbols in
> different locations, which is difficult to remember and even much more
> difficult to use efficiently at typing speed. The chance to get the smart
> quotes wrong in multilingual documents is therefore very serious.
>
> [And I really don't want to have to switch to a "programming language"
> keyboard (never yet heard about such a thing) just in order to avoid "smart
> quotes".]
>
>
>
>> Use case 2: text direction in script-based online editor / word processor.
>>
>> Hebrew, Arabic, Farsi, and other languages are written right-to-left.
>> However, it is common for right-to-left documents to contain some
>> left-to-right words (e.g. acronyms like "HTML"), phrases (like the title of
>> a foreign-language article), and paragraphs (like an extended quote from a
>> foreign-language article). Although the Unicode Bidi Algorithm provides a
>> default way to display a mixture of left-to-right and right-to-left text
>> without explicit indication of a paragraph's direction, or where the text
>> direction inside a paragraph changes or reverts, this is heuristic and often
>> insufficient to display bidi text as intended. Full-featured word processors
>> (online and otherwise) therefore typically allow the user to indicate the
>> paragraph direction using a UI control. However, the user experience of
>> having to both click on the direction control and change the keyboard
>> language when needing to switch between different-direction languages is
>> problematic, since users very often forget to do one or the other. It is
>> therefore desirable for a word processor to provide a reasonable default
>> for paragraph direction based on the direction of the first character
>> entered by the user. To allow the same approach for paragraphs that begin
>> with numbers, neutral characters, and whitespace, however, what one really
>> wants is an indication of the language of the keyboard (or other input
>> method) used to generate the first character. The same technique is even
>> more useful for direction changes inside a paragraph, since word
>> processors rarely provide an explicit means of indicating them, and they
>> often need to begin with a number or end in punctuation (e.g. closing a
>> parenthetical expression begun in the middle of the opposite-direction
>> phrase) - examples of cases where the Unicode Bidi Algorithm does not do a
>> good enough job.
>
> What about using the first strong letter as an indicator of paragraph
> direction? Or are there really cases where the user, on purpose, is typing a
> number with a Hebrew or Arabic keyboard, and then switches to a Latin
> keyboard, and expects the overall direction to be RTL?
> (I very much understand that there are cases where the user expects the
> overall direction to be RTL even if the first strong letter is LTR, but I
> would claim that in these cases, the keyboard selection for the number
> before the strong letter is probably mainly random, and therefore not useful
> in the above scenario.)
>
> Given that the above scenarios are rather marginal and may lead to mistakes
> and misunderstandings as often as (or more often than) a correct result,
> with the arguments given above I don't think this proposal makes much sense.
>
> Regards,    Martin.
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
>
>

Received on Thursday, 15 July 2010 13:23:17 UTC