Re: Key Identifiers

below the asterisks.

Since my November e-mail, Safari 5 has been released with, in
particular, the ability to identify dead keys.  Previously, dead keys
produced events with no key/character information, so one could guess
that a dead key had been pressed, but it was impossible to tell which
one.

For keyUp/keyDown, the current situation for character-producing keys  
in Safari is as follows:

1. keyIdentifier

1a) For non-dead keys, keyIdentifier identifies the corresponding  
Unicode character.  This works as expected irrespective of keyboard  
layout.

1b) For dead keys, keyIdentifier is undefined.

In summary, keyIdentifier actually identifies a character rather than  
a key, but does not work for dead keys.
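
As a rough illustration of 1a/1b, here is a minimal sketch of how a
script might recover the character from keyIdentifier.  It assumes the
"U+XXXX" string form used in the DOM Level 3 Events drafts; the helper
name charFromKeyIdentifier is made up for this example.

```typescript
// Assumption: for character keys, keyIdentifier looks like "U+0041"
// (a hexadecimal code point). The function name is hypothetical.
function charFromKeyIdentifier(keyIdentifier: string | undefined): string | null {
  // Case 1b: dead keys leave keyIdentifier undefined, so nothing can be recovered.
  if (keyIdentifier === undefined) return null;
  const match = /^U\+([0-9A-Fa-f]{4,6})$/.exec(keyIdentifier);
  // Named keys (e.g. "Enter") carry no code point.
  if (match === null) return null;
  const cp = parseInt(match[1], 16);
  if (cp > 0x10FFFF) return null;
  if (cp <= 0xFFFF) return String.fromCharCode(cp);
  // Encode code points beyond the BMP as a surrogate pair.
  const off = cp - 0x10000;
  return String.fromCharCode(0xD800 + (off >> 10), 0xDC00 + (off & 0x3FF));
}
```

The null result for an undefined keyIdentifier is exactly the gap
described in 1b.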


2. keyCode

2a) For non-dead keys, keyCode is confusing and does not really work  
for non-US keyboards.  If the character exists on a US keyboard, the  
keyCode is set to the key one would have to press on a US keyboard to  
get the same character; if the character does not exist on a US  
keyboard, the keyCode is taken from the key that occupies the same  
physical position on a US keyboard.  (The explanation is simplified,  
but the following example is authentic.)  For instance, the keys "/3  
and '/4 on a French keyboard will both give keyCode 222 since both "  
and ' are obtained by pressing the key '/" on a US keyboard, and that  
key has keyCode 222; moreover, the key ù/% on a French keyboard will  
also give keyCode 222 since there is no key producing a ù on a US  
keyboard, and the US '/" key has the same physical position as the  
French ù/% key.  Thus, three distinct keys on a French keyboard all  
share the same keyCode.

2b) For dead keys, the system described in 2a actually works well,  
since there are no dead keys on a US keyboard, so the keyCode will  
always be set in terms of physical position.

In summary, keyCode actually identifies keys, but this only works  
reliably for dead keys on non-US keyboards.
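
The simplified rule from 2a and 2b can be sketched as follows.  This
is only a model of the behaviour described above, not Safari's actual
implementation; the position names ("quote", "digit3", ...) and the
function modelKeyCode are invented for the example, and only the value
222 comes from the text.

```typescript
// keyCode of each US key, indexed by a hypothetical position name (tiny excerpt).
const usKeyCodeByPosition: { [position: string]: number } = {
  quote: 222, // the US '/" key
};

// Which US key produces a given character (tiny excerpt).
const usPositionByChar: { [ch: string]: string } = {
  "'": "quote",
  '"': "quote",
};

// char is null for dead keys, which never match a character on a US keyboard.
function modelKeyCode(char: string | null, position: string): number | undefined {
  let usPosition = position; // default: same physical position on a US keyboard
  if (char !== null && Object.prototype.hasOwnProperty.call(usPositionByChar, char)) {
    usPosition = usPositionByChar[char]; // the US key that yields this character
  }
  return usKeyCodeByPosition[usPosition];
}

// The three French keys from the example in 2a:
const frenchQuoteDouble = modelKeyCode('"', "digit3");   // "/3 key
const frenchQuoteSingle = modelKeyCode("'", "digit4");   // '/4 key
const frenchUGrave = modelKeyCode("\u00F9", "quote");    // ù/% key
```

All three calls end up at the US '/" key, which is why the three
distinct French keys share keyCode 222.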


***


The May thread entitled "what should event.key be if a key inserts  
multiple characters?" seems to have concluded that the concepts  
'key' (.key) and 'character' (.char) must be kept separate.  The above  
illustrates how confusing a mix of the two can be, not to mention that  
neither key nor character can be reliably identified for all keys on  
the keyboard.

It should also be pointed out that it is not always possible to derive  
the key pressed from the character produced, even when the keyboard  
layout is known; in particular, the Vietnamese keyboard layout has two  
` keys.
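
To make the ambiguity concrete, here is a small sketch that inverts a
layout table.  The key names and the three-key excerpt are purely
illustrative and are not real Vietnamese layout data; the point is
only that two keys producing the same character make the inverse
mapping one-to-many.

```typescript
// Hypothetical excerpt of a layout: key name -> character produced.
const layoutExcerpt: { [key: string]: string } = {
  keyA: "`",
  keyB: "`",
  keyC: "a",
};

// Invert the layout: character -> all keys that produce it.
function keysByChar(layout: { [key: string]: string }): { [ch: string]: string[] } {
  const result: { [ch: string]: string[] } = {};
  for (const key of Object.keys(layout)) {
    const ch = layout[key];
    if (result[ch] === undefined) result[ch] = [];
    result[ch].push(key);
  }
  return result;
}

const inverted = keysByChar(layoutExcerpt);
// inverted["`"] lists two keys, so the character alone cannot identify the key.
```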

What remains to be done is to define what .key and .char should be for  
different types of keys.  Some comments on this issue below:

1. Keys that do not produce characters
Setting .key to a descriptive name and leaving .char undefined/empty
seems natural and should be uncontroversial.

2. Non-dead keys that produce characters
.char can be set to the corresponding Unicode character or string of
Unicode characters.  (When an input method editor is being used, these  
characters may differ from the resulting composed character.)
.key must always be the same for a given physical key irrespective of
keyboard layout, and probably irrespective of modifiers as well.  The  
exact set of values does not really matter, but one should keep in  
mind that there are three main types of physically slightly different  
keyboards with different numbers of keys (JIS, ANSI and ISO).

3. Dead keys
.key as for non-dead keys. (This necessarily follows from the above
since various keyboard layouts will have different numbers of dead  
keys at different positions, and .key is invariant with respect to  
keyboard layout.)
For .char, the current draft suggests that Unicode combining  
characters be used.  As previously mentioned, this is potentially  
confusing since a dead key combines with the following key whereas a  
combining character combines with the previous character.  More  
importantly, Unicode does not provide combining characters for all  
dead keys; for instance, a Vietnamese OS X keyboard layout has a  
number of dead letter keys including A and Ư which combine with non- 
dead accent keys to form accented letters like Ạ and Ử.  Perhaps it  
would be better to use non-combining Unicode characters for .char and  
add an attribute indicating whether the key is dead or not, a bit like  
a modifier flag.
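
A minimal sketch of what this suggestion might look like, assuming a
boolean attribute called dead (the name, the KeyInfo shape, the key
values and the composition table are all hypothetical and appear in
no draft):

```typescript
// Hypothetical event shape; `dead` is the extra attribute suggested above.
interface KeyInfo {
  key: string;   // layout-independent key name (illustrative values)
  char: string;  // non-combining Unicode character(s)
  dead: boolean; // true if the key combines with the FOLLOWING key
}

// Illustrative composition table: dead char + following char -> result.
const compositions: { [pair: string]: string } = {
  "^e": "\u00EA", // ê
};

// Resolve two consecutive key presses into the text they would insert.
function resolveKeys(first: KeyInfo, second: KeyInfo): string {
  const plain = first.char + second.char;
  if (!first.dead) return plain;
  const composed = compositions[plain];
  // Fall back to both characters when no composition is known.
  return composed !== undefined ? composed : plain;
}

const circumflex: KeyInfo = { key: "KeyX", char: "^", dead: true };
const letterE: KeyInfo = { key: "KeyY", char: "e", dead: false };
const composedText = resolveKeys(circumflex, letterE);
```

Because .char holds an ordinary character and the flag says which key
is dead, the same shape would also cover dead letter keys for which
Unicode has no combining character.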

Øistein E. Andersen

Received on Thursday, 15 July 2010 00:09:01 UTC