- From: Ian Jacobs <ij@w3.org>
- Date: Sat, 11 Nov 2000 22:37:53 -0500
- To: w3c-wai-ua@w3.org, w3c-i18n-ig@w3.org, nakane@w3.org
Hello,

Martin and I just reviewed (while he visits my home) his internationalization (I18N) comments on the last call draft of the User Agent Accessibility Guidelines (UAAG) 1.0:

  http://www.w3.org/TR/2000/WD-UAAG10-20001023/

These are the raw notes from that discussion. The points raised by Martin here should be considered last call comments. In these notes, we have not yet labeled issues to be formally addressed by the UA Working Group. People are encouraged to comment on these notes, in particular to add either I18N or accessibility perspectives. In particular, Max Nakane is invited to provide further information about these issues in the context of Japanese.

Thank you,

 - Ian

=============================================
Charset information
=============================================

IJ: If it affects everyone equally, then we have a convention not to include such requirements in such a document.

MD: In a US context, this isn't much of a problem. It's helpful to point out to user agent developers, however, that this is something that they should do.

MD: Here's why there's an impact on ATs:
 - The DOM representation is always UTF-16.
 - Therefore, the UA that constructs the DOM must do the right thing in order for the AT to have access to characters, not just bytes.

IJ: But if it's a DOM conformance issue, then we distribute that to the DOM by virtue of requiring conformance to it.

/* On APIs */

MD: Each API probably has its own specific character encoding. For instance, in Japan, there are three widely-used encodings. When the UA passes information through an API, it must ensure that the information is converted per the encoding rules of that API. [The DOM is a special case of this.]

IJ: I see, so even if the DOM case is covered by the conformance requirement, we do make more general API requirements where we would not be "covered".

Actions:
 - Talk to Masafumi about what happens when a UA doesn't get the proper character encoding information.

Possible solutions:
 - Add a checkpoint
 - Add examples / explanation in a Note after a checkpoint (Guidelines 1, 5?)
 - Add to the definition of API
 - Add to techniques
 - In the G5 prose, to "using standard communication channels for this exchange", add information about using the appropriate character encoding for these channels.
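[A minimal Python sketch of the conversion step MD describes: decode incoming bytes per the declared charset, then re-encode per the rules of the receiving API. The function name and the ISO-8859-1 fallback are illustrative assumptions, not taken from any particular UA or from the guidelines:]

  # Decode incoming bytes using the declared charset before exposing
  # the text through an API (the DOM's UTF-16 form is a special case).
  def decode_for_api(raw_bytes, declared_charset, api_encoding="utf-16"):
      try:
          text = raw_bytes.decode(declared_charset)
      except (LookupError, UnicodeDecodeError):
          # Unknown or wrong charset information: fall back rather than
          # passing raw bytes through to the AT.
          text = raw_bytes.decode("iso-8859-1", errors="replace")
      # Re-encode per the encoding rules of the receiving API.
      return text.encode(api_encoding)

  # Example: Shift_JIS bytes handed to a UTF-16 API.
  utf16_text = decode_for_api("日本語".encode("shift_jis"), "shift_jis")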
=============================================
4.12 Speech rate
=============================================

MD: Please do not specify the upper and lower bounds in terms of the English language!

MD: Plus, the concept of "word" is hard to define for, say, Japanese. (It may be easier to talk about syllables...)

IJ Proposes:
 - Make the range an example for English.
 - Suggest that upper and lower bounds vary according to language.

=============================================
Checkpoint 2.7
=============================================

MD: It's unclear what the boundaries of "recognized but unsupported" are:
 1) Doesn't understand lang="fr".
 2) Doesn't understand "fr" (or something like "qx", or a string that isn't bound to a language code). The set of language codes grows!
 3) Understands "fr" but doesn't have a French dictionary (for speech).

MD: If you know what "fr" means but don't have the resources to handle it (e.g., a dictionary), that's one thing (e.g., you can ask the user whether using the Italian dictionary you do have is OK). If the user agent doesn't know how to translate the code (e.g., "zh") to a language name that most users will be familiar with (and most users will not be familiar with language codes), then the UA won't help a user decide whether the information (e.g., in Chinese) is useful at all.

IJ: Note that in the case where the UA hands off rendering to a speech synthesizer, "recognize" could just mean "pass it off to some appropriate speech synthesizer".

MD: Another issue: if the purpose of 2.7 is to avoid confusion, then there's little difference between perfectly rendered Chinese to someone who doesn't understand Chinese, and garbage sounds or glyphs. "It's all Greek to me."

IJ: While there may not be a difference, I'm more confident saying that we are pretty sure that doing something one way will cause confusion, but I can't guarantee that in another case doing something right won't cause confusion.

MD: This actually points back to the charset issue again (to ensure that you don't get garbage).
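[A small Python illustration of MD's point above about translating codes (like "zh") to names the user can act on; the table and messages are hypothetical, not proposed checkpoint text:]

  # Map a language code to a name the user can act on, distinguishing
  # "code not recognized at all" from "recognized but unsupported".
  LANGUAGE_NAMES = {"fr": "French", "zh": "Chinese", "ja": "Japanese"}

  def describe_unsupported(lang_code):
      name = LANGUAGE_NAMES.get(lang_code)
      if name is None:
          # Case 2: the code itself is unknown (e.g., "qx").
          return "content in an unknown language (code '%s')" % lang_code
      # Case 3: the code is recognized, but no dictionary is installed.
      return "content in %s (no %s dictionary installed)" % (name, name)

  print(describe_unsupported("zh"))   # content in Chinese (...)
  print(describe_unsupported("qx"))   # content in an unknown language (...)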
=============================================
"Natural language"
=============================================

MD: For text-to-speech and text-to-braille, it's important that you know what language it is. For graphical rendering of text, the language of the text is only of minor importance. What counts in the latter case is the "script".

IJ: We've avoided the term "script" in this context to avoid confusion with "scripting languages".

MD: So 2.7 needs adjusting because you want to talk about speech (natural language) as well as graphical rendering (script).

MD: "Case" (e.g., uppercase) is a special case. If you translate to braille, you lose the distinction between katakana and hiragana.

Action MD:
 - Verify this with Max Nakane.

MD: For 7.5, I think that "script" is a more appropriate term.

Action:
 - Ask Glen Adams what term to use.

IJ: We could put two entries in for "script". But I need a good definition of script.

MD: Check out Unicode.

=============================================
Configurations based on natural language
=============================================

MD: The document should say that the user agent should allow configuration per script/natural language. Suppose you have a speech engine for French and one for English. The user will want to specify different speech rates and thus needs sufficient configuration.

MD: For braille text, a user may want to have contracted braille for English and expanded braille for German.

MD: For graphical display, you can imagine that people with visual disabilities will be more apt at guessing English at smaller sizes. For Japanese, for example, many people can figure out the text at 9-10 pixels, but this is inadequate to represent the characters fully. Visually impaired users, in a language they are familiar with, can probably put up with more fuzziness.

IJ: How do you prevent a combinatorial explosion of configurations (i.e., for every setting, N languages)?

MD: If the language information is provided in the source document, and if you use CSS2, such configurations are easy to do.

IJ: I would limit these requirements to text information. For instance, I don't think it's important to be able to say "I want the background to be blue when I'm viewing a French document".

MD: The most important information:
 - Text size
 - Speech rate
 - [For braille, contracted or expanded style.]

=============================================
Language information and heuristics
=============================================

MD: In most documents, despite WCAG 1.0, language information is not explicit.

IJ: This is a repair issue. Our document doesn't have many repair requirements.

MD: You may hand off content to a speech synthesizer saying that it's English, and the speech synthesizer may give it back and say "no, I don't think this is English". This works for graphical rendering on a script level, but doesn't work on a language level because the graphical renderer doesn't have a dictionary. Suppose the user knows French and English and has French and English speech synthesizers installed. If the document says lang="fr", the UA will pick the French speech synthesizer. Most documents will not say whether they are French or English, yet one speech synthesizer will do a much better job of rendering. It would be convenient (and probably not very difficult) to switch speech synthesizers automatically, without prompting the user.

IJ: I think that this is a convenience functionality: if the user can select dictionaries manually, then they have access. So clearly not impossible (thus not P1).

IJ, formulating MD's requirement. Three cases:
 - Author has specified natural language correctly.
 - Author has specified natural language incorrectly.
 - Author has not specified natural language.
In the latter two cases, the user must be able to select from among available dictionaries (e.g., to override choices by the author or user agent).

MD: In most cases, it's not necessary to specify natural language information.

MD:
 - Authors should identify the natural language of content.
 - User agents must allow users to select natural language (e.g., dictionary, different engine).
 - User agents may do automatic natural language selection when authors haven't specified it.
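[One way to picture IJ's three cases, as a Python sketch; the engine table, the detect_language hook, and the English fallback are illustrative assumptions, not requirements from the guidelines:]

  # Choose a speech engine from the author-supplied language, a user
  # override, or (when nothing is specified) an automatic guess.
  INSTALLED_ENGINES = {"fr": "french-engine", "en": "english-engine"}

  def choose_engine(author_lang=None, user_override=None, detect_language=None):
      if user_override in INSTALLED_ENGINES:
          # The user can always override an incorrect or missing
          # author-supplied language.
          return INSTALLED_ENGINES[user_override]
      if author_lang in INSTALLED_ENGINES:
          # Case 1: the author specified the language correctly.
          return INSTALLED_ENGINES[author_lang]
      if detect_language is not None:
          # Case 3: nothing specified -- the UA may guess automatically.
          guess = detect_language()
          if guess in INSTALLED_ENGINES:
              return INSTALLED_ENGINES[guess]
      # Otherwise fall back to a default and let the user switch manually.
      return INSTALLED_ENGINES["en"]

  print(choose_engine(author_lang="fr"))                      # author-specified
  print(choose_engine(detect_language=lambda: "fr"))          # automatic guess
  print(choose_engine(author_lang="fr", user_override="en"))  # user override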
=============================================
4.2 Font family
=============================================

MD: Suppose you want to say "Use Helvetica". This is problematic for multilingual documents. I think you have to be careful with the wording.

MD: Both NN and IE do the right thing. You don't want 4.2 to mean "Choose Helvetica, and if Helvetica doesn't work, give up."

MD: You should be able to say "Sans serif" and get something reasonable.

MD: "the font family of *all* text": you need to qualify "all". Not everything *has* to be in Helvetica; only what the user expects to be in Helvetica (e.g., English text).

Proposed:
 - Clarify that the UA can choose another font family if it can't find a way to represent a particular character.
 - Clarify that the user must be able to select a preferred font family, but that some content in a given document may not be rendered using that font family.

=============================================
7.5 Search
=============================================

MD: In English, you assume phonetic spelling. You assume that people search by inputting the spelling. People who listen to text-to-speech may not know the spelling. People who are used to contracted braille may not know the extended version. People in Japan might know the pronunciation of text, but not know which Kanji to use. If you use the "plain text" based search interface, this might not allow some people to find some information. I think this is very much a disability issue.

MD: You might want to search on a transformed version.

IJ: Our limits are:
 - Search for text characters in the source.
 - Search only on content that has been rendered (or will be, through a viewport).

MD: In Japanese, text is rendered graphically as Kanji. A Kanji can be read in different ways, and different Kanji can be read the same way. If someone knows shorthand braille, they want to search using the shorthand.

IJ: Our checkpoint doesn't talk about input methods - we only talk about the search functionality. How the user gets shorthand braille into the machine is not covered by 7.5.

IJ: I think the issue you raise is the same for other input situations such as form controls. I also think that this issue may be larger than one for users with disabilities: e.g., how Japanese users input Kanji.

=============================================
9.2 Input configuration
=============================================

MD: For Japanese and Chinese, input happens in two stages: the keys you press, and then the "IME" (input method editor). With some user interaction, this combination produces the final text.

IJ: From 1.2: "When available, developers should use APIs at a higher level of abstraction than the standard device APIs for the operating system."

MD:
 - Add a technique to 1.2 about higher-level APIs (after the IME) for character input.

IJ: What about single-key bindings (checkpoint 9.5) in Japanese?

MD: Similar situation to form controls: some keys will not be available since they have a particular function.

IJ: We cover different "modes" in 9.5.

MD: Alt-Shift may be used for switching input methods, for example.

IJ: That's covered by checkpoint 9.2.

MD: There are arguments for deleting "accessibility" (and leaving "conventions") in 9.2. For example, Alt-Tab to switch viewports is not clearly for accessibility, but should not be interfered with.

=============================================
Conformance
=============================================

MD: "Must implement the standard API for the keyboard".

MD: There may be two APIs:
 - Raw keys
 - Final text after the IME

MD: We had a DOM/I18N brainstorming session on this issue. Minutes available:

  http://www.w3.org/DOM/Group/meetings/m19991201.html
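[A rough Python sketch of the two keyboard API levels MD distinguishes; the class and method names are hypothetical and not tied to any real platform API:]

  # Level 1: raw key events.  Level 2: final text committed after IME
  # composition (e.g., Kanji chosen from a candidate list).
  class InputSource:
      def __init__(self):
          self.raw_key_listeners = []
          self.text_listeners = []

      def on_raw_key(self, callback):
          self.raw_key_listeners.append(callback)

      def on_text_committed(self, callback):
          self.text_listeners.append(callback)

      def key_pressed(self, key):
          for cb in self.raw_key_listeners:
              cb(key)

      def ime_committed(self, text):
          for cb in self.text_listeners:
              cb(text)

  source = InputSource()
  source.on_raw_key(lambda k: print("raw key:", k))
  source.on_text_committed(lambda t: print("final text:", t))
  source.key_pressed("n")
  source.key_pressed("i")
  source.ime_committed("日")   # text produced after IME composition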
--
Ian Jacobs (jacobs@w3.org)   http://www.w3.org/People/Jacobs
Tel:                         +1 831 457-2842
Cell:                        +1 917 450-8783

Received on Saturday, 11 November 2000 22:38:00 UTC