20 June 2002 - WCAG WG Teleconference Minutes

Present

Loretta Guarino Reid
Eugenia Slaydon
Jonathan O'Donnell
Avi
Wendy Chisholm
Jason White
Cynthia Shelley
Gregg Vanderheiden
Lisa Seeman
Ben Caldwell
Paul Bowman
Matt May
Ben

Regrets

Andi Snow-Weaver
Chris O'Kennon
Lee Roberts
Doyle Burnett

Action Items

WC: Check with Internationalization Group about appropriate character encoding to require for Checkpoint 1.5
LS: Provide examples of Hebrew that require diacritic marks to satisfy Checkpoint 1.5
CS, LS: Find out about accuracy of Hebrew disambiguation tools.

Checkpoint 1.5

JW: Which character set should be required? Unicode or the W3C character model is the most probable choice.
GV: The W3C character model, which is based on Unicode.
Action item: Wendy to check with Internationalization Group.
WC: First issue - the phrasing of success criterion 1 at the minimal level seems awkward; will everyone know what it means to map to a character encoding?
GW: maybe just say "follows" instead of "maps back to"
CS: included in character set, not mapped to character set
LS: there were 2 things that should go here or to 4.3. The first is vowels for bidirectional languages or any language that can omit vowels. Screen readers mispronounce such words.
GV: If the vowels are needed to unambiguously decode the word, they are covered by the requirement
LS: I didn't read this into the statement.
GV: "To unambiguously decode the words", not the letters. So if removing the vowels makes the word ambiguous, they would be required.
LS: What level would that be? level 1 just requires character mapping
GV: level 2
LS: I don't remember seeing it there.
GV: Want to leave room in case someone can come up with an algorithm that would make it possible to recognize words unambiguously.
LS: Maybe we need this explanation as a footnote
GV: or an example.
GV: if you take the vowels out of words, you are abbreviating them.
GV: It is missing from the success criteria. Can someone suggest how to capture it? If there is a contraction which is ambiguous, it must be marked so it isn't ambiguous.
LS: But it isn't something that people think of as an abbreviation
GV: It's a contraction. "Don't" is a contraction, but not an abbreviation
JO: This is the way you would naturally write the word.
GV: The vowels are written by having marks. It isn't a contraction in that it isn't removing letters, but removing marks.
LS: I'm afraid people won't think of this as being contractions
CS: It would be like leaving off accent marks.
GV: Level 2 checkpoint: "If words are ambiguous because of diacritic marks .."
JO: "If pronunciation is ambiguous..."
GV: But "saw" is pronounced differently in Boston than in California.
GV: Diacritic marks necessary for unambiguous interpretation ...
WC: Since people won't know what diacritic means, should we use the definition instead?
A?: "Include any marks or symbols needed to represent the word unambiguously."
GV: Any symbols such as diacritic marks necessary for interpreting a word unambiguously are present or another standard mechanism for disambiguation is provided.
CS: Put "diacritic marks" in the glossary
GV: We should include some Hebrew in the examples
??: How about "resume" and "resume"?
WC: "Please resume revising my resume"
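As a rough illustration of the ambiguity under discussion, the minimal Python sketch below strips combining marks from text and shows that distinct words collapse to the same spelling. The accented form of "resume" and the Hebrew word pair are assumptions chosen for the example; they were not supplied at the meeting, and strip_marks is a hypothetical helper name.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Remove combining marks (Unicode category Mn) from text."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

# In the spirit of the "resume" example: the accented noun collapses to the verb.
print(strip_marks("r\u00e9sum\u00e9"))  # resume

# Hebrew pair (an assumption, not an example provided at the meeting).
sefer = "\u05e1\u05b5\u05e4\u05b6\u05e8"        # samekh+tsere, pe+segol, resh: "book"
sapar = "\u05e1\u05b7\u05e4\u05bc\u05b8\u05e8"  # samekh+patah, pe+dagesh+qamats, resh: "barber"

# With vowel marks the two words differ; without them they are identical.
print(sefer != sapar)
print(strip_marks(sefer) == strip_marks(sapar))
```

Once the marks are gone, a screen reader or a weak reader has only the three consonants to work with, which is the situation Lisa describes.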
LS: There are all sorts of different reasons this is important. Screen reader is one case. Another is for cognition. If you don't have a visual memory and are a weak reader, you will have a hard time sounding out the words.
GV: If you aren't using assistive technologies, you would have to run all pages through a transcoder.
LS: I wouldn't actually know there was another possible interpretation.
CS: It seems like there should be the technology available to do the disambiguation. Perhaps in the translation community.
GV: That is something we should look into, to see whether translation technology can make this checkpoint moot. But as it stands, in that case the checkpoint would always be met.
LS: In the advantages we should include screen readers and human readers with some cognitive disabilities.
CS: No visual memory, or just learning a language. e.g., in the Russian books I've been studying, those for Americans and small children have the diacritic marks.
LS: Why is this at level 2?
GV: Trying to avoid level 1 items that can't be met by anyone who tries. If every page in Israel has to be changed to add vowel marks, it is likely that the guidelines will be rejected completely.
LS: Most of the pages in Israel don't include the vowel marks. They all need to be changed to make them accessible. I hadn't known that we were using this criterion for setting levels.
CS: In Russian, after grade 2 the marks are taken off and it is considered bad writing to use them
LS: Something similar happens with Hebrew.
GV: But it is considered non-standard usage to put them in. It would require a cultural change, and we don't want requirements at level 1 with this kind of impact. We have to be careful with level 1 criteria.
GV: If we have evidence this is not onerous, we could move it up. But that isn't the impression I was getting.
LS: I hadn't been aware we were formalizing that in setting levels.
CS: "Wide applicability" is what this was called.
GV: We tried not to do body counts, that is, level 1 is lots of people and level 2 is not so many people. We also couldn't use strict accessibility, since all level 3 items are required for some people.
LS: I'm very uncomfortable with all the prioritizing. But I think the success criterion is a good one, and I'm happy with the checkpoint. Priority is another discussion.
GV: We wrestle with every item that isn't level 1.
LS: To make Israel accessible, it has to happen.
GV: It would be great to find or fund a disambiguation tool.
LS: The question is what the accuracy would be.

JW: The first criterion at level 1 should be divided into 2 clauses: either provide the characters in Unicode, or provide the mapping explicitly in a standard way. The W3C character model is Unicode with certain restrictions on use; we probably don't want those restrictions on content. For example, some characters can be represented in Unicode either as a single code point or as a sequence that uses combining characters. The W3C character model says which must be used.
CS: Using Unicode instead of W3C doesn't have an effect on whether text is accessible.
GV: Jason is saying it is unambiguous. It does mean your screen reader would need to support interpreting both representations.
JW: It is possible to do an automatic translation between the representations without any loss.
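The two representations Jason describes, and the lossless conversion between them, can be sketched with Python's standard unicodedata module. The sample character is an assumption chosen for illustration; it was not discussed at the meeting.

```python
import unicodedata

# "e with acute" as one precomposed code point (U+00E9).
composed = "\u00e9"
# The same character as "e" followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "e\u0301"

# Compared code point by code point, the two strings are different.
print(composed == decomposed)
print(len(composed), len(decomposed))

# Unicode normalization converts each form into the other without loss.
to_composed = unicodedata.normalize("NFC", decomposed)
to_decomposed = unicodedata.normalize("NFD", composed)
print(to_composed == composed)
print(to_decomposed == decomposed)
```

A screen reader that normalizes its input first would not need to care which of the two forms the author used.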
CS: That might be a particular alphabet's code page mapping, or Unicode, or W3C character model, or... All these work in the real world and could be handled by assistive technology.
WC: The W3C character model is Unicode.
GV: Jason, you are saying that it should say "text in the content must be in Unicode or an explicit mapping to Unicode must be provided"
JW: If there is an encoding that is a standard that has a mapping to Unicode, that should satisfy.
GV: Unicode is double byte and ASCII is single byte. Would it satisfy? Can you use single-byte characters and call it Unicode?
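One way to look at this question: Unicode has several encoding forms, and in UTF-8 the ASCII range is stored as single bytes identical to ASCII. The following minimal Python sketch shows this with sample strings assumed for illustration.

```python
text = "WCAG"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# ASCII text encoded as UTF-8 is byte-for-byte the same as plain ASCII.
print(ascii_bytes == utf8_bytes)

# UTF-16 uses two bytes for each of these characters; UTF-8 uses one.
print(len(utf8_bytes), len(utf16_bytes))

# A Hebrew letter (shin, U+05E9) takes two bytes in either encoding form.
shin = "\u05e9"
print(len(shin.encode("utf-8")), len(shin.encode("utf-16-le")))
```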
CS: There are existing mappings.
GV: How will the screen reader know what mapping to use?
CS: The OS usually provides this information.
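To show why the choice of mapping matters, this minimal Python sketch decodes the same bytes with two different code pages. The sample word and the two code pages (cp1255 for Hebrew, cp1252 for Western European text) are assumptions chosen for the example.

```python
# Hebrew "shalom", used only as sample text.
word = "\u05e9\u05dc\u05d5\u05dd"

# Store the word in a single-byte Hebrew code page: one byte per letter.
raw = word.encode("cp1255")
print(len(raw))

# Decoding with the matching code page recovers the Unicode text.
as_hebrew = raw.decode("cp1255")
print(as_hebrew == word)

# Decoding the same bytes with a Western code page yields unrelated letters.
as_latin = raw.decode("cp1252")
print(as_latin == word)
```

The bytes alone do not say which table applies, so something outside them (the OS, the user agent, or a declaration in the content) has to supply it.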
JW: There is an issue, a potential ambiguity. If the content has to be provided in a particular character set, that is one thing. But if there is mapping, it has to be provided. The content requirement would be that the mapping be there or that the User Agent can do the mapping.
CS: Need a meta-tag for declaring what code page you use. Then the user agent knows what to do with it.
LGR: Need the encoding explicitly identified in the content.
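A minimal sketch of how a user agent might use such a declaration, assuming an HTML page with a meta element and using Python's standard html.parser module. The names CharsetSniffer and decode_page, the 1024-byte prescan, and the ASCII fallback are assumptions made for this example and do not describe any particular browser or screen reader.

```python
from html.parser import HTMLParser

class CharsetSniffer(HTMLParser):
    """Record the first character encoding declared in a meta element."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            self.charset = attrs["charset"].strip()
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Look for "charset=..." inside e.g. "text/html; charset=windows-1255".
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value.strip():
                    self.charset = value.strip()

def decode_page(raw: bytes, default: str = "ascii"):
    """Decode raw page bytes using the declared encoding, else the default."""
    sniffer = CharsetSniffer()
    # Markup is ASCII-compatible in these encodings, so prescan it as ASCII.
    sniffer.feed(raw[:1024].decode("ascii", errors="replace"))
    charset = sniffer.charset or default
    try:
        return raw.decode(charset, errors="replace"), charset
    except LookupError:
        # Unknown encoding name: fall back to the default.
        return raw.decode(default, errors="replace"), default

page = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=windows-1255"></head>'
    "<body>\u05e9\u05dc\u05d5\u05dd</body></html>"
).encode("cp1255")

text, used = decode_page(page)
print(used)
print("\u05e9\u05dc\u05d5\u05dd" in text)
```

Without the declaration, the same function falls back to its default and the Hebrew letters are lost, which is the failure the explicit identification is meant to prevent.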
CS: Why is declaring the natural language a level 3? It is trivially easy and solves this problem.
GV: There are sites that are otherwise accessible but don't have that info, and that are archival. Also, there are Word documents that don't necessarily have this. But what does this really help? Usually you can tell what language by looking at the page.
JW: Language and character sets are separate issues. But I can rework success criterion 1.
WC: The Character Model has a lot of this in there. Why don't we just use it?
??: And the Character Model will be required of W3C standards.
CS: There may be situations where it is not appropriate.
GV: But this is a level 1, so we have to be careful we don't wipe out all PDF, for example.
LGR: Are all code pages Unicode subsets?
GV: Microsoft is setting things up to map to Unicode. Because it is a level 1, it worries me.
CS: Why don't we put it in as it currently stands, with an open question to refine this.
GW: Text in the content must be Unicode or automatically mapped back to Unicode. Then define what automatically means. Net effect is screen reader sees Unicode. User Agent can do this, OS can do this, etc.
JW: Would a person writing an authoring tool read that as meaning they need to provide the character map? They should.
GV: If the Reader doesn't already do the mapping.
LGR: It depends on the encoding used for a font. Some standard encodings can be converted to Unicode automatically. For other encodings, a ToUnicode table is required.
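For encodings that cannot be converted automatically, the idea of an explicit table can be sketched as follows. This is a much simplified stand-in, not the actual syntax of a ToUnicode table; the glyph codes and the names FONT_TO_UNICODE and glyphs_to_text are hypothetical.

```python
# Hypothetical font-specific glyph codes mapped to Unicode characters.
FONT_TO_UNICODE = {
    0x01: "\u05e9",  # shin
    0x02: "\u05dc",  # lamed
    0x03: "\u05d5",  # vav
    0x04: "\u05dd",  # final mem
}

def glyphs_to_text(codes, table):
    """Map glyph codes to Unicode; codes missing from the table become U+FFFD."""
    return "".join(table.get(code, "\ufffd") for code in codes)

# With the table, the glyph codes can be turned back into Unicode text.
print(glyphs_to_text([0x01, 0x02, 0x03, 0x04], FONT_TO_UNICODE) == "\u05e9\u05dc\u05d5\u05dd")

# A code the author did not map cannot be recovered.
print(glyphs_to_text([0x01, 0x7F], FONT_TO_UNICODE) == "\u05e9\ufffd")
```

Only the author or the authoring tool knows what each font-specific code stands for, which is why the table has to travel with the content.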
JW: Just be sure that the author understands he is responsible to provide the information, whether it is the mapping or the information necessary to generate the mapping automatically. I don't think the clause after the "or" makes it clear that the author needs to provide the necessary information.
GV: "information is provided so that it is automatically mapped."
CS: What that information is will vary by technology and language. In HTML, if the page is in ASCII, nothing additional needs to be provided.
LS: Another question: are we asking for unambiguous decoding or unambiguous meaning? Does meaning belong here?
JW: That is guideline 4.
GV: That's understanding: what the person is trying to convey with the word (as opposed to understanding what word he is using). But have we just ruled out English, e.g., "read" vs. "read"?
CS: Perhaps we need to put something in about whether the language supports it. English is full of different words that are spelled the same way.
GV: Add the word "standard". "Standard symbols". "Standard usage symbols". Then we need to say what standard usage is. But then Israel will say standard usage is not to use them.
LS: "Commonly understood"? Everyone in Israel knows what those marks mean.
GV: "Common usage" symbols
??: "Common symbols"
GV: "Symbols such as diacritic marks that are commonly used"
CS: adding diacritic marks to distinguish read and read will reduce readability for most people.
LS: "Common" instead of "common usage"? Screen readers are very expensive because they do include the language analysis tools, but they are only 80% accurate and very slow and very expensive. If you want a country like Israel to be accessible, this is a requirement.
GV: Israel could require level 2 on this item. But putting it in level 1 imposes it on every country. Let's see whether language translation work covers this.
JW: "Which is common for the language"
??: "commonly used and necessary"
JW: "commonly used in the language of the content and necessary"
LS: "commonly understood"
??: "that are found in common usage"
??: "found in standard usage"
GV: will update the checkpoint in the document
JW: Anyone not at the meeting who has comments on 2.1 should post them to the mailing list.

$Date: 2002/06/20 21:45:12 $ Loretta Guarino Reid