- From: David Woolley <david@djwhome.demon.co.uk>
- Date: Sat, 2 Feb 2002 12:23:08 +0000 (GMT)
- To: w3c-wai-ig@w3.org
> >

This is from the guidelines section 2.73. This is a section specifically on foreign language alternative material, and, in particular, minority languages amongst immigrant communities. I will respond to the rest of this in a UK context.

> · A bilingual anchor page (in effect a table of contents) should index all

What they completely fail to mention, probably because there is general ignorance of it in the web design community, is language negotiation. In the case of Welsh, and probably in the case of any other language covered by the ISO Latin-1 character set, I believe that a UK government web site with alternative language material should offer the translated material unprompted if the user has specified that language as more favoured than English and the material is a full translation, not just a summary.

This can be done in full on Apache. I don't think IIS allows one to score the quality of the document, so on IIS it might not be advisable to offer a summary: one couldn't distinguish between users who prefer Welsh but accept English and those who want Welsh only, so one would give the former the poor version of the document.

I don't see why this should not also be done for other languages. (I think Google does Welsh, but I'd be surprised if many Welsh have CY as their preferred language; this is a chicken and egg problem - authors won't use negotiation if users don't, and users will only become aware of it if authors use it.)

> your translated documents and hyperlink directly to each in the translated
> language and in English.
> · a PDF version.

I think there is a flaw in their thinking here. I think they are treating PDF as though it is not the same as an image. At least for the most recent phone number reorganisations, the Gujarati and Urdu versions of the booklets (from an industry consortium rather than the government) were Word documents consisting of scanned images of the non-Latin text. I have a feeling that the Chinese version was like this as well. Word allows embedded fonts.
I suspect they are thinking in terms of PDF as an image of the document, and not as something from which one can copy the underlying text. For Indic languages, even if the font were embedded, it would be a glyph-based, not a character-based, font. An embedded PDF font would allow scaling of the document for those with poor eyesight, although an over-resolved bitmap would also allow that - and it does not have to be much over-resolved, as the poor eyesight would destroy the excess resolution.

> · The use of text in GIF It is important that when text cannot be published
> in standard HTML formats that it be made available as formats should be
> avoided. This would not be available if the browser had the graphics was

Although the original is somewhat gibberished, this misses out parts of the text. This is the original (with spaces deleted by Word and the PostScript driver re-inserted, as I'm using ghostscript, not Acrobat):

  (bullet mark) It is important that when text cannot be published in standard HTML formats that it be made available as a PDF version.

  (bullet mark) The use of text in GIF formats should be avoided. This would not be available if the browser had the graphics was turned off, the exception for GIFs being text navigation buttons in scripts, such as Bengali. The "alt" attribute should be in English and identify the language being used.

This shows the confusion between PDFs of an image and direct images. However, it raises some interesting questions.

I think it is likely that the browsers used by the target audiences for these translations will not be their own browsers, but either public service browsers, or those of their children, who may be much more comfortable in English than in their own language, and may not have installed whatever language support is available. Modern HTML (i.e. that of the last 4 years) is capable of handling even Indic languages properly, even if browsers cannot, so there is an argument for putting the alt text in the actual language.
This requires that the people who turn the text into graphics also understand Unicode, which is probably not true. However, if the text is just the language name, a younger relative can probably recognize the graphic version. If the text is more than the language name, I would suggest that using a Romanised version of the actual language is better than English in almost all cases - even a library staff member's broken pronunciation may be easier to understand than English, and I suspect the English version is intended as an aid to library, etc., staff.

I don't know about text to speech in Indic languages, although I suspect it doesn't really exist (text to speech in Chinese does exist), and the only browser that might handle Indic languages in visual text is IE6, but only if a font that requires a licence for MS Word or Publisher (Arial Unicode) is installed (see the www-styles thread about Punjabi not being displayed on tool tips). Consequently, I would say that, although there may be a case for having both the Romanised and normal script versions, I don't think it is reasonable to have just the normal script, at the moment.

The only justification for English would seem to be for the benefit of public servants assisting the person from the real audience. Maybe they really need two language index pages:

- a self help page (and this is a case where an icon might be advisable as a link to that page);
- a page for English speaking librarians and social service staff to find documents on behalf of other people.

> · Navigation on these pages should be in both English and the translated
> language.

I think that this is for the benefit of people outside the community who are providing assistance in using the browser. On the other hand, if that help is needed, the page design is probably bad even in English, as it implies that someone whose first language is English would need help as well.
> /webguidelines/fr-index.htm (index to French language anchor page) or
> /webguidlelines/urdu-index.htm (index to Urdu language anchor page).

This will drop out automatically, although in a slightly different form, if they use language negotiation.

I had a few other points on parts of this document that weren't quoted, but the only one probably worth pointing out here is that there is no point in using meta http-equiv to set a character set if the server is explicitly setting one. (There is a Japanese/Oriental shopping centre in North West London which has charset=shift-jis in the meta element on its web page, but has charset=iso-8859-1 in the HTTP headers, so I get gibberish rather than unknown character boxes.)
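As a rough illustration of the two server-side points above, here is a minimal Python sketch. It is not taken from the guidelines, and it is not the exact algorithm that Apache uses; the function names (pick_variant, effective_charset) and the source quality values are hypothetical. The first part multiplies the user's Accept-Language preference by an author-assigned quality for each variant, which is a simplification of the idea of scoring document quality. The second part reflects the rule that a charset in the HTTP Content-Type header takes precedence over one in a meta element.

```python
import re


def parse_accept_language(header):
    """Return {language_tag: q} from an Accept-Language header value."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag with no q parameter counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q value as "not acceptable"
        prefs[tag.strip().lower()] = q
    return prefs


def pick_variant(header, variants):
    """Pick a language variant of a document.

    `variants` maps a language tag to a source quality between 0 and 1
    (hypothetical values, e.g. 1.0 for a full translation and 0.3 for a
    summary). The score is user preference times source quality.
    Simplification: tags are matched exactly, with no prefix matching.
    Returns None when no variant is acceptable to the user.
    """
    prefs = parse_accept_language(header)
    best, best_score = None, 0.0
    for lang, source_quality in variants.items():
        q = prefs.get(lang, prefs.get("*", 0.0))
        score = q * source_quality
        if score > best_score:
            best, best_score = lang, score
    return best


def effective_charset(content_type_header, meta_charset):
    """Return the charset a browser should use.

    A charset parameter in the HTTP Content-Type header wins; the meta
    element is only a fallback when the header does not set one.
    """
    match = re.search(r"charset=([^\s;]+)", content_type_header or "", re.I)
    if match:
        return match.group(1).strip('"').lower()
    return meta_charset.lower() if meta_charset else None
```

With these assumed quality values, a user sending "cy, en;q=0.5" gets the Welsh document when it is a full translation (quality 1.0) but the English one when the Welsh version is only a summary (quality 0.3), while a user sending just "cy" still gets the Welsh summary. For the shopping centre example, effective_charset("text/html; charset=iso-8859-1", "shift-jis") returns "iso-8859-1", which is why the meta element has no effect there.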
Received on Saturday, 2 February 2002 07:23:57 UTC