Re: UK Government Web Guidelines

From: David Woolley <david@djwhome.demon.co.uk>
Date: Sat, 2 Feb 2002 12:23:08 +0000 (GMT)
Message-Id: <200202021223.g12CN8Z21126@djwhome.demon.co.uk>
To: w3c-wai-ig@w3.org
> 
> This is from the guidelines section 2.73.

This is a section specifically on foreign language alternative material,
and, in particular, minority languages amongst immigrant communities.
I will respond to the rest of this in a UK context.

> 	A bilingual anchor page (in effect a table of contents) should index all

What they completely fail to mention, probably because there is general 
ignorance of it in the web design community, is language negotiation.

In the case of Welsh, and probably in the case of any other ISO Latin-1
character set language, I believe that a UK government web site with
alternative language material should offer the translated material 
unprompted if the user has specified that language as more favoured than
English and the material is a full translation, not just a summary. 
This can be done in full on Apache.  I don't think IIS allows one to 
score the quality of the document, so it might not be advisable to offer
a summary, as one couldn't distinguish between users who prefer Welsh,
but accept English and those who want Welsh only, so would give the former
the poor version of the document.
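
To make the point about scoring document quality concrete, here is a
minimal Python sketch.  It is not Apache's actual algorithm: it simply
multiplies the user's Accept-Language q value by a per-variant source
quality, similar in spirit to the qs value in an Apache type map.  The
function names and the 0.3 score for a summary are assumptions made up
for illustration.

```python
def parse_accept_language(header):
    """Return {language-tag: q} parsed from an Accept-Language value."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # a tag with no q parameter counts as q=1
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        prefs[tag.strip().lower()] = q
    return prefs


def choose_variant(header, variants):
    """Pick the language whose (user q * source quality) is highest.

    variants maps a language tag to a hypothetical source quality
    between 0 and 1.  Returns None if nothing is acceptable.
    """
    prefs = parse_accept_language(header)
    best, best_score = None, 0.0
    for lang, source_quality in variants.items():
        q = prefs.get(lang, prefs.get("*", 0.0))
        score = q * source_quality
        if score > best_score:
            best, best_score = lang, score
    return best


full = {"cy": 1.0, "en": 1.0}      # Welsh is a full translation
summary = {"cy": 0.3, "en": 1.0}   # Welsh is only a summary (assumed score)

print(choose_variant("cy, en;q=0.5", full))     # prefers Welsh, accepts English
print(choose_variant("cy, en;q=0.5", summary))  # same user, summary only
print(choose_variant("cy", summary))            # Welsh only
```

With the quality score, the user who prefers Welsh but accepts English
gets the full English document when the Welsh is only a summary, while
the Welsh-only user still gets the summary.  Without such a score the
two users cannot be told apart.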

I don't see why this should not also be done for other languages.

(I think Google does Welsh, but I'd be surprised if many Welsh have 
CY as their preferred language;  this is a chicken and egg problem -
authors won't use negotiation if users don't and users will only become
aware of it if authors use it.)

> your translated documents and hyperlink directly to each in the translated
> language and in English.
> 	a PDF version.

I think there is a flaw in their thinking here.  I think they are treating
PDF as though it is not the same as an image.  At least for the most recent
phone number reorganisations, the Gujarati and Urdu versions of the booklets
(an industry consortium rather than the government) were Word documents 
consisting of scanned images of the non-Latin text.  I have a feeling that
the Chinese version was like this as well.  Word allows embedded fonts.

I suspect they are thinking in terms of PDF as an image of the document, 
and not as something from which one can copy the underlying text.  For
Indic languages, even if the font were embedded, it would be a glyph-based,
not a character-based, font.  An embedded PDF font would allow scaling of
the document, for those with poor eyesight, although an over-resolution
bitmap would also allow that - and it does not have to be much
over-resolution, as the poor eyesight would destroy the excess resolution.

> 	The use of text in GIF It is important that when text cannot be published
> in standard HTML formats that it be made available as formats should be
> avoided. This would not be available if the browser had the graphics was

Although the original is somewhat gibberished, this misses out parts of the
text.  This is the original (with spaces deleted by Word and the PostScript
driver re-inserted, as I'm using ghostscript, not Acrobat):

    (bullet mark) It is important that when text cannot be published in
    standard HTML formats that it be made available as a PDF version.

    (bullet mark) The use of text in GIF formats should be avoided.
    This would not be available if the browser had the graphics was
    turned off, the exception for GIFs being text navigation buttons in
    scripts, such as Bengali.  The "alt" attribute should be in English
    and identify the language being used.

This makes the same confusion between PDFs of an image and direct images.

However, it raises some interesting questions.  I think it is likely that
the browsers for the target audiences for these translations will not
be their own browsers, but either public service browsers, or those of
their children, who may be much more comfortable in English than in the
translated language, and may not have installed whatever language support
is available.

Modern HTML (i.e. the last 4 years) is capable of handling even Indic
languages properly, even if browsers cannot, so there is an argument for
putting the alt text in the actual language.  This requires that the 
people who set the text to graphics also understand Unicode, which is
probably not true.
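
As a rough sketch of what putting the alt text in the actual language
involves, the following Python writes an img element whose alt text is
the Bengali word for "Bengali", emitted as numeric character references
so that the page itself can stay in plain ASCII.  The file name
bengali.gif and the function name are hypothetical.

```python
from html import escape


def language_button(src, alt_text, lang):
    """Build an <img> tag with alt text in the target language.

    Non-ASCII characters are written as numeric character references,
    which HTML 4 defines in terms of Unicode code points.
    """
    safe_alt = (escape(alt_text, quote=True)
                .encode("ascii", "xmlcharrefreplace")
                .decode("ascii"))
    return '<img src="%s" alt="%s" lang="%s">' % (
        escape(src, quote=True), safe_alt, escape(lang, quote=True))


# The word "Bangla" in Bengali script, given as Unicode escapes.
bengali = "\u09ac\u09be\u0982\u09b2\u09be"
print(language_button("bengali.gif", bengali, "bn"))
# prints <img src="bengali.gif" alt="&#2476;&#2494;&#2434;&#2482;&#2494;" lang="bn">
```

Whether a given browser can then display that alt text is a separate
matter.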

However, if the text is just the language name, a younger relative can
probably recognize the graphic version.

If the text is more than the language name, I would suggest that using
a Romanised version of the actual language is better than English in 
almost all cases - even a library staff member's broken pronunciation may 
be easier to understand than English, and I suspect the English version
is intended as an aid to library, etc., staff.

I don't know about text to speech in Indic languages, although I suspect
it doesn't really exist (text to speech in Chinese does exist), and the
only browser that might handle Indic languages in visual text is IE6,
but only if a font that requires a licence for MS Word or Publisher
(Arial Unicode) is installed (see www-style thread about Punjabi
not being displayed on tool tips).  Consequently, I would say that,
although there may be a case for having both the romanised and normal
script versions, I don't think it is reasonable to just have the normal
script, at the moment.

The only justification for English would seem to be for the benefit of
public servants assisting the person from the real audience.  Maybe
they really need two language index pages:

- a self help page (and this is a case where an icon might be advisable
as a link to that page);

- a page for English speaking librarians and social service staff to
find documents on behalf of other people.

> 	Navigation on these pages should be in both English and the translated
> language.

I think that this is for the benefit of people outside the community who
are providing assistance in using the browser.  On the other hand, if
that help is needed, the page design is probably bad even in English as
it implies that someone whose first language is English would need
help as well.

> /webguidelines/fr-index.htm (index to French language anchor page) or
> /webguidlelines/urdu-index.htm (index to Urdu language anchor page).

This will drop out automatically, although in a slightly different form,
if they use language negotiation.

I had a few other points on parts of this document that weren't quoted,
but the only one probably worth pointing out here is that there is
no point in using meta http-equiv to set a character set if the server
is explicitly setting one.  (There is a Japanese/Oriental shopping centre
in North West London which has charset=shift-jis in the meta element on its
web page, but has charset=iso-8859-1 in the HTTP headers, so I get
gibberish rather than unknown character boxes.)
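
A small Python sketch of why that happens, assuming the usual rule that
a charset in the HTTP Content-Type header takes precedence over one in a
meta http-equiv element.  The function names, and the iso-8859-1
fallback when neither is present, are assumptions for illustration.

```python
def charset_from_content_type(value):
    """Extract the charset parameter from a Content-Type style value."""
    for param in value.split(";")[1:]:
        name, _, val = param.strip().partition("=")
        if name.strip().lower() == "charset" and val:
            return val.strip().strip('"').lower()
    return None


def effective_charset(http_content_type, meta_content):
    """The HTTP header wins; the meta element is only a fallback."""
    return (charset_from_content_type(http_content_type or "")
            or charset_from_content_type(meta_content or "")
            or "iso-8859-1")  # assumed default


# Three kanji, sent as Shift_JIS bytes by the author of the page.
text = "\u65e5\u672c\u8a9e"
raw = text.encode("shift_jis")

# The server says iso-8859-1, the meta element says shift-jis.
charset = effective_charset("text/html; charset=iso-8859-1",
                            "text/html; charset=shift-jis")
garbled = raw.decode(charset)

print(charset)          # the charset the browser actually uses
print(garbled == text)  # False: the bytes decode to gibberish
```

Every byte is a valid ISO 8859-1 character, so nothing is flagged as
unknown; the text simply comes out wrong.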
Received on Saturday, 2 February 2002 07:23:57 GMT
