Re: I18N review comments on UAAG 1.0 RESEND from Martin Duerst on 2002-09-30 (w3c-wai-ua@w3.org from July to September 2002)

From: Martin Duerst <duerst@w3.org>
Date: Mon, 30 Sep 2002 17:54:18 +0900
To: "Ian B. Jacobs" <ij@w3.org>, ishida@w3.org
Cc: w3c-wai-ua@w3.org, w3c-i18n-ig@w3.org
Message-Id: <4.2.0.58.J.20020930162632.040f7118@localhost>
Hello Ian,

Some more discussion.

At 23:28 02/09/25 -0700, Ian B. Jacobs wrote:


>Richard Ishida wrote:
> > I am resending this note with corrections made to the numbering at the end.
> > The content is unchanged.
> > RI
> > ============================================
> > Please find enclosed the last call comments from the
> > Internationalization WG
> > on the User Agent Accessibility Guidelines 1.0. Version reviewed
> > was 21 August 2002
>
>Richard,
>
>Thanks to you and the I18N WG for reviewing the document and sending
>comments. Some replies below.
>
>   - Ian
>
>
> > #i18n-1:
> > Checkpoint 2.10, checkpoint provision 1
> > The heading talks about 'language' whereas the checkpoint
> > provision talks about
> > 'scripts (ie. writing systems)'.  Both the title and text should be
> > changed to 'language or script', to cover both the visual rendering
> > case and the text-to-speech (or -to-braille) case.
>
>
>That seems ok to me.
>
>I note that the checkpoint used to include requirements that applied
>to pre-recorded audio, but we removed them. Our primary use case for
>this checkpoint is to turn off the visual rendering of meaningless
>characters.

I think this is quite a bit more important for text-to-speech than
for visual rendering, because of the linearity of speech, and
therefore the time lost having to listen to nonsense.

[this is also clearly expressed in the following note already in
the document:
Note: This checkpoint is designed primarily to benefit users with serial 
access to content or who navigate sequentially, allowing them to skip 
portions of content that would be unusable if rendered as "garbage".
]


> > #i18n-2:
> > Checkpoint 2.10, checkpoint provision 1
> > Is it clear enough how one would know that text is in an 'unsupported
> > script' or language?  Whether or not something can be rendered would
> > presumably depend
> > on the capabilities of the application in a given modality, eg.
> > font availability in a visual modality (without necessarily a
> > requirement to understand the underlying semantics if this is a
> > visual illustration); recognisability
> > of text (words) in a text-to-speech modality (without necessarily a
> > requirement to be able to display the text).
>
>Do you consider the information in the Techniques Document
>sufficient? What text would you add there?

Some of the proposals in the techniques doc go beyond what I would
suggest (see below). Also, the techniques doc says:

"a character has a value that may not be expressed in the user
agent's internal character encoding"

This sounds somewhat outdated. Although HTML 4.0 did not explicitly
require a User Agent to use Unicode, more and more browsers use it,
and for XHTML, there is no way around it.

Another point:
"HTTP headers provide information about content encoding ("Content-Encoding")"

Content-Encoding is about compression such as gzip, and looks
out of place here.
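
To make the distinction concrete, here is a minimal sketch in Python. It
assumes the response headers are already available as a plain dictionary;
the function name charset_and_compression is made up for illustration and
comes from neither the Techniques document nor any user agent. The
character encoding is taken from the charset parameter of Content-Type,
while Content-Encoding only names a transfer coding such as gzip.

```python
from email.message import Message


def charset_and_compression(headers):
    """Return (character encoding, content coding) from HTTP-style headers.

    `headers` is assumed to be a dict of header name -> value.
    """
    msg = Message()
    for name, value in headers.items():
        msg[name] = value
    # The character encoding is the charset parameter of Content-Type.
    charset = msg.get_content_charset()
    # Content-Encoding names a coding such as gzip, not a charset.
    compression = msg.get("Content-Encoding")
    return charset, compression


if __name__ == "__main__":
    example = {
        "Content-Type": "text/html; charset=Shift_JIS",
        "Content-Encoding": "gzip",
    }
    print(charset_and_compression(example))
```

A user agent deciding whether it can render some text would look at the
first value; the second only tells it how to decompress the bytes.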


> > Detection of an unsupported script or language would presumably be
> > significantly aided by recognition of markup indicating a
> > language, or recognition of a range of Unicode code points (eg.
> > the set of Latin characters used in Welsh or African languages)
> > that are known not to be supported. Perhaps, therefore, it would
> > be worthwhile to add another requirement along the lines of: "Ensure
> > recognition of any cues provided
> > in markup relating to a change of language or script." Examples
> > would include xml:lang in XHTML, :lang in CSS, lang in HTML, etc.
>
>I think that falls under checkpoint 2.1: "Render content according
>to specification."
>
> > Note that there is no markup at the moment in xml or html that
> > indicates a change of script, and there may never be.  The text
> > 'or script' was included above to cover any possibility of such a
> > thing occurring in a future implementation, given the assumption
> > that the guidelines are also aimed at people developing new
> > technologies.
>
>Ok.

In my personal opinion, the chances of a 'script' attribute
are really small, because script can easily be inferred from
the actual character codes.
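
As a rough illustration of inferring script from character codes, the
following Python sketch uses the first word of each letter's Unicode
character name as a stand-in for its script. This is only a heuristic
chosen for brevity: it is not the Unicode Script property, and the
function name rough_scripts is an assumption made for this example.

```python
import unicodedata


def rough_scripts(text):
    """Return the set of script-like labels for the letters in `text`.

    Heuristic: the first word of the Unicode character name, e.g.
    'LATIN SMALL LETTER A' -> 'LATIN'. Non-letters are ignored.
    """
    scripts = set()
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


if __name__ == "__main__":
    print(rough_scripts("Hello"))         # Latin letters
    print(rough_scripts("\u041c\u043e"))  # Cyrillic letters
    print(rough_scripts("\u182e\u1823"))  # Mongolian-script letters
```

A real implementation would more likely consult the Unicode script data
directly, but the point stands that no markup is needed for this.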


> > #i18n-3:
> > Checkpoint 2.10, checkpoint provision 2
> > It may be helpful for the user to append "because it is not in a
> > supported  language or script (i.e. writing system)" to the end of
> > this sentence (ie.  the UA should indicate the reason that the
> > text was lost) if one can assume  that the user agent knows that
> > it is because the text is in an unmanageable language or script.
>
>Yes, I agree that would be clearer.

I think having some way to indicate why the text was lost
is a good idea. But are there other ways the text can be
lost? Also, we should be careful that this does not imply
that the actual explanation has to be given. For example,
it would be a bad idea if a text-to-speech converter
spoke 'missing text here because it is not in a
supported language'; what we would want to aim at is
that a special beep (or bell, or whatever other sound)
would indicate that there is such missing text, and that
another sound would be used for other cases.
(Please note that I removed 'script' because it's
largely irrelevant, except for some very special cases,
e.g. a Mongolian text-to-speech renderer that can deal
with Mongolian in the Cyrillic script, but not in the
Mongolian script.)



> > #i18n-4:
> > Checkpoint 4.1, Sufficient technique
> > Suggest: "render text at 36 points" -> "render Latin text at 36
> > points".  Reason: rendering Chinese or Arabic fonts at 36 points
> > may not produce the same degree of clarity as rendering Latin text
> >  at that size, and different settings may be more appropriate.
>
>Ok.

The text currently says "to configure a reference size for rendered
text (e.g., render text at 36 points unless otherwise specified)".

I think this is appropriate in most cases. It is definitely
true that different scripts can require different font sizes
for the same 'clarity'. But this can also be said of different
fonts for the same script.
The average user is probably more confused by having 'Latin'
turn up in a setting unless s/he is working already at a script-
specific level.



> > #i18n-5:
> > Checkpoint 4.2
> > Since global imposition of a Latin-only font could break text in
> > other scripts, perhaps this should be finessed to say that it
> > should be possible for the user to specify different user
> > preferred fonts by script group (much like eg. the common browsers
> >  allow you to set default fonts for Unicode ranges).
>
>I'm not sure that we need it as an additional UAAG requirement;
>this seems primarily to be an internationalization requirement.
>Rather than add this as a requirement, I suggest we make your
>point in the Techniques document.

To some extent, this is already covered in "1. For text that cannot be
rendered properly using the user's preferred font family, the user agent
may substitute an alternative font family."
The 'may' should probably be changed to a 'should'.

However, it is also not clear at what level the combination of
different glyphs for different scripts is going to be done
in the long run. It could be that browsers have to deal with
this for ages. It could also be that font composition/fallback
becomes part of font formats, font configuration tools, and the
OS in general. In that case, the user would only specify
"MyHelvetica" or some such in the browser, and would have
configured "MyHelvetica" with a font configuration tool
provided by the OS.
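
For illustration only, the substitution rule quoted above could be
sketched as follows in Python. Everything here is hypothetical: the
coverage table (font family name mapped to the code points it can
render), the family name "SomeCyrillic", and the function names are
assumptions, not any browser's or operating system's actual interface.
The sketch splits text into runs, using the preferred family where it
has coverage and the first fallback family that does otherwise.

```python
def pick_font(ch, preferred, fallbacks, coverage):
    """Return the first family covering `ch`, or None if none does."""
    for family in [preferred, *fallbacks]:
        if ord(ch) in coverage.get(family, ()):
            return family
    return None


def split_runs(text, preferred, fallbacks, coverage):
    """Split `text` into (family, substring) runs using font fallback."""
    runs = []
    for ch in text:
        family = pick_font(ch, preferred, fallbacks, coverage)
        if runs and runs[-1][0] == family:
            # Same family as the previous character: extend the run.
            runs[-1] = (family, runs[-1][1] + ch)
        else:
            runs.append((family, ch))
    return runs


if __name__ == "__main__":
    # Hypothetical coverage: code point ranges each family can render.
    coverage = {
        "MyHelvetica": range(0x20, 0x250),
        "SomeCyrillic": range(0x400, 0x500),
    }
    print(split_runs("ab\u0434", "MyHelvetica", ["SomeCyrillic"], coverage))
```

A run whose family is None is text that no configured font can render,
which is the case checkpoint 2.10 is concerned with. Whether this logic
lives in the browser or in an OS-level composite font is exactly the
open question.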


Regards,    Martin.
Received on Monday, 30 September 2002 04:55:18 UTC