Re: [XHTML2] Specifying alternative resources/content from Risto Kankkunen on 2004-10-19 (www-html@w3.org from October 2004)

From: Risto Kankkunen <risto.kankkunen@iki.fi>
Date: Tue, 19 Oct 2004 19:58:20 +0300
To: Edward Lass <elass@goer.state.ny.us>
CC: www-html@w3.org
Message-ID: <417547AC.8020302@iki.fi>

Edward Lass <elass@goer.state.ny.us> wrote:

>>> 1. It provides alternative representations only for remote 
>>> resources
> 
> In theory, HTTP Content Negotiation and XML encoding information 
> should already make sure that the user agent can handle everything 
> that an XHTML page throws at it.

First of all, HTTP Content Negotiation only works, of course, when HTTP
is used, and XHTML is not meant to be used only with HTTP.

Secondly, HTTP Content Negotiation, as well as the draft's mechanism,
only works for external documents. Surely you don't mean that the author
should prepare a number of different versions of a large document, just
because it happens to contain a couple of phrases in a couple of
different languages? This is exactly why we need a mechanism like
"<alt>" that isn't tied to external objects.

> In practice, when I look at the W3C HTML Home Page in Firefox for 
> Windows, Masayasu Ishikawa's name in Japanese appears as a series of
> question marks.

It's great that you pointed out this real-life problem you are having,
because it's just one of those things that "<alt>" could fix.

When I go to that page, I get a dialog saying "To display language 
characters correctly, you need to install the following components: 
Japanese Text Display" and see the name as question marks. So the 
browser knows very well it cannot render the name properly. But since it 
doesn't have any other info to go on, it just puts up a number of 
question marks. If it were possible for the author to provide an inline 
alternative to the Japanese name via an "<alt>" element, the browser 
could use that information to do a better job. Even this simple addition 
would be better:

   <li>
       <a href="../People/mimasa/" xml:lang="ja" lang="ja">
           <alt>
               <span>石川 雅康 (ISHIKAWA Masayasu)</span>
               <span>ISHIKAWA Masayasu</span>
           </alt>
       </a> is the HTML Activity Lead and the Team Contact
       for the HTML Working Group
   </li>


> However, from the example:
> 
> <!-- in Cyrillic letters --> <p xml:lang="ru"> С днем рождения! </p>
> 
> <!-- Latin transliteration --> <p xml:lang="ru"> S dnem rozhden'ja! 
> </p>
> 
> Obviously the xml:lang attribute doesn't include encoding or font 
> information, so a user agent couldn't be expected to reject the 
> Cyrillic letters...

I don't know if it was obvious, but the Cyrillic version uses UTF-8
encoding, the one XML uses by default. So every user agent should be
able to parse it, but without a suitable font it cannot necessarily
display it (I don't think there are many people who have a font
installed that contains all the Unicode characters). I don't know what 
you mean by saying "a user agent couldn't be expected to reject the 
Cyrillic letters". The user agent has no other choice if it doesn't 
support Cyrillic, and it is craving instructions from the document 
author about what to do.
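
As a rough sketch only, and assuming the same hypothetical "<alt>"
element as in the example above (it is not part of the draft) together
with a user agent that picks the first alternative it is able to
render, the author could give those instructions like this:

   <alt>
       <!-- in Cyrillic letters -->
       <p xml:lang="ru">С днем рождения!</p>
       <!-- Latin transliteration -->
       <p xml:lang="ru">S dnem rozhden'ja!</p>
   </alt>

A user agent without a Cyrillic font could then fall back to the
transliteration instead of showing question marks.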

> Is this problem common enough that a scripting solution is 
> inadequate? Probably not.

I'm interested to see how you would solve this with "a scripting
solution". It's also interesting to see that you don't think
multilingual documents are common...

I hope the consensus is closer to what section "1.1.1 Design Aims" of 
the draft describes:

   # Better internationalization: since it is a World Wide Web.

   # Less scripting: achieving functionality through scripting is
     difficult for the author and restricts the type of user agent
     you can use to view the document. We have tried to identify current
     typical usage, and include those usages in markup.

Regards,
Risto Kankkunen
Received on Tuesday, 19 October 2004 16:58:17 UTC