Re: [HDP] Other comments from RI from Henri Sivonen on 2007-09-10 (public-html@w3.org from September 2007)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Mon, 10 Sep 2007 14:07:32 +0300
To: Richard Ishida <ishida@w3.org>
Cc: "'public-html'" <public-html@w3.org>
Message-Id: <E8C97C79-EC0F-4888-B24F-71F530E026F0@iki.fi>
On Aug 23, 2007, at 19:01, Richard Ishida wrote:

>> 11. Support World Languages
>
> I agree with much of this, but the intent of "Features to represent  
> a single web page in multiple languages are out of scope." is not  
> clear.  Much of the world is multilingual and we should definitely  
> allow for multiple languages and scripts to appear on the same  
> page.  This is one of the key rationales for using Unicode.  If  
> what is meant is that features to allow users to switch between  
> alternate translated versions of content stored within the same page,

My understanding (based on being around when the principle was  
drafted) was that the intent was to specifically exclude facilities  
for stuffing alternative translations for each string in a Web app UI  
in one HTML file and leaving the choice of translation to the client  
side.

> I also have a concern with "Italics is useful because it applies to  
> many bicameral scripts". Firstly, it applies not only to bicameral  
> scripts.

That it also applies to something else is rather beside the point.  
The point is the vast majority of Web content is in the Latin script  
(and Latin plus Cyrillic is an even larger block), so it would be  
silly to deny optimizations for the Latin script on the grounds of  
inapplicability to some other scripts. If those optimizations also  
benefit some non-bicameral scripts, that's great.

> Secondly, 'italics' is one kind of *presentational* device that, if  
> the default styles are not appropriate, should be applied using CSS.

That doesn't match the conceptual state of the art in the UIs of very  
popular content creation tools for bicameral scripts.

> The <i> tag does not equate to ruby markup at all -

It doesn't need to equate to it. Both are sound examples of features  
that have a good reason to exist even though they aren't applicable  
to all scripts.

> ruby markup is semantic in nature (and needs CSS for all but the  
> most basic fallback default form of presentation).

I'm not familiar with the actual usage, but the Ruby specification  
itself says that the markup does not sufficiently encode semantics for  
aural presentation ( http://www.w3.org/TR/ruby/#non-visual ). To me, this  
suggests that the markup feature is primarily designed as a  
presentational feature for the visual media.

This is not a statement against Ruby. I'm just pointing out that Ruby  
isn't theoretically pure, either.

> A better sentence may be "Directional markup is useful because it  
> applies to many right to left scripts, even though content in some  
> scripts has no need of it."

The <i> example was inserted deliberately to avoid the use of the  
principle against <i>, which would be an easily foreseeable  
misapplication of the principle if the example were left out. (I  
suggested the example, so I know what it was meant for. :-) The  
directionality example is reasonable, but it fails to pre-empt  
misapplication against <i>.

> Certainly we should continue to support <b> and <i> tags, but we  
> should encourage people to use <em> and <strong> instead.

Merely renaming things and continuing to use them as before does not  
really solve anything technical.

> I constantly see people misusing these tags in ways that make  
> localization difficult.
>
> Just because three presentational conventions (such as highlighting  
> emphasis, document titles, and foreign words) in an English  
> document may all use, say, italics, it doesn't hold that a Japanese  
> document will use a single alternative presentation for the three.   
> They may use wakiten for emphasis, but corner brackets for document  
> names, and guillemets for foreign words.  If the English author has  
> used <i> tags everywhere (thinking about the presentational  
> rendering he/she wants in English) rather than <em> and <span  
> class="doctitle"> or <span class="foreignword">, the Japanese  
> localizer will be unable to work easily with this document.

If the localizer is empowered to render the thoughts of the original  
author expressed in words into another language, why wouldn't/shouldn't  
(s)he be empowered to change markup as well? And even if in  
some situations it makes sense for an author to be bound by rules  
that facilitate translation, why should those rules be baked into  
HTML to inconvenience people who aren't writing to be translated?

> Think of it the other way around: Japanese authors may avoid both  
> italicization and bolding, since their characters are too  
> complicated to look good in small sizes with these effects - a  
> Japanese author may favour  underline for a wide variety of uses  
> and font tags for others (changes in text size and family can be  
> used to distinguish text in Japanese).  If the Japanese author uses  
> <u> tags for many different effects, it will become a problem to  
> localize into English, where judicious application of italicisation  
> here and bolding there (but no underlining) would look better.  All  
> this could be avoided if semantic markup was encouraged, allowing  
> the localizer into English to easily change the CSS and achieve  
> whatever effect they wanted.

Of course, that could, in turn, be avoided by allowing the localizer  
to easily change inline-level markup.

> Allowing authors to use <b> and <i> tags is also problematic in  
> that it blurs the idea of semantic markup in their mind with what  
> really are presentational devices associated with Western scripts.

If authors by and large don't see cost-outweighing benefits in  
unblurring their minds on this point, who are we to insist on  
unblurring their minds? And using devices associated with Western  
scripts is perfectly OK when writing in Western scripts.

Let's enable people to write in different languages using devices  
customary for those languages instead of depriving people of such  
devices because they aren't universally applicable.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Monday, 10 September 2007 11:07:54 UTC