
Re: [whatwg] HTML: A DOM attribute that returns the language of a node

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Thu, 08 Aug 2013 17:29:31 +0300
Message-ID: <5203AB4B.6000405@cs.tut.fi>
To: whatwg <whatwg@lists.whatwg.org>
2013-08-08 2:57, Ryosuke Niwa wrote:

> On Aug 2, 2013, at 6:10 AM, Jukka K. Korpela <jkorpela@cs.tut.fi>
> wrote:
[...]
>> But regarding the effect of language markup on fonts, the effect is
>> limited to situations where the font is not specified in a style
>> sheet. This is a rather uncommon scenario these days; authors are
>> more than eager to set fonts.
>
> Do you have actual statistics to support this point?

No, it’s just an impression from looking at numerous pages and their 
coding as well as views presented in authors’ forums.

> As far as I
> checked, neither baidu.com nor yahoo.com.tw seems to explicitly
> specify a Chinese font.

They both have font-family settings, slightly different, but basically 
the most common (sorry, no statistic on this either) setup that uses 
Arial (possibly with Helvetica as second option, which does not change 
much). So, granted, they don’t specify a Chinese font in the sense of 
including any specific fonts containing CJK characters in the 
font-family list.

Baidu doesn’t set lang either, so they seem to be accepting, for any 
characters not covered by Arial, whatever happens to be in each 
browser’s list of fallback fonts, when no information about content 
language is available. Yahoo.com.tw sets lang="zh-tw", so they do care, 
but only to the extent that the fallback font should be one intended for 
Traditional Chinese.

So the lang markup may affect fonts, but only under some conditions. And 
if you care about fonts, as an author, then an explicit list of font 
alternatives has better chances of creating the desired result.

>> It is true that they might specify a font list where none of the
>> fonts supports some characters that might be entered, and then a
>> fallback font would be used. However, using “annotations”
>> (presumably, lang attributes, along with extra <span> elements when
>> needed) does not sound like a feasible approach to this.
>
> Whether it’s feasible or not, that’s what we have been doing due to
> the Han unification.  If we could, we’ll undo the Han unification and
> use different glyphs for each character but we can’t do that at this
> point in time.

If a page contains texts to be rendered using different forms 
(Traditional Chinese, Simplified Chinese, Japanese, Korean) for Han 
characters, you will need to control the rendering somehow. Using lang 
markup might be theoretically most adequate, but it’s indirect and less 
effective than just setting different fonts (via font-family lists that 
contain reasonably many alternatives).

But even if lang attributes are used, I don’t think the issue has much 
relevance to the original question here. A DOM attribute that returns 
the language of a node would be useful for the purpose only if you 
intend to affect rendering via JavaScript.
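
To make that scenario concrete, here is a rough TypeScript sketch of what a
script would have to do today without such an attribute: walk up the ancestors
to find the nearest lang value, then pick a font list from it. Everything named
here is an assumption made for illustration: the LangNode stand-in (used so the
code runs outside a browser), the function names, and the font lists, which are
common Windows fonts chosen as examples and not recommendations. The sketch
also ignores xml:lang and any document-level default language.

```typescript
// Minimal stand-in for a DOM element (hypothetical), so the sketch runs
// without a browser. In a page, the same walk would use
// element.getAttribute("lang") and element.parentElement.
interface LangNode {
  lang?: string;     // value of the lang attribute, if the attribute is present
  parent?: LangNode; // parent element, if any
}

// Return the nearest lang value found on the node or its ancestors.
// The empty string means "unknown": either lang="" was declared,
// or no ancestor carries the attribute at all.
function effectiveLanguage(node: LangNode | undefined): string {
  for (let n = node; n !== undefined; n = n.parent) {
    if (n.lang !== undefined) {
      return n.lang;
    }
  }
  return "";
}

// Illustrative font lists keyed by language tag prefix (assumed values).
const FONT_LISTS: [string, string][] = [
  ["zh-tw", '"PMingLiU", "Microsoft JhengHei", sans-serif'],
  ["zh-cn", '"SimSun", "Microsoft YaHei", sans-serif'],
  ["ja", '"Meiryo", "MS PGothic", sans-serif'],
  ["ko", '"Malgun Gothic", "Gulim", sans-serif'],
];

// Pick a font-family value for a language tag, or undefined when the
// table has no entry. Matching is case-insensitive and prefix-based,
// so "ja-JP" matches the "ja" entry.
function fontListFor(lang: string): string | undefined {
  const tag = lang.toLowerCase();
  for (const [prefix, fonts] of FONT_LISTS) {
    if (tag === prefix || tag.startsWith(prefix + "-")) {
      return fonts;
    }
  }
  return undefined;
}

// Usage: a Japanese span inside a paragraph inside a zh-TW document.
const root: LangNode = { lang: "zh-TW" };
const para: LangNode = { parent: root };
const span: LangNode = { lang: "ja", parent: para };

console.log(effectiveLanguage(para), "=>", fontListFor(effectiveLanguage(para)));
console.log(effectiveLanguage(span), "=>", fontListFor(effectiveLanguage(span)));
```

With a built-in attribute, the first function would reduce to a single property
read. The second step, mapping a language to fonts, would still be the author’s
job, and an explicit font-family list in a style sheet does that job directly.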

Yucca
Received on Thursday, 8 August 2013 14:29:56 UTC
