Re: [whatwg] HTML: A DOM attribute that returns the language of a node

From: Ryosuke Niwa <rniwa@apple.com>
Date: Wed, 07 Aug 2013 16:57:35 -0700
Message-id: <CB1B8D78-F86D-4C64-B690-59476EE877A0@apple.com>
To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
Cc: whatwg@lists.whatwg.org

On Aug 2, 2013, at 6:10 AM, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:

> 2013-08-02 2:43, Ryosuke Niwa wrote:
> 
>>> Are you saying that for HTML contenteditable-based editors that want to
>>> support drag-and-drop editing, they need to be able to annotate the
>>> outgoing HTML fragment with the effective language so that when it's
>>> embedded somewhere, the right fonts get used?
>> 
>> Yes, but not just for drag and drop.
> 
> This would mean that the editor would have to guess the language from the text or ask the user to specify it. This is not as unrealistic as it may first seem. Microsoft Word does such things, sometimes getting things right, often messing things up. It typically detects change of language too late, and often infers language from keyboard settings, making it rather impossible to use a multilingual keyboard easily.
> 
> But regarding the effect of language markup on fonts, the effect is limited to situations where the font is not specified in a style sheet. This is a rather uncommon scenario these days; authors are more than eager to set fonts.

Do you have actual statistics to support this point?  As far as I checked, neither baidu.com nor yahoo.com.tw seems to explicitly specify a Chinese font.

Also, I have just recently experienced the font type change on Gmail when I was conversing with a native Chinese speaker.  Her mail client used Chinese fonts before Japanese fonts whereas mine had Japanese fonts before Chinese fonts.

> It is true that they might specify a font list where none of the fonts supports some characters that might be entered, and then a fallback font would be used. However, using "annotations" (presumably, lang attributes, along with extra <span> elements when needed) does not sound like a feasible approach to this.

Whether it's feasible or not, that's what we have been doing due to the Han unification.  If we could, we would undo the Han unification and use different glyphs for each character, but we can't do that at this point in time.
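
To make the annotation step concrete, here is a rough sketch in Python of what "annotate the outgoing fragment with the effective language" could look like.  It is only an illustration under stated assumptions: `Node`, `effective_lang`, and `annotate_fragment` are hypothetical names, not a DOM or browser API, and the lookup is simplified to the nearest lang attribute on the node or its ancestors (it ignores xml:lang and any document-level default language).

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element; not a real DOM API."""
    tag: str
    lang: Optional[str] = None       # value of the lang attribute, if present
    parent: Optional["Node"] = None


def effective_lang(node: Optional[Node]) -> str:
    """Return the nearest lang value on the node or its ancestors, or "" if none."""
    while node is not None:
        if node.lang is not None:
            return node.lang
        node = node.parent
    return ""


def annotate_fragment(text: str, source: Node) -> str:
    """Wrap outgoing text in a <span lang=...> carrying the source's language."""
    lang = effective_lang(source)
    if not lang:
        # Language unknown: emit the escaped text without an annotation.
        return escape(text)
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# A Japanese document containing a Traditional Chinese quotation.
body = Node("body", lang="ja")
para = Node("p", parent=body)
quote = Node("q", lang="zh-Hant", parent=para)

print(annotate_fragment("直", para))
print(annotate_fragment("直", quote))
```

The same code point is copied out with a different lang value depending on where it came from, which is the information the receiving page would need in order to pick a Japanese or a Chinese font for it.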

- R. Niwa
Received on Wednesday, 7 August 2013 23:57:59 UTC
