Re: Does the html attribute "lang" implies the use of a specific script in the font?

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Tue, 7 Nov 2017 13:05:50 +0900
To: Philippe Cochy <acquadoria@gmail.com>, www-international@w3.org
Message-ID: <40740cf7-a879-e925-0e5a-9ab8f8601b1f@it.aoyama.ac.jp>

Hello Philippe,

It seems that your message somehow got caught in the moderator queue.

On 2017/11/06 16:24, Philippe Cochy wrote:
> Hello.
> Does the HTML attribute "lang" imply the Unicode script used in the font?

> i.e.
> Does lang="ja" => script="hani" and lang="JAN"?

No. Japanese can be written in Latin script, for example, even if this 
is not done very often.
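
As an illustration of the point above: a language tag may state a script explicitly through an optional four-letter script subtag (for example "ja-Latn"), and a bare "ja" states none. The following is a minimal Python sketch under simplifying assumptions; the function name script_subtag is hypothetical, and the parsing ignores grandfathered tags and does not validate the tag as a whole.

```python
def script_subtag(lang_tag):
    """Return the explicit script subtag of a language tag, or None.

    Simplified sketch: looks for the first four-letter alphabetic subtag
    after the primary language subtag, and stops at a singleton such as
    "x" or "u", which starts a private-use or extension sequence.
    """
    subtags = lang_tag.split("-")
    for sub in subtags[1:]:
        if len(sub) == 1:
            break  # singleton: everything after it is extension/private use
        if len(sub) == 4 and sub.isascii() and sub.isalpha():
            return sub.title()  # conventional casing, e.g. "Latn"
    return None


for tag in ("ja", "ja-Latn", "fr-FR", "zh-hant-TW"):
    print(tag, "->", script_subtag(tag))
```

A result of None only means that the tag does not state a script; it does not mean that a particular script can be assumed.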

> Does lang="fr" =>
> script="latn" and lang="FRA"?
> The (X)HTML attribute "lang" assigns a language to an element. There is
> no attribute "script" in (X)HTML.

That's on purpose. There is no need for such an attribute, because for 
each character, it's clear which script it belongs to.

Something like
     <span script='Latn'>これは日本語です。</span>
would be contradictory, and something like
     <span script='Latn'>Kore ha Nihongo desu.</span>
would be redundant.
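
The two spans above can be checked mechanically, since the script follows from the characters themselves. What follows is only a rough Python sketch: the standard unicodedata module does not expose the Unicode Script property, so the helper guess_script (a hypothetical name) approximates it from character names using a deliberately tiny prefix table. It will misclassify some characters and treats everything it does not recognize as Common (Zyyy).

```python
import unicodedata

# Assumption: a very small, illustrative table mapping character-name
# prefixes to ISO 15924 script codes. A real implementation would use
# the Unicode Script property data instead of character names.
_NAME_PREFIX_TO_SCRIPT = {
    "LATIN ": "Latn",
    "HIRAGANA ": "Hira",
    "KATAKANA ": "Kana",
    "CJK UNIFIED IDEOGRAPH": "Hani",
}


def guess_script(ch):
    """Approximate the script of one character from its Unicode name."""
    name = unicodedata.name(ch, "")
    for prefix, script in _NAME_PREFIX_TO_SCRIPT.items():
        if name.startswith(prefix):
            return script
    return "Zyyy"  # treated as Common: punctuation, spaces, unknowns


def scripts_in(text):
    """Return the set of scripts found in text, ignoring Common."""
    return {guess_script(ch) for ch in text} - {"Zyyy"}


print(scripts_in("これは日本語です。"))
print(scripts_in("Kore ha Nihongo desu."))
```

Under this approximation the first sentence contains only Hira and Hani characters, so labelling it as Latn would contradict the text, while the second contains only Latn characters, so the label would add nothing.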

> In Unicode, languages are subsets of the scripts.

No. The relation between languages and scripts is more complex.

> Some glyphs do not
> belong to a script (i.e. they fall under the default script).
> In a font, the script is the root of the use of languages in the tables. A
> language can be assigned to any script, including, obviously, the
> default script. So, for example, the French language can be assigned to
> the Latin script but also to the default or hiragana script.
> My English is very weak. Please consult (and participate in)
> https://bugs.chromium.org/p/chromium/issues/detail?id=3D779374
> to clarify this question.

The correct link is

Regards,    Martin.

> Regards,
> Philippe
Received on Tuesday, 7 November 2017 04:06:25 UTC