Re: [css3-fonts] humane 'unicode-range'

From: John Daggett <jdaggett@mozilla.com>
Date: Sun, 1 May 2011 18:32:45 -0700 (PDT)
To: Koji Ishii <kojiishi@gluesoft.co.jp>
Cc: CSS WWW Style <www-style@w3.org>, "CJK discussion (public-i18n-cjk@w3.org)" <public-i18n-cjk@w3.org>, Christoph Päper <christoph.paeper@crissov.de>, markdavis@google.com
Message-ID: <711732579.182805.1304299965302.JavaMail.root@cm-mail03.mozilla.org>
Koji Ishii wrote [related to the use of range definitions in the syntax of the unicode-range descriptor]:

> EAW=A contains mostly punctuation and symbols that were unified. I
> think there are cases where authors want:
> * CJK fonts for EAW=A|F|H|W
> * Latin fonts for EAW=N|Na
>
> One famous example for EAW=A is U+2026 HORIZONTAL ELLIPSIS; the ellipsis
> sits on the baseline in Latin fonts but at the vertical center
> in CJK fonts. If I were writing Japanese documents, I would expect it to be
> drawn at the vertical center.
> 
> It would be great if I could use a font like this:
> 
> @font-resource {
>   font-family: myfont;
>   src: local(CJK-font-name);
> }
> 
> @font-resource {
>   font-family: myfont;
>   src: local(Latin-font-name);
>   unicode-range: EAW=N|Na;
> }

In your example I think you mean @font-face, no?

As Jonathan Kew points out, what you're asking for is syntax that defines
ranges based on specific properties in the Unicode database [2].

For the use case of using one font for Latin, another for Japanese, this
doesn't really yield the ideal result.  There are lots of EAW=A characters
in the extended Latin ranges and EAW=N probably covers all sorts of ranges
that an author wasn't really considering (e.g. Syriac, Mandaic, Thai, Lao,
Tibetan).  Plus this syntax requires the author to understand the complexities
of the Unicode character database, which I don't think is a great idea.

The actual original discussion was about simple named ranges [1], for
example using block definitions in the Unicode database (Blocks.txt)
to define simple block name ==> range substitutions:

@font-face {
  font-family: simplefont;
  src: local(JapaneseFont);
  /* implicit definition of unicode range as u+0:10ffff */
}
 
@font-face {
  font-family: simplefont;
  src: local(LatinFont);
  unicode-range: "Basic Latin", "Latin-1 Supplement";  /* equivalent to u+0:ff */
}
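
For comparison, here is a sketch of what that second rule would presumably
expand to in the explicit range syntax the draft already defines.  The block
boundaries come from Blocks.txt (Basic Latin is U+0000-007F, Latin-1
Supplement is U+0080-00FF); the family and font names are the same
placeholders as above, and the expansion itself is an assumption about how
named ranges would behave, not something the draft specifies:

```css
/* Explicit-range equivalent of the named-block rule above (assumed expansion) */
@font-face {
  font-family: simplefont;
  src: local(LatinFont);
  /* "Basic Latin", "Latin-1 Supplement" written out as code point ranges */
  unicode-range: U+0000-007F, U+0080-00FF;
}
```

Since the two blocks are adjacent, the same coverage can also be written as
the single range U+0000-00FF.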
 
I think the key weakness in these schemes is that it's hard to find
the ideal set of named mappings.  Using Unicode blocks or script
definitions doesn't give you a simple "Arabic" or "Latin" mapping, and
there are common blocks to consider.  Nor does it break up the CJK
ideographs block in useful ways, which is a very important use case
for unicode-range.

There might be a simple way of defining named blocks that can be referenced
but I think more advanced ways of defining unicode ranges should be left
for a later version of the fonts spec in CSS.  Once unicode-range is actually
implemented and in use, I think the use cases for extensions will be much
clearer.

For CSS3 Fonts, we need to keep it simple and ship it! ;)

Regards,

John Daggett

[1] Original discussion of named ranges for unicode-range
http://lists.w3.org/Archives/Public/www-style/2009May/0212.html

[2] Unicode database
http://www.unicode.org/Public/UNIDATA/
http://www.unicode.org/reports/tr44/
Received on Monday, 2 May 2011 01:33:14 GMT
