Re: [css3-fonts] Orthogonality relationships in generic font families from Nicholas Shanks on 2008-08-27 (www-style@w3.org from August 2008)

From: Nicholas Shanks <contact@nickshanks.com>
Date: Wed, 27 Aug 2008 23:55:42 +0200
To: www-style@w3.org
Message-Id: <97834A8C-BDE0-4B28-9E35-0AC820DD430E@nickshanks.com>
On 27 Aug 2008, at 10:52 pm, L. David Baron wrote:

> You'd need to be a little careful with conformance requirements
> here, since for some languages (e.g., Chinese, Japanese, Korean)
> basically everything is monospace.  (Or close to it... I'm not
> really an expert on that.)

CJK fonts generally include proportional glyphs outside of the CJK  
ranges (usually Latin ones), even while the Han glyphs are all the  
same width.
Fonts such as Osaka Mono have monospace Latin glyphs (the same width  
as the CJK ones).
Specifying a monospace font for mixed-script Japanese/European text
could allow the UA to choose Osaka Mono for all runs (vs mixing Kaku
Gothic W3 and Courier, say).

>> Similarly, the introduction of the ‘proportional’ keyword is to  
>> override
>> the inherited monospace state (in my example all elements in the
>> headerdoc class would also be in the comment class). It would not  
>> alter
>> the serif/sans‐serif axis state, so the sans‐serif state of class
>> .comment would be inherited when not explicitly stated.
>
> Having half the state be inherited even when the property is
> specified doesn't really fit with the CSS processing model.  Is this
> really essential to meet your requirements?

Perhaps not. Do you have different ideas?
It could probably be omitted, as it was just a shortcut to stop the
monospace trait being inherited when not desired; always specifying
both axes would make it unnecessary.

> This proposal poses some interesting backwards compatibility issues.
> However, they're not so horrible if:
>
> * we relax (how much?) the restriction that only one generic
>   family is allowed in a 'font-family' list (at the end)

That was kinda implicit in the proposal. You would need one generic  
family per orthogonal axis at the end of font lists, but I am only  
suggesting two axes here, so no more than 2.
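
As a rough illustration of that rule, here is a minimal Python sketch
that splits a font list into named families and trailing generic
keywords, and rejects a list carrying more than one generic per axis.
The axis table and the function name split_generics are assumptions
made for this example only, not proposed spec text; 'proportional' is
the keyword discussed earlier in this thread.

```python
# Hypothetical axis table: this grouping is an assumption for illustration.
AXES = {
    "serif": "serifs",
    "sans-serif": "serifs",
    "monospace": "spacing",
    "proportional": "spacing",
}


def split_generics(font_list):
    """Split a comma-separated font list into (names, {axis: generic}).

    Generic keywords are only recognised as unquoted trailing entries,
    with at most one keyword per axis.
    """
    entries = [e.strip() for e in font_list.split(",") if e.strip()]
    generics = {}
    # Walk backwards from the end, collecting trailing generic keywords.
    while entries and entries[-1].lower() in AXES:
        keyword = entries.pop().lower()
        axis = AXES[keyword]
        if axis in generics:
            raise ValueError(f"more than one generic family on the {axis} axis")
        generics[axis] = keyword
    # Whatever remains is treated as a real family name; drop any quotes.
    names = [e.strip("\"'") for e in entries]
    return names, generics
```

Calling split_generics('"Osaka Mono", Courier, sans-serif, monospace')
yields the two named families plus one generic on each axis, while a
list ending in 'serif, sans-serif' is rejected. A quoted "monospace"
is kept as a real family name, which is one possible way to handle the
name clash mentioned below.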

> * we assume that there aren't any fonts called "monospace serif",
>   "sans-serif monospace", etc.

We are already assuming there aren't any fonts called "Fantasy",  
"Cursive" etc.
I would suppose that the longer the generic family names are, the
lower the probability of a clash with a real font. (And again, a font
called "sans-serif monospace" is probably suitable if you wanted a
sans-serif, monospaced font.)

> Do the metadata in fonts and the platform APIs used to access those
> metadata typically allow access to the information needed to
> implement this?

Font metadata does, and the monospace/proportional information is  
available via an API call on the Mac OS, gathered from the ‘OS/2’  
table in TT/OT formats, combined with system override lists for known  
malformed fonts. For serif/sans‐serif, I believe browsers have to  
either maintain their own lists or query the font PANOSE numbers  
directly at present, but that is necessary for the existing CSS2.1  
implementation anyway. I do not know how serif/sans‐serif is
currently determined for CJK fonts (I would guess that either it
isn't, or it is done via lists of known fonts).

> It's also not clear what these would do for some other languages
> that don't necessarily make these distinctions; serif vs. sans-serif
> there is already a bit odd.

Essentially all Arabic, Syriac or Thaana fonts would be treated as if  
they were in both categories, for example, so it wouldn't matter which  
is specified. Technically, a UA could just ignore the distinction if  
the characters it is processing belong to a non‐Latin range.
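
A minimal Python sketch of that shortcut, assuming the UA already
knows whether a candidate font is serif or sans-serif for Latin text.
Only the three scripts named above are listed; the function names and
the boolean font_is_serif input are hypothetical.

```python
# Unicode blocks for the scripts named in the text (illustrative, not exhaustive).
NO_SERIF_DISTINCTION = (
    (0x0600, 0x06FF),  # Arabic
    (0x0700, 0x074F),  # Syriac
    (0x0780, 0x07BF),  # Thaana
)


def serif_axis_applies(ch):
    """Return False for characters in scripts where serif/sans-serif is ignored."""
    cp = ord(ch)
    return not any(lo <= cp <= hi for lo, hi in NO_SERIF_DISTINCTION)


def font_matches_serif_axis(ch, requested, font_is_serif):
    """Check a font against the requested generic ('serif' or 'sans-serif').

    For characters where the axis does not apply, every font is treated
    as belonging to both categories, so the request always matches.
    """
    if not serif_axis_applies(ch):
        return True
    return (requested == "serif") == font_is_serif
```

With this, an Arabic letter matches whichever generic is requested,
while a Latin letter only matches a font on the requested side of the
axis.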

> If there's information in font metadata
> that gives this, then the font designers for those languages
> probably have a convention for mapping serif and sans-serif to some
> other characteristic.  But if there's not, it's probably quite hard
> for this to be implemented just for Latin script, never mind for
> lots of other scripts.


The information is in what are called PANOSE numbers, a sequence of  
ten integers where the value of the first affects the meaning of the  
subsequent ones.
The first digit switches between { Any = 0, Does Not Fit = 1, Latin  
Text = 2, Latin Handwritten = 3, Latin Decorative = 4, Latin Symbol =  
5 }
‘Any’ seems to correspond to a meaning of ‘Author couldn't be  
bothered to set this value’ in practice.
Arabic fonts should select Does Not Fit. The meanings of the other 9
digits are then undefined.
Latin book fonts would choose Latin Text (value of 2).
This choice then makes the meaning of the second digit be the style of  
the serifs.
Again, this is an array of values such as Square (slab) and Flared,
with dedicated Sans values for sans-serif faces and Does Not Fit
available as a fallback.
For Handwritten fonts (=3), the second digit describes the type of pen  
(ball‐point, felt‐tip, etc), whilst the seventh digit gives choices  
between Roman (e.g. Segoe Print), Cursive (e.g. Segoe Script) and  
Blackletter. I think this means any attempt to derive correspondences  
between the CSS serifs trait and the cursive family may require  
additional data from the UA or OS vendor.
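
To make the digit dependencies concrete, here is a minimal Python
sketch of how a UA might derive the serif axis from the ten PANOSE
values. It uses the serif-style numbering from the PANOSE Latin Text
classification as published in the OpenType 'OS/2' documentation,
where 11 to 13 are the Sans styles. Treating Rounded (15) as
sans-serif and Flared (14) as serif is this sketch's own assumption,
and the function name is hypothetical.

```python
# First PANOSE value: family kind, as listed in the text above.
FAMILY_KINDS = {
    0: "Any",
    1: "Does Not Fit",
    2: "Latin Text",
    3: "Latin Handwritten",
    4: "Latin Decorative",
    5: "Latin Symbol",
}

# Second value under Latin Text: 11 Normal Sans, 12 Obtuse Sans,
# 13 Perpendicular Sans, 15 Rounded (assumed sans-serif here).
SANS_STYLES = {11, 12, 13, 15}


def serif_axis_from_panose(panose):
    """Return 'serif', 'sans-serif', or None when PANOSE cannot tell us."""
    if len(panose) != 10:
        raise ValueError("PANOSE must contain exactly ten values")
    kind, serif_style = panose[0], panose[1]
    # The second value only describes serifs when the family kind is Latin Text.
    if kind != 2:
        return None
    # 0 (Any) and 1 (Does Not Fit) carry no usable serif information.
    if serif_style in (0, 1):
        return None
    return "sans-serif" if serif_style in SANS_STYLES else "serif"
```

A result of None is the case where the UA would have to fall back on
its own font lists, as it would for Handwritten faces, where the
second value describes the pen rather than the serifs.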

The PANOSE specification is built into TrueType and OpenType, but
basically only provides metadata for Latin text. (The weight, slant
angle, monospace flag and such are elsewhere and apply equally to
non‐Latin fonts. PANOSE contains a monospace number but OSes
ignore that in favour of the other flag.)

— Nicholas Shanks.
Received on Wednesday, 27 August 2008 21:56:24 UTC