Em box and font data

Based on the discussion at the last CSS WG telecon (28 July), there seemed to be a desire to add half of the CSS leading, (line-height - font-size)/2, above and below the vertical extents of the Em Box.

It was noted that there is no definition of the Em Box and there was a concern expressed that fonts do not specify the position of the Em Box.

I have looked carefully at OpenType (and TrueType) fonts and have come to the following conclusions.


1.       All TrueType/OpenType fonts specify a value for UnitsPerEm. This is the number of Design Space coordinate units per Em, and all coordinates in the font, from the positions of baselines to the control points on the outlines of glyphs, are expressed in Design Space coordinates. The Latin baseline is often positioned at the origin (0 point) of the vertical axis of the Design Space, but that need not be true. When it is, descenders (the parts of glyphs that are below the baseline) have negative coordinate values in the Design Space.

2.       All OpenType fonts and many TrueType fonts have values for sTypoAscender and sTypoDescender in the OS/2 table. These are font (not glyph) metrics that specify the intended typographic upper extent and lower extent of the collection of glyphs in the font. It is recommended (but not required) that the distance between the two values in Design Space be 1 Em (i.e., that the distance between the two values equal the UnitsPerEm value defined in item 1 above).

3.       Since the sTypo... values were added after the initial definition of TrueType fonts, there are fonts that lack these values. These fonts, however, should have values for Ascent and Descent in the HHEA table. These values were defined in the initial TrueType table structure. For Latin fonts, the Ascent value was defined as the topmost point (in Design Space Coordinates) on a lowercase "d" and the Descent was defined as the lowest point on the lowercase "p". There was no requirement or recommendation that the distance between these points equal UnitsPerEm.

4.       There is a third set of values, usWinAscent and usWinDescent, which are Microsoft Windows-specific values that specify (essentially) the top and bottom of the bounding box for the area marked by the glyphs in the font. These values are used for clipping the line to ensure that all possible characters on the line will be completely visible. It is the existence of these metrics that has led to the text in 10.6.1 which says, "A UA may, e.g., use the em-box or the maximum ascender and descender of the font." These values are not appropriate for this discussion.

5.       In OpenType fonts, there is also a BASE table (and in Apple's TrueType fonts a BSLN table) in which the coordinates of the various baselines (e.g. latin, hanging, ideographic, math, ...) are specified in Design Space Coordinates. When either of these tables exists, these coordinate values can be used to compute the offsets of the Ascent/sTypoAscender and Descent/sTypoDescender from any particular baseline. If the tables do not exist, then it is likely that the origin in the direction perpendicular to the direction of text flow in a line (the y direction for horizontal text flow) is where the default (usually Latin) baseline is positioned.
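
As a rough illustration of items 2 and 3, the fallback from the OS/2 values to the HHEA values might be sketched as follows. The FontMetrics record and the pick_ascent_descent function are hypothetical names made up for this sketch; they are not part of any font library. The sketch assumes the table values have already been read from the font and that descent values are signed Design Space coordinates (negative when below the baseline).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FontMetrics:
    """Hypothetical record of already-parsed font metrics (Design Space units)."""
    units_per_em: int
    typo_ascender: Optional[int] = None    # OS/2 sTypoAscender, if present
    typo_descender: Optional[int] = None   # OS/2 sTypoDescender, if present
    hhea_ascent: Optional[int] = None      # HHEA Ascent, if present
    hhea_descent: Optional[int] = None     # HHEA Descent, if present


def pick_ascent_descent(m: FontMetrics) -> Tuple[int, int]:
    """Prefer the sTypo... values; fall back to the HHEA values.

    usWinAscent/usWinDescent are deliberately ignored, per item 4.
    """
    if m.typo_ascender is not None and m.typo_descender is not None:
        return m.typo_ascender, m.typo_descender
    if m.hhea_ascent is not None and m.hhea_descent is not None:
        return m.hhea_ascent, m.hhea_descent
    raise ValueError("font has neither sTypo... nor HHEA ascent/descent values")
```

The numbers in any call to this function would be invented for illustration; real fonts need to be inspected individually.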

The CSS 2.1 spec, section 10.8.1 Leading and half-leading, specifies that:
The height and depth of the font above and below the baseline are assumed to be metrics that are contained in the font. (For more details, see CSS level 3.) [That is, there are values such as sTypoAscender/Ascent (read height above baseline) and sTypoDescender/Descent (read depth below baseline) in the font metrics.]

On an inline-level element, 'line-height' specifies the height that is used in the calculation of the line box height (except for inline replaced elements, where the height of the box is given by the 'height' property).

These "facts" suggest the following algorithm for computing the "effective" height of a line.

a)      Consider all the glyphs (and replaced content) that appear on the line. Align each such object (glyph, replaced content, or inline element) to the relevant baseline, as indicated by a vertical-align value or by the default value of vertical-align for the type of object.

b)      For each object, use line-height to compute, relative to the font-size for that object, the leading to be applied to the object (half above and half below for glyphs, and none for the other objects, as their extents already incorporate "leading").

c)       If the distance AD from the sTypoAscender (or, lacking that, the Ascent) value to the sTypoDescender (or, lacking that, the Descent) value is equal to UnitsPerEm, then position the leading above the ascent value and below the descent value.

d)      Else, if the distance AD is not equal to 1 Em, add one half of the difference between 1 Em and the distance AD (in Design Space Coordinates; it may be negative) to the ascent and descent values, and then position the leading at these new ascent' and descent' values as in step c) above.
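
Steps b) through d) for a single glyph run could be sketched roughly as below. This is only an illustration under stated assumptions: the function and class names are hypothetical, ascent and descent are signed Design Space coordinates relative to the baseline (descent negative), "adding to the descent" is read as moving it further from the baseline, and font_size and line_height are in the same output unit (e.g. px).

```python
from dataclasses import dataclass


@dataclass
class LineExtents:
    top: float     # upper edge relative to the baseline (positive is up)
    bottom: float  # lower edge relative to the baseline (negative is down)


def leaded_extents(ascent: float, descent: float, units_per_em: float,
                   font_size: float, line_height: float) -> LineExtents:
    # Distance AD between the ascent and descent values (Design Space units).
    ad = ascent - descent
    # Step d): split the (possibly negative) difference from 1 Em evenly.
    # When ad == units_per_em this is zero, which is step c).
    diff = units_per_em - ad
    ascent_adj = ascent + diff / 2
    descent_adj = descent - diff / 2
    # Scale from Design Space units to the output unit.
    scale = font_size / units_per_em
    # Step b): half of the leading goes above and half below.
    half_leading = (line_height - font_size) / 2
    return LineExtents(top=ascent_adj * scale + half_leading,
                       bottom=descent_adj * scale - half_leading)


# Invented example values: AD = 2400 in a 2048-unit Em, 16px text, 20px lines.
e = leaded_extents(ascent=1900, descent=-500, units_per_em=2048,
                   font_size=16, line_height=20)
print(e.top - e.bottom)  # total extent of the leaded box
```

With these inputs the adjusted ascent' and descent' span exactly one Em, so the total extent works out to the line-height.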

Positioning the leading at the top and bottom in this way will ensure that a paragraph with text from a single font at a single size will have a baseline-to-baseline distance (from line to line) that is exactly line-height. (That is, the half-leading at the bottom of the line above, plus the half-leading at the top of the line below, plus the Em distance, will equal the line-height.)
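
The arithmetic behind that claim can be written out as a short worked equation. The symbols are assumptions introduced here, not notation from the CSS spec: S is the font-size, H is the line-height, and a' and d' are the adjusted ascent' and descent' scaled to the font-size, both taken as positive distances from the baseline.

```latex
% Assumed symbols: S = font-size, H = line-height,
% a', d' = adjusted ascent and descent as positive distances from the baseline.
\begin{align*}
a' + d' &= S && \text{(the adjusted extents span 1 Em)} \\
\ell &= \tfrac{1}{2}(H - S) && \text{(half-leading)} \\
(d' + \ell) + (\ell + a') &= S + (H - S) = H && \text{(baseline to baseline)}
\end{align*}
```

The first term in the last line is the depth below the baseline of the upper line, and the second is the height above the baseline of the lower line.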
Steve Zilles

Received on Wednesday, 4 August 2010 15:23:59 UTC