CSS and Type

This mail condenses an article I'm writing into the core issues I have
noticed with CSS and type. I'm posting it on the advice of Anton Prowse,
as we were talking about line-height issues in:
http://lists.w3.org/Archives/Public/www-style/2012Jan/0367.html

tl;dr:

1) Type measures should be related to the typeface in use (as opposed to
absolutely defined)
2) CSS's type layout is crippled because it does not know enough about the
typographic baseline.

The actual article:

We need a brief idea of how type works in order to discuss CSS’s
shortcomings, so let’s quickly and somewhat roughly deconstruct type:

_Anatomy of type_

Type at its core is about the presentation of letters. Letters which make
words, sentences, paragraphs, headings, etc. Letters are made of smaller
parts, and they are these:

* X-height - which rather conveniently and almost as though cunningly
planned is the height of a lowercase letter x.
* Ascenders and descenders - those are the parts of the letter that go
above or below the core of the letter. An ascender would be the stem on a d
and a descender would be the stem on a q.
* Cap-height - which I am sure you’ll be unsurprised to learn is the height
of a capital letter. Interestingly, the cap-height is often not the same as
the ascender height.

In order to get letters to line up horizontally when put next to each other
they have a “baseline”, which all letters within a given family share (much
like they share the same x-height etc.). The baseline is the invisible line
along which all letters are aligned and from which our descenders are
descending. It’s the bottom of an x.

Size in traditional type has its own units. One is the “point”, which is a
physical measure (1/72 of an inch) and is pretty much irrelevant for web
designers because we don’t work with physical media. So feel free to
ignore “points”. The other unit is the “em”, which traditionally was the
height of the metal slug onto which type was cast. We still care about ems
despite not having metal slugs with embossed letters on them. For modern
purposes, an em is the nominal size of the font: a box that usually holds
everything from the top-most ascender to the bottom-most descender, often
with a little room to spare. The space between baselines (called leading)
is set separately, on top of the em.

_How type is set in print_

In many cases type is the root of the entire page layout and becomes the
unit against which all else is measured and from which all else is derived.
And at the root of type layout, more often than not, is the typographic
baseline. The page is divided up into regular lines onto which all type
must sit and into which all images must fit. By doing this, the page
naturally feels more clearly arranged, cohesive, and planned.

A couple of quirks:

Because type is measured by its em box, there isn’t any real consistency
between typefaces, even if the measured units are the same. For example,
the x-height may be twice as large in one typeface as it is in another
despite them having the same em height. That is why some typefaces look
smaller than others despite being set at the same size.

Another thing that varies from one font to the next is the position of the
baseline. It can be lower or higher in one typeface than another. For us
web designers, this is a big problem. But it’s not a big deal in print,
because in print you can guarantee that the typeface you want to use will
be the typeface that is actually used. That means you can make whatever
adjustments you like to the wayward font so that the two typefaces’
baselines match.

_Issues CSS has that print doesn't_

One is the availability of a desired font. We can’t ensure that a client’s
computer has the font we want to use, and so we have to provide a
font-stack with fall-backs that we deem acceptably similar. You might think
that having @font-face mitigates this problem, but you’d be wrong - you
still need to give fall-backs. You don’t know whether the browser supports
@font-face. You don’t know if the network the client is behind is filtering
out suspicious file formats (colleges often do this). If you’re using a
hosted font from a font-foundry you can’t be sure that the font-foundry’s
network is going to be available to deliver the font (they may have their
own network problems). Even if you’re self-hosting the font you don’t know
whether the font was corrupted during transfer (as can happen on mobile
networks especially). In short, you can never be sure that the browser got
the font file you want to use. So, define fall-backs.

Another issue is that, from a typographic standpoint, CSS is far more often
used to define rules that will apply to some unknown text than it is to
style known text. That is, the print designer is styling an exact heading
while we are styling a heading class with no idea if it will consist of two
words or twenty.

_The problems with CSS’s type functionality_

Ignoring some of the typographic subtleties which CSS doesn’t handle so
well there are two basic problems:

1) The way fonts are declared

The programmatic nature of CSS’s type implementation sits at odds with how
type works. In particular, font substitution is a major stumbling block. In
CSS we set a font-family as a list of possible fonts in order of declining
preference. Then, usually in separate rules, we set things like font-sizes,
weights, margins, and padding on HTML elements.

The first problem with this method is the font-family list itself; it’s a
pretty sad list because, almost by design, the default system fonts share
little in common, which makes “substitution of similar fonts” something of
an impossibility. But the larger issue is that when a fallback font is in
use it is highly likely that the apparent size of the font will be wrong,
the baselines will not match up, and the relationship between the typeface
and the measures we’ve defined will look wrong. That’s because all of those
measures were tuned to how our preferred typeface looks, and the font-stack
is unlikely to offer a typeface similar enough for them to carry over.
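
To make the apparent-size problem concrete, here is a minimal TypeScript
sketch of the sum a designer would do by hand: given the x-height of each
face as a fraction of its em, work out the size a fallback needs in order to
look as large as the preferred face. The FontMetrics type, the family names
and the ratios are all made up for illustration; real ratios would have to be
read out of the font files.

```typescript
// Hypothetical x-height ratios (x-height divided by em size). Real values
// must be read from the font files; these numbers are illustrative only.
interface FontMetrics {
  family: string;
  xHeightRatio: number;
}

// Font size the fallback needs so its x-height matches the preferred face.
function matchedFontSize(
  preferredSizePx: number,
  preferred: FontMetrics,
  fallback: FontMetrics
): number {
  const targetXHeightPx = preferredSizePx * preferred.xHeightRatio;
  return targetXHeightPx / fallback.xHeightRatio;
}

const preferredFace: FontMetrics = { family: "PreferredFace", xHeightRatio: 0.5 };
const fallbackFace: FontMetrics = { family: "FallbackFace", xHeightRatio: 0.4 };

// 16px of the preferred face has an 8px x-height; the fallback needs about
// 20px to reach the same x-height.
const adjustedSizePx = matchedFontSize(16, preferredFace, fallbackFace);
```

This is roughly the calculation that the font-size-adjust property is meant
to automate. Note that it only addresses apparent size: margins, line-heights
and baseline position would each need their own per-typeface correction.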

2) The lack of baseline control

Much like in old-school physical print, CSS uses the em box as a unit of
measure. When you set a size in CSS, it’s the size of the em box you’re
actually setting. So far so good. Unfortunately CSS has little concept of a
baseline, and that’s a problem - because although type is measured in ems,
it is not em boundaries to which type is aligned. It is alignment to the
document’s baseline which is important. People dealing with physical
media will offset the text to ensure that all text, no matter its size
or different measures, hits that baseline correctly. We can’t.

The extent of the control we have is using vertical-align, which defaults
to baseline. Unfortunately, vertical-align has no effect on any element
that’s set to display block. Which includes paragraphs, lists, headings -
all the stuff we care about. It’s only of any use with things set inline or
table-cell. Which doesn’t interest us much here.

Another part of CSS’s baseline problem is that as soon as we insert some
content whose height is not a multiple of the baseline, all text that flows
afterward is dislodged from the overall baseline. So, as soon as you
include an image, video, or external resource the chances are very good
that the rest of your page will not align correctly. This would not happen
in print.
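
The correction needed in that case is simple arithmetic, sketched below in
TypeScript. The function name rhythmMargin and the 26px baseline are
assumptions for illustration only, and the sketch ignores borders, padding
and margin collapsing.

```typescript
// Extra bottom margin (in px) that pushes the content following a block of
// the given height back onto the next line of the baseline grid.
function rhythmMargin(blockHeightPx: number, baselinePx: number): number {
  if (baselinePx <= 0) {
    throw new RangeError("baselinePx must be positive");
  }
  const overshoot = blockHeightPx % baselinePx;
  return overshoot === 0 ? 0 : baselinePx - overshoot;
}

// 100 = 3 * 26 + 22, so 4px of margin is needed to reach the next grid line.
const marginFor100pxImage = rhythmMargin(100, 26);
// 130 = 5 * 26, so the image already ends on the grid.
const marginFor130pxImage = rhythmMargin(130, 26);
```

The sum itself is trivial; the trouble is that the height of an image is
often not known until it has loaded, so a stylesheet cannot do it ahead of
time.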

_Other issues_

There are a number of other issues that we have with type that are less
fundamental but still important:

Orphan control - In professionally made print media you will never see a
single lonesome word dropped onto a new line on its own. CSS offers no
control over this, so we often see them, and they look as weird and wrong
on screen as they would in print. CSS has had “widows” and “orphans”
properties for years, but they only set the minimum number of lines left
either side of a page break in paged media; neither does anything about a
single stranded word.

Justified text - Any typographer will tell you to simply avoid using it on
the web. That’s because the implementation is so poor compared to what a
real design package would do. In CSS we get huge gaps between words that
produce “rivers” of space that are distracting and ugly, making the flow of
words stutter and become hard to read. Real publishing applications avoid
this rivering by applying automated adjustments to letter-spacing,
word-spacing, hyphenation and glyph scaling. CSS doesn’t do any of that,
making justified text on the web technically possible, but a crime of type
to actually use.

_How I’d like CSS type handling to change_

Fundamentally, the typeface should be the root of all other type-related
measures, not divorced from them. CSS should be able to set different
values for margins, line-heights, paddings etc. depending on the typeface
that’s in use. That’s exactly what would happen in print if the designer
were told to switch typeface - things would get re-flowed to suit the new
typeface rather than have it swapped straight in, especially if the new
typeface was a poor match for the preferred one.

I’d also like CSS to be more aware of the baseline. I want my text to align
to a given document or section baseline - so if I set the root element
baseline to 26px all type should be offset by whatever is required to make
the type’s baseline align with the root element’s baseline. CSS should also
be able to cope when an image is inserted that isn’t a multiple of the
baseline: text that wraps underneath should still align to the root
baseline.
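
As a rough sketch of the offset I mean, the TypeScript below works out where
a line’s baseline falls and how far it must move to reach the next grid
line. It assumes a simplified model in which the leading is split equally
above and below the glyph area, and it takes each font’s ascent and descent
(as fractions of the em) as given. Those metrics, the LineMetrics type and
both function names are hypothetical: CSS does not expose any of this to
authors today.

```typescript
// Simplified model of a single line of text. ascent and descent are
// fractions of the em and are hypothetical inputs: CSS does not expose them.
interface LineMetrics {
  fontSizePx: number;
  lineHeightPx: number;
  ascent: number;
  descent: number;
}

// Distance from the top of the line box to the text's baseline, assuming
// the leading is split equally above and below the glyph area.
function baselineFromTop(m: LineMetrics): number {
  const contentHeightPx = (m.ascent + m.descent) * m.fontSizePx;
  const halfLeadingPx = (m.lineHeightPx - contentHeightPx) / 2;
  return halfLeadingPx + m.ascent * m.fontSizePx;
}

// Downward shift that puts the baseline on the next grid line, for a line
// box whose top edge sits topPx (assumed non-negative) below the grid origin.
function gridOffset(topPx: number, m: LineMetrics, gridPx: number): number {
  const baselineY = topPx + baselineFromTop(m);
  const past = baselineY % gridPx;
  return past === 0 ? 0 : gridPx - past;
}

const bodyText: LineMetrics = { fontSizePx: 16, lineHeightPx: 26, ascent: 0.75, descent: 0.25 };
const headingText: LineMetrics = { fontSizePx: 32, lineHeightPx: 52, ascent: 0.75, descent: 0.25 };

// Body baseline falls at 17px, so it needs a 9px shift to reach 26px.
const bodyShiftPx = gridOffset(0, bodyText, 26);
// Heading baseline falls at 34px, so it needs an 18px shift to reach 52px.
const headingShiftPx = gridOffset(0, headingText, 26);
```

A different typeface would bring different ascent and descent values and so
a different shift, which is why the offset has to follow whichever font is
actually in use.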

I tried aligning to a typographic baseline on http://adaptive-images.com,
and had to jump through many hoops, including writing a JavaScript function
to apply corrected margins to images so that the text underneath regained
alignment. Even without this complication results still vary: have a
look at that site in Firefox and then in Chrome or Safari. Notice how the
text’s baseline drifts off the document baseline between browsers? Aligning
type to a true typographic baseline is so hard to implement that it’s
effectively impossible on anything with dynamic content.

Received on Monday, 9 January 2012 15:15:00 UTC