
Re: CSS needs improvements for handling sentence spacing

From: Tab Atkins Jr. <jackalmage@gmail.com>
Date: Thu, 20 Dec 2012 12:54:49 -0800
Message-ID: <CAAWBYDBURGFqSBo8aqDw9Q0CwfQT3xpOaA4G5-TbZVEFA2qjtQ@mail.gmail.com>
To: "Thomas A. Fine" <fine@head.cfa.harvard.edu>
Cc: www-style list <www-style@w3.org>
On Thu, Dec 20, 2012 at 12:07 PM, Thomas A. Fine
<fine@head.cfa.harvard.edu> wrote:
> Suppose I have an interest in formatting additional space between sentences.
> I might want to do this because:
>   * I want to approximate the historic norm for published works from roughly
> 1650 to 1950
>   * There is some evidence to support that such spacing is helpful for new
> readers, people with certain learning disabilities, and more generally
> people who are speed-reading or scanning.
>   * Or just because I find it aesthetically pleasing
>
> The most popularly recommended solution is a non-CSS solution, to use the
> &nbsp; entity to add an extra non-collapsing space.  The non-breaking space
> messes up justification where it occurs at line wraps.  Even if this is
> changed to some other space entity, or just a regular space together with
> setting white-space to pre-wrap, this is still not a CSS solution, and
> allows for no fine-grained control.
>
> Unfortunately CSS lacks a reasonable approach for accomplishing this
> seemingly simple feat.  Using the box model on sentence spans does not work
> properly, because space added in the box model is not wrappable white space,
> and as such it messes up justification.
>
> One method does work, but in my opinion is not acceptable.  You can set
> word-spacing to a wide value for your divs or paragraphs, and then reset it
> back to a normal value for every sentence or other contained element.  This
> approach to me seems inverted.  You are effectively setting word-spacing to
> an incorrect value overall and then correcting it farther down.  This is
> confusing and error-prone.
>
> Possible CSS solutions:
>
> #1. If a sentence tag existed, then a "sentence-spacing" parameter could be
> used to adjust sentence spacing wherever two sentences touch.  In the
> absence of such a tag, it could still be possible for a user to specify
> another CSS parameter that describes which spans should be considered
> sentences, e.g. "sentence-span: .sntc" would consider anything of that class
> to be a sentence.
>
> #2. A generic inline spacing parameter could be created for all inline
> elements.  This would be more powerful because you could then customize
> spacing around any phrase or inline image or anything else, but also more
> complicated, as there would have to be some method for two connecting
> elements to negotiate the space between them (e.g. averaging, larger one
> wins, or that it only applies to leading or following space, etc.)
>
> #3. If an unambiguous full-stop entity (&fullstop) or unicode character
> existed, this could reliably be used to format sentences.  Alternatively
> this could also be an unambiguous inter-sentence space (&sentencespace).
>
> #4. You could also put a tag only around the white space between sentences,
> and use CSS to control that word-spacing.  This does solve the problem of
> inverting your model that my current word-spacing solution would have, but
> I'm not sure that having a piece of content consisting solely of white space
> makes any sense.
>
> #5. This could be handled through automatic sentence detection.
> Unfortunately these algorithms are not reliable because periods are
> ambiguous, and more generally control is not in the hands of the content
> creator.  On the other hand, it would instantly give sentence spacing
> ability to every existing HTML document with only a single change. Also,
> "period-newline" or "period-space-space" could be used as methods of
> detecting sentences more reliably within the HTML source.
>
> This approach could also be a fallback option within solution #1 above (e.g.
> "sentence-span: auto").  Or alternatively, with automatic sentence detection
> there could be a method of tagging only those things the algorithm might get
> wrong, e.g.  "<NotAFullStop>. </NotAFullStop>"
>
> For the record I prefer a sentence tag or span around actual sentences as
> this has the side-effect of providing semantic information on where each
> sentence starts and ends, and no other solution provides that.

This sounds like a reasonable request, but not one that CSS can do
automatically, as there's no way to generically tell when a sentence
ends just from simple text analysis.

Your suggested solution of a sentence wrapper is fine, and works today
- just wrap sentences in a <span class=sentence> and add a ".sentence
{ margin-right: .3em; }" rule to your stylesheet.
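
A minimal sketch of that wrapper approach, assuming the class name
"sentence" and the .3em value from the rule above (the sample text is
made up for illustration):

```html
<style>
  /* Extra space after each wrapped sentence; .3em is only a starting point. */
  .sentence { margin-right: .3em; }
</style>

<p>
  <span class="sentence">This is the first sentence.</span>
  <span class="sentence">This is the second sentence.</span>
</p>
```

The ordinary space between the two spans still collapses to a single
space, and the margin is added on top of it.  As the quoted message
points out, margin is not wrappable white space, so this may still
look off in justified text where a sentence ends at a line wrap.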

Your other suggested solution of using a special Unicode character
for the extra spacing is *also* fine, and also works today.  ^_^
Unicode has several larger space characters
<http://www.cs.tut.fi/~jkorpela/chars/spaces.html>, so you can just
choose one that feels appropriate.  I might recommend using an en
space &ensp;, as it seems to be about the right size.
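
A sketch of the en-space variant, assuming the &ensp; replaces the
regular space between sentences rather than being added next to it
(the sample text is made up for illustration):

```html
<!-- &ensp; is U+2002 EN SPACE; HTML white-space collapsing leaves it alone. -->
<p>This is the first sentence.&ensp;This is the second sentence.</p>
```

Swap in one of the other space characters from the page linked above
if the en space feels too wide or too narrow.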

~TJ
Received on Thursday, 20 December 2012 20:55:36 GMT
