Re: nbsp and typography rules Re: <NOBR> - Returning to the question....

On Thu, 1 Apr 2004, David Woolley wrote:

> > &nbsp; <del>wrong</del> poor man usage in French.
> >
> > 	<p>C'est incroyable&nbsp;!</p>
>
> This is also common amongst non-professional typists in the UK.

There aren't that many alternatives in practice. Both leaving a normal
space and not leaving any space are wrong solutions, and using a thin
space is questionable for a couple of reasons.

> > Typographic rules: You have often a “espace fine” between the character
>
> In my view, that space should be inserted by the renderer, not by
> the author, in the same way as the renderer may kern or ligature
> characters.

That would mean that the browser needs to know the language (either from
markup or by deducing from the content) and to know the punctuation rules
of the language used and to process the text to see whether the rules need
to be applied. I'm afraid this is asking for too much. Even version 3.0 of
the Unicode standard, while mentioning the spacing issue, contained an
example of a quotation in French, with no spacing between the punctuation
marks and the quoted text. If they didn't get it right (and the example
has now been removed in version 4), how could browsers generally deal with
such things?
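
To make the amount of processing concrete, here is a minimal sketch in
Python of what such a renderer-side rule might look like. It is only an
illustration under stated assumptions, not what any browser does: the
function name, the choice of just four punctuation marks, and the use of
the no-break space as the inserted character are all hypothetical.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE, used here as the practical fallback

# Assumption: only ! ? ; : are handled. Guillemets and other cases of
# French spacing are ignored in this sketch.
_HIGH_PUNCT = re.compile(r"(?<=\w)[ \u00a0]*([!?;:])")


def space_french_punctuation(text: str, lang: str, space: str = NBSP) -> str:
    """Put `space` before ! ? ; : when `lang` indicates French.

    Any ordinary or no-break spaces already typed before the mark are
    replaced, so the result is the same whether or not the author
    inserted a space.
    """
    if not lang.lower().startswith("fr"):
        return text  # other languages are left untouched
    return _HIGH_PUNCT.sub(lambda m: space + m.group(1), text)
```

Even this toy rule needs the language as input, and it would wrongly put
a space into things like "12:30" or the colon of a URL, which is part of
why I think it asks too much of browsers.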

Punctuation is part of textual content. It is odd that Unicode has no
character for the French espace fine. The thin space won't do, since it
explicitly permits a line break after it, and the narrow no-break space is
of undefined width and seems to be intended for use in other writing
systems. It is not feasible to consider it as just a presentational
variant of a space (or no-break space), since French uses both the normal
space and the fine space, and the difference is not comparable to, say,
variation of the width of a space by glyph or due to justification.

So there's no simple solution. The practical solution at present is to decide
between no spacing and a no-break space, though in special cases (e.g., in
headings where such things really matter), some complicated methods might
be considered (see http://www.cs.tut.fi/~jkorpela/html/french.html ).

What this implies, in my opinion, is that the problem is inherently at the
character level and should be fixed by adding a fine space character to
Unicode. Meanwhile, there is little to be done in HTML. In theory, if
entities were really used in the manner outlined in the SGML standard, a
new entity, say &fsp;, could be added, with a definition that makes it
expand to &#x2009;&#x2060; (i.e., thin space followed by word joiner), to be
maybe replaced by a new definition using a new character, and to be
implemented in browsers using a different definition (say, just no-break
space) if they cannot implement the proper definition. But this would
hardly be a working solution to anything.
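
Purely to illustrate the idea, the expansion of such an entity could be
sketched as below in Python. Note that &fsp; is the hypothetical entity
from the paragraph above and does not exist in HTML; the function name
and the can_render_thin_space flag are likewise assumptions made for the
example.

```python
THIN_SPACE = "\u2009"   # U+2009 THIN SPACE (a line break is allowed after it)
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER (prevents that line break)
NBSP = "\u00a0"         # U+00A0 NO-BREAK SPACE, the fallback definition


def expand_fsp(text: str, can_render_thin_space: bool = True) -> str:
    """Expand the hypothetical &fsp; entity reference in `text`.

    The "proper" definition is thin space followed by word joiner; an
    implementation that cannot handle it uses a plain no-break space.
    """
    if can_render_thin_space:
        replacement = THIN_SPACE + WORD_JOINER
    else:
        replacement = NBSP
    return text.replace("&fsp;", replacement)
```

For example, expand_fsp("C'est incroyable&fsp;!") yields the text with
U+2009 U+2060 before the exclamation mark, and passing
can_render_thin_space=False yields it with U+00A0 instead.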

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Thursday, 1 April 2004 04:22:46 UTC