Re: <NOBR> - Returning to the question....

On Tue, 30 Mar 2004, L. David Baron wrote:

> The hyphen-minus character
> gets its own character class (HY), and breaks between HY followed by NU
> (numeric character class) are forbidden.  See rule LB18 in [2].

That's simply insufficient, since the hyphen-minus may well act as a minus
sign in an algebraic expression like "-a".

Moreover, it seems that the programmers of IE did not get the finer points
- IE happily breaks between a hyphen-minus and a digit. I don't blame them
too much, except for _attempting_ to implement something that shouldn't
have been implemented at all, and surely not in a clueless manner that
breaks two-character strings too, or in a manner that gets the rules
wrong.

> In summary, UAX #14 recommends that a line break be allowed before a
> hyphen when the hyphen is preceded by a space, and after a hyphen if the
> hyphen is followed by something other than a number.

Pointless guesswork. In markup like HTML, we should not assume very much
about the textual content. The good old rules of HTML 2.0 permitted line
breaks at whitespace, and possibly after a hyphen (which was risky if
applied improperly), and this should be enough _until you have something
better_ that does not break too often. (Permitted line breaks in non-Latin
writing systems are a different issue and deserve due consideration on a
basis of observing the specifics of those writing systems, rather than
Unicode linebreaking confusion.)

So if I write "the normal plural suffix in English is '-s'", is it quite
OK for a browser to insert a line break after the hyphen-minus? And am I
supposed to introduce artificial <span> markup and some CSS code just to
prevent that? Isn't it enough to use <nobr>? It's not some funny span element
I have in mind with some optional suggestion; I know pretty well that I
just want to prevent the wrong line break. (And using the hyphen
character, U+2010, would not help here. The Unicode linebreaking rules
permit a line break after it.)
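
To make the objection concrete, here is a minimal sketch of the rule as
summarized in the quoted text (a break is allowed after a hyphen-minus
unless a digit follows). The function name is hypothetical, and this
models only that one-sentence summary, not the full UAX #14 algorithm or
what any browser actually does.

```python
def breaks_after_hyphen(text):
    """Return the indexes of hyphen-minus characters after which the
    simplified rule would allow a line break.

    Assumption: only the quoted summary is modelled, i.e. a break is
    allowed after "-" when the next character exists, is not whitespace,
    and is not a digit. Everything else in UAX #14 is ignored.
    """
    allowed = []
    for i, ch in enumerate(text):
        if ch != "-":
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt and not nxt.isspace() and not nxt.isdigit():
            allowed.append(i)
    return allowed


# The digit case is protected, the algebraic and suffix cases are not.
print(breaks_after_hyphen("-5"))
print(breaks_after_hyphen("-a"))
print(breaks_after_hyphen("suffix is '-s'"))
```

Under these assumptions the rule protects "-5" but still reports a break
opportunity inside "-a" and inside '-s', which are exactly the cases an
author would want to keep together.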

> This seems like a reasonable compromise to me.

But this is not about the reasonability of compromises (which is debatable
as I've explained). It's about implicitly (or is there an explicit
statement?) imposing Unicode line breaking rules on HTML documents, without
even giving authors any useful options of overriding them in simple HTML
markup.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Tuesday, 30 March 2004 17:44:28 UTC