- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Tue, 30 Mar 2004 23:12:41 +0300 (EEST)
- To: www-html@w3.org
On Tue, 30 Mar 2004, L. David Baron wrote:

> The hyphen-minus character
> gets its own character class (HY), and breaks between HY followed by NU
> (numeric character class) are forbidden. See rule LB18 in [2].

That's simply insufficient, since the hyphen-minus may well act as a minus sign in an algebraic expression like "-a". Moreover, it seems that the programmers of IE did not get the finer points - IE happily breaks between a hyphen-minus and a digit. I don't blame them too much, except for _attempting_ to implement something that shouldn't have been implemented at all, and surely not in a clueless manner that breaks two-character strings too, or in a manner that gets the rules wrong.

> In summary, UAX #14 recommends that a line break be allowed before a
> hyphen when the hyphen is preceded by a space, and after a hyphen if the
> hyphen is followed by something other than a number.

Pointless guesswork. In markup like HTML, we should not assume very much about the textual content. The good old rules of HTML 2.0 permitted line breaks at whitespace, and possibly after a hyphen (which was risky if applied improperly), and this should be enough _until you have something better_ that does not break too often. (Permitted line breaks in non-Latin writing systems are a different issue and deserve due consideration on the basis of observing the specifics of those writing systems, rather than Unicode linebreaking confusion.)

So if I write "the normal plural suffix in English is '-s'", is it quite OK for a browser to insert a line break after the hyphen-minus? And am I supposed to introduce artificial <span> markup and some CSS code just to prevent that? Isn't it enough to use <nobr>? It's not some funny span element I have in mind with some optional suggestion; I know pretty well that I just want to prevent the wrong line break. (And using the hyphen character, U+2010, would not help here. The Unicode linebreaking rules permit a line break after it.)

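To make the objection concrete, here is a minimal Python sketch of just the "after a hyphen" half of the rule as summarized in the quoted text. It is not an implementation of UAX #14: the function name hyphen_breaks is made up for illustration, only the ASCII hyphen-minus is considered, and "number" is assumed to mean "a digit character".

```python
def hyphen_breaks(text: str) -> list[int]:
    """Offsets at which the simplified rule would allow a break after a hyphen.

    Assumption: a break is allowed right after '-' whenever the next
    character exists and is not a digit. Everything else in the real
    line breaking rules is deliberately ignored.
    """
    breaks = []
    for i, ch in enumerate(text):
        if ch != "-":
            continue
        if i + 1 >= len(text):
            continue  # nothing follows the hyphen, so nothing to split off
        if not text[i + 1].isdigit():
            breaks.append(i + 1)  # break would fall between '-' and the next char
    return breaks


# The suffix "-s" and the algebraic "-a" both get a break opportunity,
# while "-5" does not.
for sample in ("-s", "-a", "-5"):
    print(repr(sample), hyphen_breaks(sample))
```

Under these assumptions the two-character strings "-s" and "-a" can each be split after the hyphen-minus, which is exactly the wrong line break described above; only the digit case is protected.
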
> This seems like a reasonable compromise to me.

But this is not about the reasonability of compromises (which is debatable, as I've explained). It's about implicitly (or is there an explicit statement?) imposing Unicode line breaking rules on HTML documents, without even giving authors any useful options for overriding them in simple HTML markup.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Tuesday, 30 March 2004 17:44:28 UTC