- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Thu, 1 Apr 2004 11:08:08 +0300 (EEST)
- To: www-html@w3.org
On Wed, 31 Mar 2004, Ernest Cline wrote:

> > I don't see why authors should use poorly supported tricky
> > characters instead of simple markup that has worked for years,
>
> Poorly supported?

Indeed. I call a character poorly supported if the vast majority of users use browsers and settings that make a rectangle or a question mark appear in place of a character that should be an invisible joiner.

> Perhaps by poor implementations, but again
> these two characters have been around since at least Unicode 1.1.

So? How long has the soft hyphen been in character standards, including the ISO 8859 set and Unicode? Just putting something into a standard does not magically turn it into a reality that specifications in other areas could rely on.

> Any implementation of Unicode should support ZWNJ and ZWJ.
> Now that I've had time to reflect on this, ZWSP and ZWNBSP are
> really the preferred characters to do this as they affect only
> line-breaking and nothing else.

Quite right. So even you had to take some time to find out the really preferred characters. How about the vast majority of authors, who have little or no idea of any character standards or of any characters outside those normally used in the language(s) they ordinarily write? There's a huge difference between using (correctly) the invisible Unicode characters and just saying what you really mean, <nobr>...</nobr>.

> What practical reasons?

The practical reasons are that <nobr> works almost always and causes no problems when it doesn't, whereas the Unicode special characters, in addition to being far more complex for authors to understand, mostly do not work and usually break miserably when they don't.

> Large areas of non-breaking behavior are
> stylistic and should be handled as such.

So you mean there's some virtue in using <span style="white-space: nowrap">-1</span> instead of <nobr>-1</nobr>?
I won't go into the details of white-space, which has always been poorly defined in CSS and still is (how does white-space play when there is no white space?). The main question is what benefit the more complicated markup is supposed to bring. It surely isn't more semantic; at the markup level, it only says "here is some inline content to which some style is attached". It by definition works less often, since CSS can be turned off.

> Isolated incidents of overriding
> the default behavior for semantic reasons can be handled via the
> ZWSP, ZWNJ, ZWJ, and ZWNBSP characters from Unicode

That might be what a theory says. But it's much more complex and does not work in practice, and it won't work widely enough for many years to justify the use of these characters on normal Web pages.

By the way, what about non-isolated incidents, like text containing lots of strings with all kinds of characters that permit a line break before or after them by the Unicode rules, like a-15%6/h\z?a (e.g., examples of passwords), but that should not be broken into two lines? Should the author study, for each character, the Unicode rules, and maybe browsers' (mis)behavior too, to decide which characters need some linebreak prevention character before or after, or should he just put them between any two characters?

-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
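To make the password example concrete, here is a minimal sketch, not anything taken from the thread, of the two options an author would face. It assumes the brute-force strategy of putting U+FEFF (ZWNBSP) between every pair of characters; the function names glue and nobr are made up for illustration, and whether a given browser honors the joiner is exactly the open question discussed above.

```python
import html

# U+FEFF ZERO WIDTH NO-BREAK SPACE, the "ZWNBSP" mentioned in the thread.
ZWNBSP = "\uFEFF"


def glue(text, joiner=ZWNBSP):
    """Brute force: put a no-break character between every two characters."""
    return joiner.join(text)


def nobr(text):
    """Markup alternative: escape the text and wrap it in <nobr>...</nobr>."""
    return "<nobr>" + html.escape(text) + "</nobr>"


# The sample string from the message (raw string keeps the backslash literal).
sample = r"a-15%6/h\z?a"

glued = glue(sample)

print(len(sample), len(glued))  # original length vs. length with joiners
print(nobr(sample))
```

The glued string carries one invisible character for each gap between visible ones, so an author has to generate it with a tool and cannot check it by eye, whereas the <nobr> version stays readable in the page source.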
Received on Thursday, 1 April 2004 03:08:22 UTC