- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Thu, 1 Apr 2004 12:22:14 +0300 (EEST)
- To: www-html@w3.org
On Thu, 1 Apr 2004, David Woolley wrote:

> > <del>wrong</del> poor man usage in French.
> >
> > <p>C'est incroyable !</p>
>
> This is also common amongst non-professional typists in the UK.

There aren't that many alternatives in practice. Both leaving a normal space and not leaving any space are wrong solutions, and using a thin space is questionable for a couple of reasons.

> > Typographic rules: You have often a “espace fine” between the
> > character
>
> In my view, that space should be inserted by the renderer, not by
> the author, in the same way as the renderer may kern or ligature
> characters.

That would mean that the browser needs to know the language (either from markup or by deducing it from the content), to know the punctuation rules of that language, and to process the text to see whether the rules need to be applied. I'm afraid this is asking for too much.

Even version 3.0 of the Unicode standard, while mentioning the spacing issue, contained an example of a quotation in French with no spacing between the punctuation marks and the quoted text. If they didn't get it right (and the example has now been removed in version 4), how could browsers generally deal with such things?

Punctuation is part of textual content. It is odd that Unicode has no character for the French espace fine. The thin space won't do, since it explicitly permits a line break after it, and the narrow no-break space is of undefined width and seems to be intended for use in other writing systems. It is not feasible to treat the fine space as just a presentational variant of a space (or no-break space), since French uses both the normal space and the fine space, and the difference is not comparable to, say, variation of the width of a space by glyph or due to justification. So there's no simple solution.
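To make the character-level options concrete, here is a minimal Python sketch. It only lists the candidate characters by their Unicode names and shows the crude fallback of replacing an ordinary space before high punctuation with a no-break space. The names CANDIDATES, describe and french_spacing are hypothetical, chosen for illustration, and the set of punctuation marks handled (! ? ; :) is an assumption, not a complete statement of French typographic rules.

```python
import re
import unicodedata

# Candidate space characters discussed above (standard Unicode code points).
CANDIDATES = ["\u0020", "\u00A0", "\u2009", "\u202F"]

NO_BREAK_SPACE = "\u00A0"

# Assumption: only an existing ordinary space directly before ! ? ; : is
# touched, so text written without a space (e.g. "http://") is left alone.
_SPACE_BEFORE_PUNCT = re.compile(r" (?=[!?;:])")


def describe(chars):
    """Return (code point, Unicode name) pairs for the given characters."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]


def french_spacing(text, replacement=NO_BREAK_SPACE):
    """Replace a normal space before high punctuation with `replacement`."""
    return _SPACE_BEFORE_PUNCT.sub(replacement, text)


if __name__ == "__main__":
    for code_point, name in describe(CANDIDATES):
        print(code_point, name)
    print(repr(french_spacing("C'est incroyable !")))
```

Passing a different `replacement` (for instance "\u202F") swaps in another candidate, but the sketch cannot address the width and line-breaking concerns raised above; those depend on the character definitions and on the renderer.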
The practical solution at present is to decide between no spacing and a no-break space, though in special cases (e.g., in headings, where such things really matter), some complicated methods might be considered (see http://www.cs.tut.fi/~jkorpela/html/french.html ).

What this implies, in my opinion, is that the problem is inherently at the character level and should be fixed by adding a fine space character to Unicode. Meanwhile, there is little to be done in HTML. In theory, if entities were really used in the manner outlined in the SGML standard, a new entity, say &fsp;, could be added, with a definition that makes it expand to &#x2002;&#x2060; (i.e., en space followed by word joiner). That definition could later be replaced by one using a new character, and browsers that cannot implement the proper definition could use a different one (say, just a no-break space). But this would hardly be a working solution to anything.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Thursday, 1 April 2004 04:22:46 UTC