- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Tue, 30 Mar 2004 20:53:10 +0300 (EEST)
- To: www-html@w3.org
On Tue, 30 Mar 2004 olafBuddenhagen@web.de wrote:

> On Mon, Feb 09, 2004 at 10:46:45PM +0200, Jukka K. Korpela wrote:
>
> > However, regarding HTML, the question arises whether <nobr> should be
> > regarded as structural, at least when used for expressions like %7E,
> > which may _change meaning_ when broken into % and 7E.
>
> This should be handled by <code>,

I thought we went through this when the discussion was active. There is
nothing in the definition of <code> that relates to line breaking. Some
computer code systems (or "languages") might have their own rules, but
there isn't even in principle a way to indicate the code system used
inside <code>.

> not by a <nobr>, which is purely presentational.

This was discussed too. Is purely presentational? If it is, shouldn't it
be deprecated in favor of CSS?

> In semantical markup, the question you need to ask
> yourself is always: *Why* do I not want this to be broken? Because it is
> code!

No, it's because its meaning changes if a line break is inserted. Whether
it is code or not (whatever that means in detail) is orthogonal to this.

> Generally, WHY is the key to semantical markup...
>
> > Or for expressions like -1.
>
> If Unicode linebreaking rules are any good (I do not know them),

Sorry, but if you don't know them, you don't even understand the problem.
Still less can you evaluate the suggested solutions.

> the
> problem is actually a different one: Nobody but professional typesetters
> do know and respect the five or so different types of dash-like
> characters, all fulfilling a different purpose, and all having a
> different character code in Unicode (I guess).

That's a problem (to the extent that it is true), and it indeed is a
completely separate problem.

> However, once you actually start to consider the fact that -1 shouldn't
> be broken, you'll probably also consider the fact that minus is
> something different than a dash or a hyphen...
What makes you think that in "-1", the "-" is inevitably just a surrogate
for minus? Besides, the Unicode standard actually defines "-" as
hyphen-minus, as a character with dual (or actually multiple) usage. Yet
the Unicode line breaking rules play their own game, forgetting that
duality.

> Anyways, I do not see any good solution for this. We probably can't
> teach every web author to use Unicode correctly,

Well, HTML is already based on Unicode as regards characters. But it need
not adopt all the strange definitions like line breaking rules, which
mostly just break things. And using <nobr> is a very simple solution,
already implemented. It does not prevent the elaboration of more
sophisticated methods, if desired.

Reluctance to make <nobr> part of the specification conveys a message:
authors are not expected to defeat clueless line breaking algorithms
applied by browsers, except perhaps in a clumsy way by making optional
presentational suggestions in CSS, which usually means adding extra
<span> markup. (Effectively, <span> with style="..." or class="..."
mostly indicates either lack of suitable markup, or lack of attempts to
find suitable markup. Which one would
<span style="white-space:nowrap">-a</span> be?)

> but we can't ignore
> Unicode either if we want to have any reasonable language handling...

Huh? Language handling surely depends on quite different issues. Mostly,
it is about building actual support for languages into browsers.

> Overriding Unicode rules won't do.

If Unicode line breaking rules are regarded as something that should not
be overridden, the results will be grotesque. They are already bad as
what they were probably meant to be: a simple general basis upon which
you could build your own line breaking rules, if you find the basis
suitable.

> > if a document e.g. discusses the command "rm -r /usr/spool/foo",
>
> <code> again.

And how do you expect or want that to affect line breaking?

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Tuesday, 30 March 2004 12:53:23 UTC