- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Sun, 4 May 2003 09:10:30 +0300 (EEST)
- To: www-style@w3.org
On Sat, 3 May 2003, fantasai wrote:

> What I describe as 'strict' could easily be a UA's 'normal' behavior.

I have no idea what you mean by that. The proposed values "strict" and "normal" are distinct, and that's an essential distinction. Are you saying that "normal" could be treated by a UA as the same as "strict"?

> 'normal', however, allows the UA more freedom to define its algorithm,

Any value allows considerable freedom in the actual layout algorithms, since the value only specifies _permitted_ line breaking points.

> as long as it keeps within the limits set by UAX 14.

This raises a question of quality. If we take the extreme position that the line breaking properties really specify line breaking opportunities only, then a browser that presents any paragraph as a single line would conform. On the other hand, one might say that any line that exceeds the available width _must_ be broken by the UA if there is a line breaking opportunity inside it, though the UA could still make its own decision on _where_ to break it. I'll skip fine tuning here; advanced layout algorithms may accept lines that exceed the overall line length limit by a small amount, if this considerably improves the situation on other lines. (It _is_ fine tuning and cannot refute my argument.)

This implies that if a value is defined by a reference to UAX 14, then implementations _must_ use _any_ UAX 14 line breaking opportunity at least in the situation where that is the only way to deal with a particular line that would otherwise exceed a limit.

> As a simplistic example, let us define an algorithm which only allows
> breaks at spaces and after hyphens.

I don't see the relevance of describing a particular algorithm here.

> You can, of course, extrapolate this to great complexity, and it will still
> satisfy the requirements of 'normal' line breaking.

Maybe under _some_ definition of "normal".
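For concreteness, the quoted simplistic algorithm (breaks only at spaces and after hyphens) might be sketched as follows. The function name `break_opportunities` and the index convention (an index i means "a break is allowed before character i") are assumptions made for illustration; neither the quoted proposal nor UAX 14 defines them.

```python
def break_opportunities(text):
    """Return every index i where a line break is allowed before text[i].

    Hypothetical sketch of the quoted algorithm: a break is allowed only
    after a space or after a hyphen, and never in the middle of a run of
    spaces.
    """
    return [
        i
        for i in range(1, len(text))
        if text[i - 1] in (" ", "-") and text[i] != " "
    ]


# Breaks are found after the hyphen and after the space.
print(break_opportunities("well-known fact"))                # [5, 11]

# A string with neither spaces nor hyphens offers no break at all,
# however long it is.
print(break_opportunities("http://example.org/a/b/c.html"))  # []
```

The second call shows the consequence taken up next: a string without spaces or hyphens has no permitted break point under this algorithm.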
Note that this would mean that the algorithm would not split a 2000-character URL if it does not contain a hyphen. That is, it would not e.g. apply the UAX 14 rule that permits a split after a solidus (slash, "/").

Actually, unless I have missed something in UAX 14, its rules (and thus your proposed "normal") do _not_ always permit a line break at (or, technically, after) a space character. For example, "it's in directory /usr/spool" must not, under UAX 14, be split as

  it's in directory
  /usr/spool

since a break before "/" is never permitted, but the following split would be allowed:

  it's in directory /
  usr/spool

Thus your example algorithm would presumably not qualify as UAX 14 conformant.

> However, it can't be used for 'strict' line breaking because it allows
> breaking after a hyphen, which 'strict' does not.

Whether a hyphen (or hyphen-minus) is a permissible line breaking point is an important decision. It was a bad move to make some UAs treat it as permissible by default, and I don't think we should take great pains to retrofit things into such misbehavior. But it's probably a common enough need to deserve a value in the relevant CSS property. And for obvious reasons, using such a value, i.e. asking a UA to break after a hyphen when needed, should not open the Pandora's box of UAX 14 rules.

On the other hand, I presume that it would be relatively simple, both in terms of specifications and in actual implementations, to allow values that involve lists (sets) of characters, enumerating the characters after which a line break is permitted. In typical cases, when you have a long string that contains special characters, there is a fairly limited set of characters that are really suitable break points.

Such a value should probably involve the principle that no word (a string of non-whitespace characters separated by whitespace characters) shall be split so that only one or two characters are left at the end of a line or at the start of a line.
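A minimal sketch of such a character-set value, under stated assumptions: the names `allowed_breaks`, `break_after` and `min_fragment` are hypothetical, and the "one or two characters" principle is simplified to a check against the two ends of the word (it does not consider what happens when several breaks within one word are used at once). Only breaks inside words are reported; breaks at whitespace are left out.

```python
import re


def allowed_breaks(text, break_after, min_fragment=3):
    """Return indices i where a break inside a word is allowed before text[i].

    A break is allowed after any character listed in break_after, provided
    that at least min_fragment characters of the word remain on each side
    of the break.
    """
    breaks = []
    # A "word" is a maximal run of non-whitespace characters.
    for match in re.finditer(r"\S+", text):
        start, end = match.span()
        for i in range(start + 1, end):
            if (
                text[i - 1] in break_after
                and i - start >= min_fragment  # part left at end of line
                and end - i >= min_fragment    # part starting the next line
            ):
                breaks.append(i)
    return breaks


print(allowed_breaks("myfiles/foo.txt", {"/"}))        # [8]
print(allowed_breaks("-a", {"-"}))                     # []
print(allowed_breaks("see a/b and usr/spool", {"/"}))  # [16]
```

With "/" as the only listed character, "myfiles/foo.txt" may be split after the slash, while "a/b" and "-a" are left intact because a fragment shorter than three characters would result.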
Such a principle would disallow e.g. breaking "-a" into "-" and "a". It's not just a quality of implementation issue. Who would dare to invoke a line breaking method if it is allowed to result in such splits?

> Do you still disagree that 'normal' should be the default?

Of course. I haven't seen a single argument _in favor of_ making it the default. Far from being anything intuitively normal, your proposed "normal" builds upon a complicated and artificial set of rules, in a rather obscure way that does not even say whether the rules are to be applied or not in practice. Even more importantly, it would seriously deviate from established practice and both authors' and users' expectations and would even distort information.

People who have authored documents have had no reason to expect that UAs would some day start treating a string like "myfiles/foo.txt" as something that can be split into "myfiles/" and "foo.txt" (on different lines) by a UA.

I would appreciate the possibility of having, say, long URLs split according to some rules, in case I need to enter URLs into my documents as text (e.g., when discussing URLs in the document). But I would prefer knowing the rules then, and having them a few orders of magnitude simpler than UAX 14, and having them enabled on my command only. If I know that line breaks may occur, I can take suitable precautions like using "<" and ">" delimiters, or something. The bulk of existing Web pages have _not_ been authored with such precautions. And it would not be adequate if authors needed to consider all text as a potential victim of "normal" line breaking.

--
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Sunday, 4 May 2003 07:35:15 UTC