- From: Jukka K. Korpela <jukka.k.korpela@kolumbus.fi>
- Date: Sun, 08 Mar 2015 16:42:36 +0200
- To: public-html@w3.org
2015-03-08, 12:38, Andrea Rendine wrote:

> Something like a couple years ago, I filed a bug on the WHATWG HTML
> Living Standard project, complaining about how a tag was meant to be
> rendered. I received a completely unsatisfying answer, which was
> basically meant to tell me to wait. Enough waiting now.

WHATWG and W3C are two different entities, though in cooperation. If you want to have WHATWG HTML changed, you need to talk to WHATWG, I think.

I cannot comment on the content of your proposal, as you don’t quote or cite it.

> In the spec, <wbr> is meant to be a line break opportunity.

That’s a good informal description, but rather unsatisfactory as a definition. I think <wbr> should be defined so that its effect is equivalent to the presence of the ZERO-WIDTH SPACE U+200B character. This would still require a description of the effect of this character; it should not mean just that a browser may break a line; rather, the conditions for doing so should be specified as part of the layout algorithm. But this would be rather straightforward.

(In practice, support for <wbr> predates support for U+200B by many years. As a logical description, however, it is natural to base the definition of <wbr> on a standardized character designed for similar use.)

> The two examples provided are quite different (a prose with a curious
> streamlike formatting, and a code fragment), but this underlines how
> different use cases can be.

Both examples are meaningful, but rather special. Somewhat more normal examples would be

input/<wbr>output
choo-<wbr>choo

> Now, when reading the prose I thought to myself that this can be a
> convenient way to insert hyphenated word breaks, as most languages
> break words with a - mark.

Such a convention is far less universal than you might think. In any case, hyphenation is very different from a conditional line break.
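To make the distinction between a break opportunity and a hyphenation point concrete, here is a minimal sketch in Python. It is not how any browser lays out text: the helper names `wbr_to_zwsp` and `lines_if_broken` are made up for illustration, the `<wbr>` handling is a naive string replacement rather than HTML parsing, and the sketch simply assumes a break is taken at every marked point.

```python
from typing import List

ZWSP = "\u200b"  # ZERO WIDTH SPACE: break opportunity, nothing is shown at the break
SHY = "\u00ad"   # SOFT HYPHEN: hyphenation point, a hyphen is shown at the break


def wbr_to_zwsp(markup: str) -> str:
    """Model <wbr> as U+200B (hypothetical helper, plain string replacement)."""
    return markup.replace("<wbr>", ZWSP)


def lines_if_broken(text: str) -> List[str]:
    """Return the visible lines if a break is taken at every marked point."""
    lines = []
    current = ""
    for ch in text:
        if ch == ZWSP:
            # Break opportunity: the line ends as it is, no character is added.
            lines.append(current)
            current = ""
        elif ch == SHY:
            # Hyphenation point: a visible hyphen ends the line.
            lines.append(current + "-")
            current = ""
        else:
            current += ch
    lines.append(current)
    return lines


print(lines_if_broken(wbr_to_zwsp("input/<wbr>output")))  # ['input/', 'output']
print(lines_if_broken(wbr_to_zwsp("choo-<wbr>choo")))     # ['choo-', 'choo']
print(lines_if_broken("hy" + SHY + "phen"))               # ['hy-', 'phen']
```

The only difference between the two branches is the hyphen added at the end of the line, which is exactly why one cannot stand in for the other.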
When Netscape invented <wbr>, they chose a very poor name, since <wbr> isn’t really Word BReak except when used for text in a language that does not use hyphens to indicate line breaks. It’s more like an allowed break in a string.

> <wbr> would remove the need for a &shy; soft hyphen.

String break opportunities and word hyphenation are very different things. The SOFT HYPHEN U+00AD character is meant to be used to interfere with automatic hyphenation (usually to deal with cases that it would hyphenate incorrectly or would fail to hyphenate at all) or, in some cases, to specify hyphenation points in a situation where automatic hyphenation cannot be used.

> An alternative to good ol' U+173 Soft Hyphen is needed because such
> character is copypasted along with the text itself, and can be fetched
> by data-mining tools, while it isn't strictly part of content. <wbr>,
> on the other side, is markup and wouldn't appear where only text is to
> be fetched.

The soft hyphen has been with us for many years, about 20 years in HTML I would say, though support for it has been surprisingly limited. Any software that reads HTML data needs to cope with it.

Copypaste has many similar problems already. Or, rather, it is not a problem with copypaste but with using the pasted text in a particular environment. It is all OK for plain text to contain soft hyphens. If you paste text in an environment where e.g. soft hyphens or other invisible characters might cause problems, you just need to deal with it. Even if soft hyphens were strictly forbidden now, you would still have zillions of them in existing data.

> Now, according to the idea of the spec, an author could style such
> element so that its rendering fits to the purpose. For instance, <wbr>
> inside <code> could be rendered as zero width space, while <wbr> in
> paragraphs could become a soft hyphen.

I cannot see where you got that idea, but in any case, <wbr> is not meant to become a soft hyphen.
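As an illustration of "dealing with it" on the receiving side, pasted or mined text can be cleaned before use. The following is only a sketch: the function name `clean_pasted_text` and its `invisible` parameter are assumptions made for this example, and which characters count as unwanted depends entirely on the environment.

```python
SOFT_HYPHEN = "\u00ad"
ZERO_WIDTH_SPACE = "\u200b"


def clean_pasted_text(text: str, invisible: str = SOFT_HYPHEN) -> str:
    """Return text with every character listed in `invisible` removed.

    By default only SOFT HYPHEN is removed; pass more characters
    if the target environment cannot cope with them either.
    """
    # str.translate drops every code point that is mapped to None.
    return text.translate({ord(ch): None for ch in invisible})


pasted = "hy\u00adphen\u00adation"
print(clean_pasted_text(pasted))                    # hyphenation
print(len(pasted), len(clean_pasted_text(pasted)))  # 13 11
```

The default is deliberately narrow, since some invisible characters carry meaning in the text and should not be removed blindly.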
> Now, in the last 2 years i saw no improvements in the implementation,
> by UAs, of the "content" property applied to real (non pseudo) elements.

There are no specifications on such things either, and there has not been work on the issue for many years, as far as I can see.

> What is important now is that <wbr> doesn't work on its own.

It does its own job well.

> It breaks words, right,

No, it just declares a permitted line breaking point in a string.

> but any empty span element would do the trick.

No, empty spans do not do such things; foo<span></span>bar acts as foobar as far as layout is concerned.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
Received on Sunday, 8 March 2015 14:43:06 UTC