Re: Bug? "Hyphen" character rendered in place of "soft hyphen" in most browsers except Safari and Opera

From: Jonathan Kew <jonathan@jfkew.plus.com>
Date: Sat, 9 Jul 2011 20:59:31 +0100
Cc: www-font <www-font@w3.org>
Message-Id: <DB7A1E11-6FF3-46E9-A491-3B2293E7EF5F@jfkew.plus.com>
To: list.adam@twardoch.com

On 9 Jul 2011, at 17:21, Adam Twardoch (List) wrote:

> Dear www-font list members,
> 
> My colleague, type designer Lukasz Dziedzic, and I discovered today a
> rather irritating problem. It appears that the major web browsers
> (Firefox, Chrome, IE) do not correctly render the "soft hyphen"
> character when hyphenation is used. When soft hyphens (&shy; or U+00AD)
> are inserted into the HTML code, those browsers incorrectly render the
> font's "hyphen" character (U+002D) in the place of the soft hyphens,
> instead of rendering the font's "soft hyphen" character (U+00AD).
> 
> This has not been obviously visible so far because in most fonts, the
> glyphs for U+00AD and U+002D are identical, or the fonts have used the
> same glyph for both codepoints. However, we've made a test font in which
> the glyphs for these characters are visibly different, and discovered
> that Firefox, Chrome and IE misbehave. Safari and (to some extent) Opera
> perform correctly, on the other hand.
> 
> We've documented the case extensively, along with the sample font and
> the screenshots, and additional explanations. Please kindly take a look at:
> http://www.twardoch.com/webfonts/2011-07-softhyphenbug/softhyphenbug.html
> 
> I don't know what the best way is to report the bugs to three different
> browser vendors (Mozilla, Google and Microsoft), so I thought this list
> might be the best place to post this information. Please kindly take it
> up and forward the problem to the appropriate channels within each
> browser vendor.

According to UAX #14,

<quote src="http://unicode.org/reports/tr14/#SoftHyphen">
Unlike U+2010 hyphen, which always has a visible rendition, the character U+00AD soft hyphen (shy) is an invisible format character that merely indicates a preferred intraword line break position. If the line is broken at that point, then whatever mechanism is appropriate for intraword line breaks should be invoked, just as if the line break had been triggered by another hyphenation mechanism, such as a dictionary lookup. Depending on the language and the word, that may produce different visible results, for example:

	 Simply inserting a hyphen glyph
	 Inserting a hyphen glyph and changing spelling in the divided word parts
	 Not showing any visible change and simply breaking at that point
	 Inserting a hyphen glyph at the beginning of the new line
</quote>

it sounds as though U+00AD is not expected to have any visible representation of its own (except perhaps in a special "show invisible format characters" mode, such as might be used to make ZWJ, ZWNJ, CGJ, WJ, LRO, RLO, PDF, etc. visible too). I think that when browsers perform hyphenation, they should be expected to render a "normal" hyphen character (either U+2010 or U+002D would be a reasonable choice), rather than expecting fonts to provide an appropriate glyph at U+00AD.
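
To make that reading concrete, here is a minimal Python sketch of the behaviour I would expect. It treats U+00AD purely as a break opportunity and covers only the first of the four UAX #14 outcomes quoted above (simply inserting a hyphen glyph). The function name `render_word`, the `available` width counted in characters, and the default choice of U+2010 are all assumptions made for illustration; this is not how any particular browser implements line breaking.

```python
SOFT_HYPHEN = "\u00ad"  # invisible format character marking a break opportunity


def render_word(word, available, hyphen="\u2010"):
    """Lay out one word in `available` columns (hypothetical helper).

    Returns a list of lines. Soft hyphens never appear in the output:
    an unused one is dropped, and a used one is replaced by `hyphen`.
    """
    pieces = word.split(SOFT_HYPHEN)
    whole = "".join(pieces)
    if len(whole) <= available or len(pieces) == 1:
        return [whole]  # no break needed, or no break opportunity at all

    # Pick the last break opportunity whose head plus the hyphen still fits.
    for cut in range(len(pieces) - 1, 0, -1):
        head = "".join(pieces[:cut])
        if len(head) + len(hyphen) <= available:
            tail = SOFT_HYPHEN.join(pieces[cut:])
            return [head + hyphen] + render_word(tail, available, hyphen)
    return [whole]  # nothing fits; overflow rather than break elsewhere


if __name__ == "__main__":
    word = "hy\u00adphen\u00adation"
    print(render_word(word, 20))             # fits on one line, no hyphen shown
    print(render_word(word, 8, hyphen="-"))  # broken at a soft hyphen
```

The point of the sketch is that the glyph shown at the break comes from the `hyphen` argument (U+2010 or U+002D), never from whatever the font maps at U+00AD.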

JK
Received on Saturday, 9 July 2011 20:00:47 UTC
