
Re: [css3-text] Comments on hyphenation

From: Jonathan Kew <jonathan@jfkew.plus.com>
Date: Mon, 16 May 2011 17:02:53 +0100
Cc: robert@ocallahan.org, Mathias Nater <mathiasnater@gmail.com>, www-style@w3.org
Message-Id: <2590D2B2-EBB0-49A3-9C42-8B0C013B3583@jfkew.plus.com>
To: Brad Kemper <brad.kemper@gmail.com>

On 16 May 2011, at 16:28, Brad Kemper wrote:

> 
> On May 16, 2011, at 5:56 AM, Robert O'Callahan wrote:
> 
>> I'm not too excited about implementing hyphenate-resource. It seems unlikely to me that a significant number of Web developers will bother developing and deploying their own hyphenation dictionaries.
> 
> I think it would be very useful for authors to be able to include their own supplemental hyphenation dictionaries, containing words they use in their industry, brand names, product names, jargon, etc. that wouldn't be in any built-in resource or algorithm. Is that what we are talking about here?

I'd consider this an application for a hyphenation-exceptions list, which would be a different kind of resource from the main hyphenation patterns used by TeX- or libhyphen-style algorithms.

I don't think it's practical to combine author-provided TeX-style hyphenation patterns "on the fly" with the browser's built-in patterns, nor is it realistic to expect authors to know how to specify such patterns and understand their interaction with the standard ones. An "exceptions list" that overrides the built-in hyphenation for specific words only is simple, understandable, and adequate for this need.
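
To make the "exceptions list" idea concrete, here is a minimal sketch in Python. It is only an illustration: the entry format (words written with "-" at the allowed break points), the names parse_exceptions and break_points, and the builtin callback standing in for the browser's pattern-based hyphenator are all assumptions, not anything defined by css3-text or implemented by a browser.

```python
from typing import Callable, Dict, List


def parse_exceptions(entries: List[str]) -> Dict[str, List[int]]:
    """Turn entries such as "Web-Kit" into {"webkit": [3]}.

    Each value lists the character offsets at which a break is allowed.
    The "-"-delimited entry format is an assumption made for this sketch.
    """
    table: Dict[str, List[int]] = {}
    for entry in entries:
        parts = entry.split("-")
        offsets: List[int] = []
        pos = 0
        for part in parts[:-1]:
            pos += len(part)
            offsets.append(pos)
        table["".join(parts).lower()] = offsets
    return table


def break_points(
    word: str,
    exceptions: Dict[str, List[int]],
    builtin: Callable[[str], List[int]],
) -> List[int]:
    """Use the author's exception if the word is listed, else the built-in rules."""
    key = word.lower()
    if key in exceptions:
        return exceptions[key]
    return builtin(word)


# Hypothetical author-supplied list and a stand-in for the built-in hyphenator.
author_list = parse_exceptions(["hy-phen-ate", "Web-Kit"])
no_breaks = lambda word: []
print(break_points("WebKit", author_list, no_breaks))  # [3]
```

The lookup is a whole-word override and never touches the pattern data, which is why it avoids the interaction problems described above.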

(Note that in many cases, brand names, product names, jargon, etc., will be hyphenated satisfactorily by the standard resources, as these are *not* simple word lists but rather collections of "patterns" derived from the syllable structures and spelling rules of the language. So as long as names, jargon, etc., conform to the typical word patterns of the language - which they usually do, otherwise people find them to be unpronounceable! - the hyphenation rules usually work quite well even on newly coined words.)
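
As a rough illustration of how such patterns differ from a word list, here is a small Python sketch of Liang-style pattern matching. The nine patterns are a tiny illustrative set of the kind used as a worked example for the word "hyphenation"; a real language uses thousands. The function names, and the defaults of 2 and 3 for the minimum fragment lengths, are assumptions for this sketch rather than a description of any browser's code.

```python
from typing import Dict, List, Tuple


def parse_pattern(pattern: str) -> Tuple[str, List[int]]:
    """Split "hy3ph" into ("hyph", [0, 0, 3, 0, 0]).

    values[i] is the number written before letter i; the last entry is
    the number written after the final letter.
    """
    letters = ""
    values = [0]
    for ch in pattern:
        if ch.isdigit():
            values[-1] = int(ch)
        else:
            letters += ch
            values.append(0)
    return letters, values


# Illustrative patterns only, not a complete pattern file.
PATTERNS: Dict[str, List[int]] = dict(
    parse_pattern(p)
    for p in ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]
)


def hyphenate(word: str, patterns: Dict[str, List[int]],
              left_min: int = 2, right_min: int = 3) -> List[str]:
    """Split a word at positions where the highest matching value is odd."""
    padded = "." + word.lower() + "."  # "." marks the word boundaries
    scores = [0] * (len(padded) + 1)
    # Try every substring; keep the maximum value seen at each position.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            values = patterns.get(padded[start:end])
            if values:
                for offset, value in enumerate(values):
                    scores[start + offset] = max(scores[start + offset], value)
    pieces = []
    last = 0
    # scores[k + 1] is the value between word[k - 1] and word[k].
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation", PATTERNS)))  # hy-phen-ation
```

Because the patterns match letter sequences rather than whole words, any word containing the same sequences picks up the same break opportunities, whether or not anyone has seen it before.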

For more background, see Appendix H of The TeXbook[1], and Frank Liang's thesis[2] on the topic.

JK

[1] Donald E. Knuth. _The TeXbook_ (Reading, Massachusetts: Addison-Wesley, 1984), x+483pp. ISBN 0-201-13448-9. Also published as _Computers & Typesetting, Volume A: The TeXbook_ (Reading, Massachusetts: Addison-Wesley, 1984). ISBN 0-201-13447-0.
[2] http://tug.org/docs/liang/liang-thesis.pdf
Received on Monday, 16 May 2011 16:03:29 GMT
