[whatwg] Hyphenation

From: Håkon Wium Lie <howcome@opera.com>
Date: Thu, 11 Jan 2007 14:49:17 +0100
Message-ID: <17830.16477.894411.961072@gargle.gargle.HOWL>
Also sprach Øistein E. Andersen:

 > > Prince6 (www.princexml.com) supports these properties:
 > > 
 > >   hyphenate: none | auto
 > >   hyphenate-dictionary: none | url(...)
 > >   hyphenate-before: <int>
 > >   hyphenate-after: <int>
 > >   hyphenate-lines: none | <int>
 > > From http://www.princexml.com/howcome/2006/p6/p6demo2.html:
 > > Prince can read the hyphenation format pioneered by TeX and reused by many
 > > other applications. OpenOffice hosts a number of hyphenation dictionaries that
 > > are reusable in Prince6.

 > This is, however, only one part of TeX's hyphenation system. The next level is a
 > hyphenation exception dictionary, a list of fully hyphenated words that would not
 > otherwise be hyphenated correctly. 

Prince doesn't support exception dictionaries. Is it not possible to
encode exceptions in the hyphenation dictionary?

DSSSL has a 'hyphenation-exceptions' property which takes a list of
strings. I'm unsure if it has been implemented, though.
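
To make the two options concrete, here is a minimal sketch of TeX-style
pattern hyphenation with a separate exception list consulted first. It
is a toy model, not Prince's or TeX's actual code: the function names,
the handful of patterns, and the single exception are all chosen for
illustration only.

```python
import re

def parse_pattern(pat):
    """Split a TeX-style pattern such as 'hy3ph' into letters and gap weights."""
    letters = re.sub(r"\d", "", pat)
    weights = [0] * (len(letters) + 1)
    i = 0
    for ch in pat:
        if ch.isdigit():
            weights[i] = int(ch)   # weight of the gap before letters[i]
        else:
            i += 1
    return letters, weights

def load_exceptions(words):
    """Turn fully hyphenated words ('rec-ord') into a lookup table."""
    return {w.replace("-", "").lower(): w.lower().split("-") for w in words}

def hyphenate(word, patterns, exceptions=None, before=2, after=2):
    """Return the pieces of `word`; an exception entry overrides the patterns."""
    lower = word.lower()
    if exceptions and lower in exceptions:
        return exceptions[lower]
    padded = "." + lower + "."          # '.' marks the word boundaries
    scores = [0] * (len(padded) + 1)    # scores[j]: gap before padded[j]
    for letters, weights in patterns:
        start = padded.find(letters)
        while start != -1:
            for k, w in enumerate(weights):
                scores[start + k] = max(scores[start + k], w)
            start = padded.find(letters, start + 1)
    pieces, last = [], 0
    # Keep at least `before` letters before and `after` letters after a break.
    for pos in range(before, len(word) - after + 1):
        if scores[pos + 1] % 2 == 1:    # odd weight: break allowed here
            pieces.append(word[last:pos])
            last = pos
    pieces.append(word[last:])
    return pieces

# Illustrative patterns only, not a real dictionary.
PATTERNS = [parse_pattern(p) for p in
            "hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n".split()]
EXCEPTIONS = load_exceptions(["rec-ord"])

print(hyphenate("hyphenation", PATTERNS))         # ['hy', 'phen', 'ation']
print(hyphenate("record", PATTERNS))              # ['record']
print(hyphenate("record", PATTERNS, EXCEPTIONS))  # ['rec', 'ord']

# The same exception written as a whole-word pattern, in this toy model:
print(hyphenate("record", PATTERNS + [parse_pattern(".rec1ord.")]))
```

In this sketch a pattern anchored with '.' at both ends matches only the
whole word, so it acts like an exception entry. Whether a given
dictionary or implementation handles such patterns the same way is an
assumption that would need checking.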


 > In addition to this, hyphenation can be indicated locally. This is needed in order to
 > hyphenate words like rec-ord/re-cord and is the only level that deals with
 > spelling changes.

This can be done by supplying your own dictionary through the
'hyphenate-dictionary' property.

 > There are a few additional caveats. For instance, it is not entirely obvious what
 > should be considered to be a `word' or which characters should be allowed in a
 > `word' (given that only `words' can be hyphenated using this kind of algorithm).
 > TeX uses `category codes' to define letters, and Unicode's character classes
 > give a good approximation, but they cannot be redefined to deal with specific
 > issues. In Italian, for instance, dell'opera should be hyphenated dell'o-
 > pera, but opera should not be hyphenated o-pera. (The particular example may
 > be wrong, but the principle is correct.) Unless the apostrophe is
 > considered to be a `letter' (a constituent of a `word'), correct patterns do not
 > help, as `dell'opera' will not be considered as one unit during hyphenation-point
 > look-up.
 > Another example worth mentioning is that Polish and a few other languages
 > apparently require a hyphenated word like xxx-yyy to be hyphenated xxx-
 > -yyy (with an extra hyphen carried over). A truly flexible system would allow one
 > to specify, e.g., which non-letters to treat as part of words and which to give
 > special treatment. (As we all know, TeX hyphenates xxx-yyy as xxx-
 > yyy; in addition, the hyphen prohibits xxx and yyy from being hyphenated,
 > which may or may not be suitable depending on, e.g., column width.)
 > How does Prince deal with these issues?

Prince6 doesn't try to go beyond TeX.
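
For reference, the two caveats quoted above can be sketched in a few
lines. This is only an illustration of the idea, not something Prince
or TeX does: find_words and break_at_hyphen are hypothetical helpers,
and the set of extra "letters" is a parameter the caller would have to
choose per language.

```python
import re

def find_words(text, extra_letters=""):
    """Return the character runs treated as one 'word' for hyphenation.

    Unicode letters always count; `extra_letters` adds characters such as
    the apostrophe so that dell'opera is looked up as a single unit.
    """
    letter = r"[^\W\d_]"                      # any Unicode letter
    if extra_letters:
        letter = "(?:" + letter + "|[" + re.escape(extra_letters) + "])"
    return re.findall(letter + "+", text)

def break_at_hyphen(word, repeat_hyphen=False):
    """Break an explicitly hyphenated word at its first hyphen.

    Returns (end of first line, start of next line). With
    repeat_hyphen=True the hyphen is carried over to the next line,
    as described above for Polish.
    """
    head, sep, tail = word.partition("-")
    if not sep:
        return word, ""
    return head + "-", ("-" if repeat_hyphen else "") + tail

print(find_words("dell'opera"))           # ['dell', 'opera']
print(find_words("dell'opera", "'"))      # ["dell'opera"]
print(break_at_hyphen("xxx-yyy"))         # ('xxx-', 'yyy')
print(break_at_hyphen("xxx-yyy", True))   # ('xxx-', '-yyy')
```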

              Håkon Wium Lie                          CTO °þe®ª
howcome at opera.com                  http://people.opera.com/howcome
Received on Thursday, 11 January 2007 05:49:17 UTC
