Feedback on hyphenation properties from Simon Fraser on 2010-08-04 (www-style@w3.org from August 2010)

From: Simon Fraser <smfr@me.com>
Date: Wed, 04 Aug 2010 13:49:11 -0700
To: "www-style@w3.org list" <www-style@w3.org>
Message-id: <EE8E0BC5-56D3-4748-A5ED-34F7C5926F3D@me.com>

We have reviewed the hyphenation-related properties in the GCPM module
<http://www.w3.org/TR/css3-gcpm/> and have the following feedback.

As noted in the Introduction, hyphenation should apply to media
other than "print", so the hyphenation-related properties should
move out of GCPM.

Property names
--------------

We are not keen on "hyphens" as a property name. It doesn't match
other CSS property names, which are mostly descriptive. We suggest
"hyphenation" or "hyphenate" instead. Most word processing and
desktop publishing programs refer to this behavior as "hyphenation".

One thing to bear in mind is that if we want a shorthand property
in future, we may wish to reserve "hyphenation" or "hyphenate" for the
shorthand, and use "hyphenation-mode"/"hyphenate-mode" for the longhand.

Another consideration is whether hyphenation should be controlled by
a new value for the "word-break" property.

The property names "hyphenate-before" and "hyphenate-after" don't convey
their purpose very well. The naive reader may assume that they are used
to specify characters before/after which splitting is allowed.
They really mean "keep at least N characters before/after the
hyphen", which suggests they should have "min" in their names.
Unfortunately no succinct alternatives spring to mind.

Do we really need both "hyphenate-before" and "hyphenate-after" properties,
or would a single "hyphenation-min-fragment-length" property suffice?
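
To illustrate the "keep at least N characters" reading, here is a minimal
Python sketch. The function name, its parameters and the candidate break
offsets are all hypothetical; nothing here comes from the GCPM draft or from
any implementation.

```python
def allowed_breaks(word, candidates, min_before=2, min_after=2):
    """Keep only break offsets that leave enough characters on each side.

    `candidates` are offsets into `word` where a hyphenation dictionary
    would permit a break (assumed input, not real dictionary data).
    `min_before` and `min_after` play the roles of "hyphenate-before"
    and "hyphenate-after" as we understand them.
    """
    return [
        i for i in candidates
        if i >= min_before and len(word) - i >= min_after
    ]


# "hyphenation" with assumed break opportunities hy-phen-a-tion.
print(allowed_breaks("hyphenation", [2, 6, 7], min_before=3, min_after=3))
```

A single minimum fragment length corresponds to passing the same number for
both parameters.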

"hyphenate-lines" also doesn't convey its purpose very well. It's about
the maximum number of consecutive hyphenated lines. It's also odd to
have a "no-limit" value, rather than choosing a property name which
makes sense with a value of "none".
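
As we read it, the limit could be checked along the following lines. This is
only a sketch: the function name is made up, and using None to stand for
"no-limit" is our assumption.

```python
def exceeds_limit(line_ends_hyphenated, max_consecutive=None):
    """Return True if a run of hyphenated lines is longer than the limit.

    `line_ends_hyphenated` holds one boolean per line, True when the
    line ends in a hyphenation break. `max_consecutive=None` models
    the draft's "no-limit" value (an assumption of this sketch).
    """
    if max_consecutive is None:
        return False
    run = 0
    for hyphenated in line_ends_hyphenated:
        # Extend the current run, or reset it on a non-hyphenated line.
        run = run + 1 if hyphenated else 0
        if run > max_consecutive:
            return True
    return False
```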

Finally, "hyphenate-character" is odd in that the value is a string,
not just a single character.

Hyphenation resources
---------------------

We think the "hyphenate-resource" property is problematic for two reasons.

First, the dictionary format is unspecified and there is no "type" parameter
for the resource, so there's no information the UA can use to determine
the format. This is especially problematic if the UA relies on some
underlying infrastructure for word breaking, and needs to pass the resource
down to this infrastructure.

Secondly, simply supplying a list of resources to be checked in order
is problematic, because it may result in inappropriate hyphenation.
If no hyphenation opportunities are found for a given word in a given
language by consulting the first resource, then the algorithm suggests
checking the second resource, which may return a hyphenation opportunity.
However, it may do so for the wrong language.
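
A small Python sketch of the fallback problem follows. The lookup function
and the two toy dictionaries are invented for illustration; they are not real
hyphenation data and not the algorithm as specified in the draft.

```python
def find_breaks(word, resources):
    """Consult each resource in order and return the first hit.

    `resources` is a list of (language, dictionary) pairs, where each
    dictionary maps a word to its break offsets. Note that the word's
    own language is never taken into account.
    """
    for language, dictionary in resources:
        breaks = dictionary.get(word)
        if breaks:
            return language, breaks
    return None, []


# Toy data (assumed): the English resource lacks "minute",
# the German one contains it with German break positions.
english = {"hyphen": [2]}
german = {"minute": [2, 4]}

# An English word ends up hyphenated using the German resource.
print(find_breaks("minute", [("en", english), ("de", german)]))
```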

Finally, we think that language-sensitive hyphenation is hard to do
because most web content does not have the appropriate "lang" attributes.
We'd like to suggest a property that permits language-sensitive hyphenation,
namely "hyphenation-locale" (or "hyphenate-locale"), that an author can use
to inform the UA about what locale should be used for hyphenation:

hyphenation-locale: auto | string
where the string is a locale identifier.

If not auto, the value would override the language derived from any present
"lang" attributes.
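
The intended resolution could be sketched in Python as follows. The function
and argument names are hypothetical, and what a UA does when neither source
yields a language is deliberately left open.

```python
def effective_locale(hyphenation_locale, lang_attribute):
    """Pick the locale a UA would use for hyphenation.

    `hyphenation_locale` is the value of the proposed property: either
    "auto" or a locale identifier string. `lang_attribute` is the
    language derived from "lang" attributes, or None if there is none.
    """
    if hyphenation_locale != "auto":
        # An explicit value overrides any "lang" attribute.
        return hyphenation_locale
    return lang_attribute


print(effective_locale("auto", "en"))
print(effective_locale("de-CH", "en"))
```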

Simon
Received on Wednesday, 4 August 2010 20:49:45 UTC