
Re: [css-text-4] feedback on hyphenation

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Tue, 20 May 2014 04:42:33 +0000
To: Håkan Save Hansson <hakan.hansson@edison.se>
CC: fantasai <fantasai.lists@inkedblade.net>, "www-style@w3.org" <www-style@w3.org>
Message-ID: <8D9EFCB7-2E1C-4CBE-BF31-9211AEF21CE8@gluesoft.co.jp>
Changed the subject tag to css-text-4, since further controls beyond none|manual|auto are supposed to be in level 4 or higher.

Even for level 4 or higher, I don’t think your suggestion is a good way to go, as it solves only a few of the examples in Table 1 [1]. The paper says OpenOffice does this automatically, so my first suggestion is to file bugs against browsers to pursue better automatic hyphenation.

If you still have use cases that need to specify that this word is “food thief” and that word is “carpet thief”, it looks to me like a semantic issue, since you don’t want the meaning of words to change when styles change.

[1] https://www.tug.org/TUGboat/tb27-1/tb86nemeth.pdf

/koji

On May 12, 2014, at 16:43, Håkan Save Hansson <hakan.hansson@edison.se> wrote:

> Hi fantasai,
> 
> Regarding your answer to my second suggestion (if you are referring to James Clark's first answer):
> 
> The problem is that the hyphenation system by itself can't decide how to change the spelling without some kind of "dictionary" functionality. It can't know if I meant "mat-tjuv" ("food thief" in Swedish) or "matt-tjuv" ("carpet thief") when I wrote "mat&shy;tjuv". So there has to be a way to tell the hyphenation system that.
> 
> See also this message: http://lists.w3.org/Archives/Public/www-style/2014Feb/0786.html
> 
> -- Håkan
> 
> 
> 
> -----Original message-----
> From: fantasai [mailto:fantasai.lists@inkedblade.net]
> Sent: 9 May 2014 02:55
> To: Håkan Save Hansson; www-style@w3.org
> Subject: Re: [css-text] feedback on hyphenation
> 
> On 02/12/2014 06:17 AM, Håkan Save Hansson wrote:
>> Hi,
>> 
>> Please consider additional sub-properties to "hyphens:auto" for
>> more control over the hyphenation. As it is (tested in FF), the
>> auto-hyphenation is too "strong", meaning that any word that can
>> be hyphenated will be, regardless of its suitability and the
>> wishes of the page editor.
>> 
>> Examples:
>> 
>> hyphens-[Don't hyphenate words shorter than]: 10
>> 
>> hyphens-[Minimum count of characters before hyphen]: 3
>> 
>> hyphens-[Minimum count of characters after hyphen]: 3
>> 
>> hyphens-[Minimum count of characters after hyphen on the last paragraph line]: 6
>> 
>> ..and a bit more advanced..
>> 
>> hyphens-[Maximum count of subsequent lines with hyphens]: 2
>> 
>> ..and so on (these were just off the top of my head).
> 
> These will be addressed in Level 4, as mentioned.
> So deferred for now, but definitely on the list
> of things we will work on once Level 3 is stabilized!
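
The limits proposed in the first suggestion can be read as filters applied to the break points an automatic hyphenator offers. The following is only a minimal sketch of that reading, in Python; the function names, parameter names, and defaults are hypothetical and simply mirror the numbers in the examples above. It is not a description of any browser's behavior or of any CSS property.

```python
def allowed_breaks(word, candidates, min_word=10, min_before=3, min_after=3):
    """Filter candidate hyphenation points by the proposed limits.

    `candidates` holds offsets i where a break would split the word into
    word[:i] and word[i:]. All names and defaults are hypothetical.
    """
    # "Don't hyphenate words shorter than": skip short words entirely.
    if len(word) < min_word:
        return []
    # Keep a break only if enough characters remain on both sides of it.
    return [i for i in candidates
            if i >= min_before and len(word) - i >= min_after]


def may_hyphenate_next_line(hyphenated_flags, max_consecutive=2):
    """Return True if another hyphenated line is still permitted.

    `hyphenated_flags` has one boolean per line already laid out,
    True when that line ended with a hyphen.
    """
    run = 0
    for flag in reversed(hyphenated_flags):
        if not flag:
            break
        run += 1
    return run < max_consecutive
```

With these assumed defaults, a break two characters into "hyphenation" is rejected because too few characters precede the hyphen, and a third hyphenated line in a row is refused.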
> 
>> Second suggestion:
>> 
>> There are some language-specific special cases with hyphenation. In Swedish, for instance, if you write the two words "matta"
>> (carpet) and "tjuv" (thief) as one, you write it as "mattjuv", with two t letters. This should hyphenate into "matt-tjuv", with
>> three t letters. This is not a hyphenation rule, but rather a spelling rule: when you write two words as one, there may never
>> be more than two of the same letter where the two words join.
>> 
>> If you want to use a manual soft hyphen (&shy;) for such a word, you're in trouble. My suggestion is that when you write
>> "matt&shy;tjuv" in text and it is displayed without hyphenation, it respects this rule and suppresses one of the three
>> letters t. I don't know if such a rule could have a negative effect on texts in other languages, but I don't know of any
>> language allowing three consecutive identical letters. If needed, maybe it could be controlled in CSS, something like this:
>> 
>> hyphens-[Maximum count of same letter before and after hyphen]: 2
>> 
>> One could argue that this is not a CSS thing, but rather up to the browser to handle (maybe with different behavior for
>> different languages). Well, in my opinion it is certainly related to the rules for how the soft hyphen (&shy;) should be
>> handled, and without a standard no browser vendor would probably even think of implementing this behavior. Furthermore, it
>> must be handled, and handled in the same way, by all (modern) browsers before one can start using it on webpages. You don't
>> want any "matttjuv" with three letters t out there (can't he even spell?). One should be able to check the support at, for
>> instance, caniuse.com. This speaks in favor of a CSS property like the one above.
> 
> I think James Clark's answer is the right one here:
>   http://lists.w3.org/Archives/Public/www-style/2014Feb/0739.html
> 
> This would make an excellent test case.
> 
> Let me know if that addresses your concern~
> 
> ~fantasai
> 
> 
> 
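The soft-hyphen behavior suggested in the quoted message can be sketched as follows. This is only an illustration in Python under stated assumptions: the function names and the `max_same` parameter are hypothetical (the parameter mirrors the proposed "maximum count of same letter" value of 2), the comparison is case-sensitive, and only one soft hyphen per word is handled. It does not describe what any browser or specification does.

```python
SHY = "\u00ad"  # U+00AD SOFT HYPHEN, written &shy; in HTML


def render_unbroken(text, max_same=2):
    """Join the parts around a soft hyphen when no line break occurs there.

    If the same letter runs across the junction more than `max_same`
    times, the run is shortened to `max_same` (hypothetical rule).
    """
    before, sep, after = text.partition(SHY)
    if not sep or not before or not after:
        return before + after
    letter = after[0]
    run_before = len(before) - len(before.rstrip(letter))
    run_after = len(after) - len(after.lstrip(letter))
    # Only a run that actually spans the junction is shortened.
    if run_before == 0 or run_before + run_after <= max_same:
        return before + after
    return before.rstrip(letter) + letter * max_same + after.lstrip(letter)


def render_broken(text):
    """Split at the soft hyphen, keeping every letter as written."""
    before, _, after = text.partition(SHY)
    return before + "-", after
```

Under these assumptions, "matt&shy;tjuv" displays as "mattjuv" when unbroken and as "matt-" / "tjuv" when broken, while "mat&shy;tjuv" also displays as "mattjuv" when unbroken but breaks as "mat-" / "tjuv". The unbroken forms are identical, so the author's placement of the soft hyphen is the only thing that distinguishes the two words, which is the ambiguity raised earlier in the thread.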
Received on Tuesday, 20 May 2014 04:43:14 UTC
