
Re: Ambiguous hyphenation cases with

From: Kess Vargavind <vargavind@gmail.com>
Date: Tue, 22 Jul 2014 21:17:00 +0000
Message-ID: <CAO2mWL6M43CaFCwFF1Lo39OM5DtGe5Kzq9kv-dkksZZz0WByRw@mail.gmail.com> (sfid-20140722_211705_465493_EC47F696)
To: fantasai <fantasai.lists@inkedblade.net>
Cc: Håkan Save Hansson <hakan.hansson@edison.se>, "www-style@w3.org" <www-style@w3.org>, Unicode <unicode@unicode.org>
There actually is one simple solution that I sometimes use: do not contract
three consecutive identical consonants at all! That is, do as Icelandic does
and write food thief as <mattjuv> and carpet thief as <matttjuv>. Then
there is no trouble hyphenating.

Yes, this goes against current spelling rules in Swedish, but it works. And
until there is better hyphenation support for corner cases like this
(either at character level or higher) that is how I have ‘solved’ it when
unable to do manual tweaking.

Would it be logical to add a character similar to U+00AD SOFT HYPHEN (shy)
that says “you can break me here, but unless you do, please skip the
previous character (however that would be defined in a case like this)”?
Such that <matt[SHY-LIKE-CHAR]tjuv> is either rendered <mattjuv> or broken
up as <matt-tjuv>.
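
To make the intended behaviour concrete, here is a rough sketch in Python.
Everything in it is an assumption for illustration only: no such character
exists in Unicode, so a Private Use Area code point (U+E000) stands in for
it, the names CONTRACTING_SHY and render are made up, and the sketch only
handles one marker per word.

```python
# Hypothetical stand-in for the proposed character. U+E000 is a Private Use
# Area code point chosen only for this sketch; it has no such meaning in Unicode.
CONTRACTING_SHY = "\ue000"


def render(text: str, break_here: bool) -> str:
    """Render a word containing at most one hypothetical marker.

    break_here=True:  the line breaks at the marker, so every letter is kept
                      and a hyphen plus a newline is emitted.
    break_here=False: no break happens, so the marker and the character
                      immediately before it are dropped (the contraction).
    """
    before, marker, after = text.partition(CONTRACTING_SHY)
    if not marker:
        # No marker present: nothing to decide, return the text unchanged.
        return text
    if break_here:
        return before + "-\n" + after
    # Skip the character preceding the marker, as the proposal describes.
    return before[:-1] + after


# "carpet thief", stored with all three t's plus the marker
word = "matt" + CONTRACTING_SHY + "tjuv"

print(render(word, break_here=False))  # contracted spelling on one line
print(render(word, break_here=True))   # "matt-" then "tjuv" on the next line
```

Food thief would simply be stored as <mat[SHY]tjuv> with the ordinary soft
hyphen, so the two words stay distinguishable in the source text even though
both are spelled <mattjuv> when unbroken.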


2014-07-22 16:03 GMT+02:00 fantasai <fantasai.lists@inkedblade.net>:

> On 05/12/2014 12:43 AM, Håkan Save Hansson wrote:
>> Hi fantasai,
>> Regarding your answer to my second suggestion (if you are referring
>> to James Clark's first answer):
>> The problem is that the hyphenation system in itself can't decide how
>> to change the spelling, without any "dictionary" functionality. It
>> can't know if I meant "mat-tjuv" ("food thief" in Swedish) or "matt-tjuv"
>> ("carpet thief") when I wrote "mat&shy;tjuv". So there has to be a way
>> to tell the hyphenation system that.
> Hm. I don't think I have a solution for that problem. :/ Currently you'd
> just have to not hyphenate that word.
> CCing Unicode, in case anyone there has a solution
> Up-reference: http://lists.w3.org/Archives/Public/www-style/2014Feb/0739.html
> ~fantasai
Received on Wednesday, 23 July 2014 14:23:52 UTC
