Re: [css4-text] A non-inherited property to control behavior of whitespace-only child boxes

From: Kang-Hao (Kenny) Lu <kennyluck@csail.mit.edu>
Date: Fri, 30 Mar 2012 04:08:22 +0800
Message-ID: <4F74C136.9020505@csail.mit.edu>
To: Simon Sapin <simon.sapin@kozea.fr>
CC: WWW Style <www-style@w3.org>
(12/03/29 20:53), Simon Sapin wrote:
> On 29/03/2012 13:27, Kang-Hao (Kenny) Lu wrote:
>> 'text-space-collapse: discard;' as it is
>> currently defined in CSS4 Text[2] doesn't work because it turns the
>> whitespace into a zero width non-joiner (U+200C) (why, by the way?)
> I think that U+200C here represents a line break opportunity.

I don't think so. According to UAX#14, U+200C is ignored for the purpose
of line breaking. If we want a character that represents a line break
opportunity, we probably want U+200B ZERO WIDTH SPACE, or perhaps we can
just say the spaces are dropped but the line break opportunities remain?
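
To make the difference concrete, here is a toy sketch in Python. The
two line break classes are copied by hand from the UAX#14 data (U+200B
is ZW, U+200C is CM), and the names LINE_BREAK_CLASS and
break_opportunities are made up for this illustration. It assumes every
other character is an ordinary letter, so it is nowhere near a full
implementation of the algorithm.

```python
# Hand-copied subset of UAX #14 line break classes (an assumption of this
# sketch; a real implementation would load the full Unicode data file).
LINE_BREAK_CLASS = {
    "\u200b": "ZW",  # ZERO WIDTH SPACE: allows a break after it
    "\u200c": "CM",  # ZERO WIDTH NON-JOINER: attaches to the preceding character
}


def break_opportunities(text):
    """Return the indices i where a break is allowed before text[i].

    Toy model: all characters not listed above are treated as ordinary
    letters, so the only opportunities are those directly after a ZW.
    """
    return [
        i
        for i in range(1, len(text))
        if LINE_BREAK_CLASS.get(text[i - 1]) == "ZW"
    ]
```

Under these assumptions, "foo" + U+200B + "bar" has one opportunity
(before the "b"), while "foo" + U+200C + "bar" has none.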

There are two use cases for 'text-space-collapse: discard;' as far as I
can imagine:

1. In a paragraph of Chinese text, 'text-space-collapse: discard;' can
be used to drop extra spaces introduced by text editors that assume you
can freely insert spaces at the end of a line in HTML.

2. I often write pages that mix English and Chinese. As no browser has
implemented 'text-space' at the moment, I have to manually insert lots
of non-semantic whitespace between ideographs and non-ideographic
letters to make the text more readable. Once browsers implement
'text-space: ideograph-alpha;', I should be able to combine
'text-space-collapse: discard' and some values of 'text-space' so that
the spaces are dropped and then regenerated according to some CJK layout
rules that I am not familiar with.

(It's not clear to me if this is really a nice use case because if the
English text contains whitespace, this would break. Fortunately that
doesn't happen much in pages I write.)

For 1. inserting U+200C would suppress Chinese ligatures that could
happen if there were no mistakenly inserted space, but Chinese ligatures
are too rare to matter. For 2. inserting U+200C doesn't matter because
there are probably no ligatures that are formed by putting ideographs
and non-ideographs together.

What are the other use cases? And how does U+200C matter with those?

Received on Thursday, 29 March 2012 20:08:51 UTC