
Re: [css3-text] scoping line break controls, characters that disappear at the end of lines

From: Ambrose LI <ambrose.li@gmail.com>
Date: Sun, 1 Apr 2012 11:29:51 -0400
Message-ID: <CADJvFOXtjk6q3MHGTQMY7J9NF1vCAUr=r2+fq3=SwNXTVfaUhA@mail.gmail.com>
To: Koji Ishii <kojiishi@gluesoft.co.jp>
Cc: "www-style@w3.org" <www-style@w3.org>, WWW International <www-international@w3.org>
For Chinese, I think it might be useful to think of this as a “why”
question.

In Chinese, the ideographic space can be used for honorific purposes. This
is a bit old-fashioned, but it is still in use in certain locales in
certain contexts such as formal letters. So whether ideographic spaces
should be kept is sometimes (but not always) a semantic decision.

InDesign’s behaviour probably stemmed from having considered the Chinese
usage. (Or at least I hoped so.)


2012/4/1 Koji Ishii <kojiishi@gluesoft.co.jp>

> Apologies for not including the Opera result, Mike Taylor kindly sent me
> one[3].
>
> Opera is #2 too, so that's another good reason to prefer #2.
>
> [3] http://lists.w3.org/Archives/Public/www-archive/2012Mar/0059.html
>
> -----Original Message-----
> From: Koji Ishii [mailto:kojiishi@gluesoft.co.jp]
> Sent: Sunday, April 01, 2012 5:10 AM
> To: www-style@w3.org; 'WWW International'
> Subject: RE: [css3-text] scoping line break controls, characters that
> disappear at the end of lines
>
> I asked this question for ideographic spaces at public-html-ig-jp@w3.org
> in January, without reaching a good conclusion at that point. I then had
> some discussion with fantasai, investigated a little more, and came to a
> different conclusion than before.
>
> In short, I support the current spec--keep around all those fixed-width
> spaces.
>
> Long version: fantasai helped me to make the question simpler:
> A. If it occurs at the beginning of a line, does it take up space?
> B. If it occurs at the end of a line, does it take up space?
> C. If there is more than one together, are they kept together, or can we
> break between them?
>
> By eliminating logically incorrect combinations and incorporating opinions
> from Japan, we have 3 options:
> 1. YES on the beginning, NO on the end, and keep consecutive spaces
> together.
> 2. YES on the beginning, YES on the end of line, and allow break between
> them.
> 3. Variation of 1; allow only one ideographic space at the end, and ignore
> the rest.
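
One way to read the three options is as rules for question B alone (does an
ideographic space at the end of a line take up space?). The Python sketch
below assumes exactly that simplification. The function name and the integer
option argument are made up for illustration and do not come from any spec
or implementation, and the sketch does not model where breaks are allowed
(question C).

```python
IDEOGRAPHIC_SPACE = "\u3000"


def trailing_spaces_kept(line: str, option: int) -> str:
    """Return `line` with its trailing ideographic spaces handled per option.

    Hypothetical helper: only models whether spaces at the end of a line
    take up space, not where line breaks may occur.
    """
    body = line.rstrip(IDEOGRAPHIC_SPACE)
    trailing = len(line) - len(body)
    if option == 1:
        kept = 0                 # spaces at the end take up no space
    elif option == 2:
        kept = trailing          # every space at the end is kept
    elif option == 3:
        kept = min(trailing, 1)  # at most one space survives at the end
    else:
        raise ValueError("option must be 1, 2 or 3")
    return body + IDEOGRAPHIC_SPACE * kept
```

Under option 3, a line ending in "!" followed by two ideographic spaces
keeps exactly one of them, which is the exclamation/question mark case that
option 1 cannot express.
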
>
> MS Word follows #1. Most traditional Japanese word processors in the
> 1980s and 1990s followed #2. #3 is from JLTF, where he likes Word's
> behavior except that an ideographic space after an exclamation or question
> mark should be honored.
>
> I quickly looked at current behaviors[1]:
> MS Word: #1
> Adobe InDesign: #2
> IE9: #1
> FF11: Neither. Breaks look like IE, but the last two are different.
> Justification behavior is also different.
> Chrome18/Safari5: #2
>
> MS Word took #1 because in the 1990s, many Japanese authors mixed
> ideographic spaces and ASCII spaces without realizing it. Often they did
> so intentionally, assuming two ASCII spaces are equivalent to one
> ideographic space, because that was the case in most traditional CUI-based
> software. To handle two ASCII spaces and one ideographic space in the same
> way, and also to support Latin typography, #1 was the best choice.
>
> Today, in the HTML world, I don't think Japanese authors have such
> requirements, so there's no big motivation to adopt #1 for CSS.
>
> The point JLTF made--an ideographic space after exclamation/question
> marks--makes sense, but it is too much of a special case once #1 is chosen,
> so Word gave up implementing it. But it comes for free if we go with #2.
>
> Given this, given that InDesign takes option 2, and given that all browsers
> behave differently today, I think option 2 makes the most sense.
>
> Note that this is filed as CSS-ISSUE-220[2].
>
> [1] http://lists.w3.org/Archives/Public/www-archive/2012Mar/0058.html
> [2] http://www.w3.org/Style/CSS/Tracker/issues/220
>
> Regards,
> Koji
>
> -----Original Message-----
> From: fantasai [mailto:fantasai.lists@inkedblade.net]
> Sent: Tuesday, January 10, 2012 10:37 AM
> To: www-style@w3.org; 'WWW International'
> Subject: [css3-text] scoping line break controls, characters that
> disappear at the end of lines
>
> In 2008 roc outlined some principles for how line breaking controls (i.e.
> 'white-space', at the time) are scoped to line-breaking opportunities:
>
> In <http://lists.w3.org/Archives/Public/www-style/2008Dec/0043.html>
> Robert O'Callahan wrote:
> >
> > 1) Break opportunities induced by white space are entirely governed by
> >    the value of the 'white-space' property on the enclosing element. So,
> >    spaces that are white-space:nowrap never create break opportunities.
> > 2) When a break opportunity exists between two non-white-space
> >    characters, e.g. between two Kanji characters, we consult the value of
> >    'white-space' for the nearest common ancestor element of the two
> >    characters to decide if the break is allowed.
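
Principle 2 above can be sketched roughly as follows in Python. This is an
illustration under stated assumptions, not how any browser implements it:
the `Element` class and both function names are hypothetical, each element
is assumed to carry its computed 'white-space' value directly, and only
'nowrap' and 'pre' are treated as values that forbid wrapping.

```python
from dataclasses import dataclass
from typing import Optional

NON_WRAPPING = {"nowrap", "pre"}  # assumption: the only values modelled


@dataclass(eq=False)
class Element:
    name: str
    white_space: str = "normal"          # computed value, set by hand here
    parent: Optional["Element"] = None


def ancestors(el):
    """Return el and all of its ancestors, innermost first."""
    chain = []
    while el is not None:
        chain.append(el)
        el = el.parent
    return chain


def nearest_common_ancestor(a, b):
    seen = {id(x) for x in ancestors(a)}
    for el in ancestors(b):
        if id(el) in seen:
            return el
    raise ValueError("elements are not in the same tree")


def break_allowed_between(a, b):
    """a and b are the elements directly enclosing two adjacent characters."""
    return nearest_common_ancestor(a, b).white_space not in NON_WRAPPING
```

For two Kanji that both sit inside a nowrap span, the common ancestor is the
span and the break is suppressed. If the second Kanji sits directly in the
enclosing paragraph, the common ancestor is the paragraph, so the break is
governed by the paragraph's value instead.
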
>
> I'm trying to encode this into the spec. My question is, are spaces
> (U+0020) the only characters that fall into category #1? What about the
> other characters in General Category Zs?
>   http://www.fileformat.info/info/unicode/category/Zs/list.htm
>
> In particular, U+1680 is, like U+0020, expected to disappear at the end of
> a line.
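
For reference, the characters in General Category Zs can be listed with
Python's standard `unicodedata` module, as in the short sketch below. The
helper name is made up for illustration, and the exact list depends on the
Unicode version bundled with the Python build, so no fixed count is assumed.

```python
import sys
import unicodedata


def space_separators():
    """Return every code point whose General Category is Zs."""
    return [cp for cp in range(sys.maxunicode + 1)
            if unicodedata.category(chr(cp)) == "Zs"]


if __name__ == "__main__":
    for cp in space_separators():
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

The result includes U+0020 SPACE, U+1680 OGHAM SPACE MARK and U+3000
IDEOGRAPHIC SPACE, along with the fixed-width spaces in the U+2000 block.
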
>
> Which brings up another issue: which characters should disappear at the
> end of a line? Right now we keep around all those fixed-width spaces.
>
> ~fantasai
>
>
>
>


-- 
cheers,
-ambrose <http://gniw.ca>
Received on Sunday, 1 April 2012 15:30:20 GMT