Re: [css3-text] scoping line break controls, characters that disappear at the end of lines

Hello Ambrose,

On 2012/04/02 0:29, Ambrose LI wrote:
> For Chinese, I think it might be useful to think of this as a “why”
> question.
>
> In Chinese, the ideographic space can be used for honorific purposes. This
> is a bit old fashioned, but this is still in use in certain locales in
> certain contexts such as formal letters. So whether ideographic spaces
> should be kept is sometimes (but not always) a semantic decision.

Very interesting. Can you tell us where these spaces are used? For
example, around the name of a person being 'honored'? Or throughout the
text?

Regards,    Martin.

> InDesign’s behaviour probably stemmed from having considered the Chinese
> usage. (Or at least I hoped so.)
>
>
> 2012/4/1 Koji Ishii <kojiishi@gluesoft.co.jp>
>
>> Apologies for not including the Opera result, Mike Taylor kindly sent me
>> one[3].
>>
>> Opera is #2 too, so that's another point in favor of #2.
>>
>> [3] http://lists.w3.org/Archives/Public/www-archive/2012Mar/0059.html
>>
>> -----Original Message-----
>> From: Koji Ishii [mailto:kojiishi@gluesoft.co.jp]
>> Sent: Sunday, April 01, 2012 5:10 AM
>> To: www-style@w3.org; 'WWW International'
>> Subject: RE: [css3-text] scoping line break controls, characters that
>> disappear at the end of lines
>>
>> I asked this question for ideographic spaces at public-html-ig-jp@w3.org
>> in January without reaching a good conclusion at that point. I then had
>> some discussion with fantasai, investigated a little more, and came to a
>> different conclusion than before.
>>
>> In short, I support the current spec--keep around all those fixed-width
>> spaces.
>>
>> Long version: fantasai helped me to make the question simpler:
>> A. If it occurs at the beginning of a line, does it take up space?
>> B. If it occurs at the end of a line, does it take up space?
>> C. If there is more than one together, are they kept together, or can we
>> break between them?
>>
>> By eliminating logically incorrect combinations and incorporating opinions
>> from Japan, we have 3 options:
>> 1. YES on the beginning, NO on the end, and keep consecutive spaces
>> together.
>> 2. YES on the beginning, YES on the end of line, and allow break between
>> them.
>> 3. Variation of 1; allow only one ideographic space at the end, and ignore
>> the rest.
>>
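
A rough sketch of how options 1 and 2 differ, for illustration only. It
assumes every character is exactly one unit wide, that a break is possible
between any two non-space characters, and that there are no forced breaks.
The function names are hypothetical, and option 3 is left out because its
exact measurement rule is not spelled out here.

```python
IDEOGRAPHIC_SPACE = "\u3000"


def break_option_2(text, width):
    """Option 2: U+3000 is laid out like any other character.

    It takes up space at both line start and line end, and a break is
    allowed between two consecutive spaces. Returns (line, advance) pairs.
    """
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    return [(chunk, len(chunk)) for chunk in chunks]


def break_option_1(text, width):
    """Option 1: spaces at line end take no space and runs stay together.

    A run of U+3000 is never split; if the next non-space character does
    not fit, the whole run stays on the current line and is not measured.
    Returns (line, advance) pairs, where advance excludes trailing spaces.
    """
    lines = []
    line = ""
    used = 0       # measured width, excluding trailing spaces
    trailing = 0   # spaces seen since the last non-space character
    for ch in text:
        if ch == IDEOGRAPHIC_SPACE:
            line += ch
            trailing += 1
            continue
        if line and used + trailing + 1 > width:
            # Break before this character; the trailing run is dropped
            # from the measurement but kept in the line content.
            lines.append((line, used))
            line, used, trailing = "", 0, 0
        used += trailing + 1
        trailing = 0
        line += ch
    if line:
        lines.append((line, used))
    return lines
```

With the input "AB", two ideographic spaces, "CD" and a width of 3, the
first function splits the two spaces across lines, while the second keeps
both at the end of the first line without counting them.
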
>> MS Word follows #1. Most traditional Japanese word processors in the
>> 1980s/90s followed #2. #3 is from JLTF, where he likes Word's behavior
>> except that an ideographic space after an exclamation or question mark
>> should be honored.
>>
>> I quickly looked at current behaviors[1]:
>> MS Word: #1
>> Adobe InDesign: #2
>> IE9: #1
>> FF11: Neither. Breaks look like IE, but the last two are different.
>> Justification behavior is also different.
>> Chrome18/Safari5: #2
>>
>> MS Word took #1 because in the 1990s, many Japanese authors mixed
>> ideographic spaces and ASCII spaces without understanding the difference.
>> Oftentimes they did so intentionally, assuming two ASCII spaces are
>> equivalent to one ideographic space, because it was so in most
>> traditional CUI-based software. To handle two ASCII spaces and one
>> ideographic space in the same way, and also to support Latin typography,
>> #1 was the best choice.
>>
>> Today, in the HTML world, I don't think Japanese authors have such
>> requirements, so there's no big motivation to take #1 for CSS.
>>
>> The point JLTF made--an ideographic space after exclamation/question
>> marks--makes sense, but it's too much of a special case once you take #1,
>> so Word gave up implementing it. But it's free of cost if we go with #2.
>>
>> Given this, given InDesign taking option 2, and given all browsers
>> behaving differently today, I think option 2 makes the most sense.
>>
>> Note that this is filed as CSS-ISSUE-220[2].
>>
>> [1] http://lists.w3.org/Archives/Public/www-archive/2012Mar/0058.html
>> [2] http://www.w3.org/Style/CSS/Tracker/issues/220
>>
>> Regards,
>> Koji
>>
>> -----Original Message-----
>> From: fantasai [mailto:fantasai.lists@inkedblade.net]
>> Sent: Tuesday, January 10, 2012 10:37 AM
>> To: www-style@w3.org; 'WWW International'
>> Subject: [css3-text] scoping line break controls, characters that
>> disappear at the end of lines
>>
>> In 2008 roc outlined some principles for how line breaking controls (i.e.
>> 'white-space', at the time) are scoped to line-breaking opportunities:
>>
>> In <http://lists.w3.org/Archives/Public/www-style/2008Dec/0043.html>
>> Robert O'Callahan wrote:
>>>
>>> 1) Break opportunities induced by white space are entirely governed by
>>>     the value of the 'white-space' property on the enclosing element.
>>>     So, spaces that are white-space:nowrap never create break
>>>     opportunities.
>>> 2) When a break opportunity exists between two non-white-space
>>>     characters, e.g. between two Kanji characters, we consult the value of
>>>     'white-space' for the nearest common ancestor element of the two
>>>     characters to decide if the break is allowed.
>>
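
For illustration, rule 2 above can be sketched as a lookup of the nearest
common ancestor. This is only a sketch under simplifying assumptions: the
Element class and function names are hypothetical, each character is
represented by the element that directly contains it, and only the 'nowrap'
value is treated as forbidding the break.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element with a computed 'white-space'."""
    name: str
    white_space: str = "normal"
    parent: Optional["Element"] = None


def ancestors(el):
    """Return the element itself followed by its ancestors, innermost first."""
    chain = []
    while el is not None:
        chain.append(el)
        el = el.parent
    return chain


def nearest_common_ancestor(a, b):
    """Return the innermost element that contains both a and b, or None."""
    seen = {id(x) for x in ancestors(a)}
    for el in ancestors(b):
        if id(el) in seen:
            return el
    return None


def break_allowed_between(a, b):
    """Decide a break between two characters whose containers are a and b."""
    nca = nearest_common_ancestor(a, b)
    return nca is not None and nca.white_space != "nowrap"
```

For two Kanji inside the same nowrap span the result is False; if one of
them sits directly in a normal-wrapping parent of that span, the parent is
the nearest common ancestor and the result is True.
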
>> I'm trying to encode this into the spec. My question is, are spaces
>> (U+0020) the only characters that fall into category #1? What about the
>> other characters in General Category Zs?
>>    http://www.fileformat.info/info/unicode/category/Zs/list.htm
>>
>> In particular, U+1680 is, like U+0020, expected to disappear at the end of
>> a line.
>>
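
As a quick way to see which characters are in question, the General
Category Zs set can be listed with Python's standard unicodedata module.
This is a minimal sketch; the function name is hypothetical, and the exact
result depends on the Unicode version shipped with the Python build. It only
enumerates the category and says nothing about how each character should
behave at a line end.

```python
import sys
import unicodedata


def zs_characters():
    """Return every code point whose General Category is Zs."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Zs"
    ]


if __name__ == "__main__":
    for ch in zs_characters():
        # unicodedata.name() gets a fallback in case a name is missing.
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

The list includes U+0020 SPACE, U+1680 OGHAM SPACE MARK and U+3000
IDEOGRAPHIC SPACE, along with the fixed-width spaces in the U+2000 block.
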
>> Which brings up another issue: which characters should disappear at the
>> end of a line? Right now we keep around all those fixed-width spaces.
>>
>> ~fantasai
>>
>>
>>
>>
>
>

Received on Monday, 2 April 2012 00:47:57 UTC