
Re: [css3-text] scoping line break controls, characters that disappear at the end of lines

From: Mark Davis ☕ <mark@macchiato.com>
Date: Sun, 1 Apr 2012 19:07:05 -0700
Message-ID: <CAJ2xs_G-8X1axopnXMV8r+RWRiAMsr7iOmrNp6egef4q4-Xa5w@mail.gmail.com>
To: Ambrose LI <ambrose.li@gmail.com>
Cc: Koji Ishii <kojiishi@gluesoft.co.jp>, "www-style@w3.org" <www-style@w3.org>, WWW International <www-international@w3.org>, asmus@unicode.org
I don't understand; nowhere did I talk about deleting characters.

------------------------------
Mark <https://plus.google.com/114199149796022210033>
Il meglio è l’inimico del bene



On Sun, Apr 1, 2012 at 18:50, Ambrose LI <ambrose.li@gmail.com> wrote:

> Please see my comments about the honorific usage of the fullwidth space in
> Chinese. In short, if Chinese usage has any importance at all, then yes,
> fullwidth spaces must be kept as-is even at the end or beginning of the
> line. Or (perhaps this might be the preferable case) there should be some
> kind of provision to allow this to happen.
>
> In Chinese, deleting fullwidth spaces can, depending on the context,
> change the *semantics* of the text.
>
>
> 2012/4/1 Mark Davis ☕ <mark@macchiato.com>
>
>> I tend to disagree, if I understand correctly what you mean by "B. If it
>> occurs at the end of a line, does it take up space?"
>>
>> Consider the following case.
>>
>>    1. The text is "abcd<space1><space2><space3>efg"
>>    2. "abcd" fit within the margins, but
>>    3. "abcd<space1><space2><space3>e" does not.
>>
>> I would expect to see the following displayed:
>>
>> abcd
>> efg
>>
>> and not
>>
>> abcd
>>   efg
>>
>> That's the case:
>>
>>    1. even if some of the spaces were fixed-width (basically, no matter
>>    which of the Unicode whitespaces they were, except for the Glue
>>    characters NBSP and ZWNBSP).
>>    2. even if the margins were just barely wider than "abcd", so that
>>    the width up through just before the "e" was too wide to fit between the
>>    margins.
>>
>> I would characterize this as "not taking up space at the end of the
>> line"; that is, no matter how many spaces at the end of line, and no matter
>> which they are (excepting glue), you wouldn't expect to see them wrap to
>> the start of the next line. So if that is what Word/IE9 are doing, it looks
>> to me that they are doing the right thing.
>>
>> (That doesn't stand in the way of being able to select/copy whitespace
>> characters that are after "abcd", of course.)
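
The expected result in the example above can be reproduced with a small greedy line breaker in which trailing spaces hang past the margin instead of wrapping. This is only a sketch under simplifying assumptions: every character is one unit wide, only the characters listed in the hypothetical `BREAKABLE_SPACES` constant count as breakable spaces (glue characters such as NBSP stay inside words), and the name `wrap_hanging_spaces` is invented for illustration. The hanging spaces are returned alongside each line, not deleted.

```python
import re

# Assumption: a small, illustrative set of breakable spaces
# (SPACE, EM SPACE, IDEOGRAPHIC SPACE). NBSP is deliberately absent.
BREAKABLE_SPACES = " \u2003\u3000"


def wrap_hanging_spaces(text, max_width):
    """Greedy wrap where spaces at a line end take up no width.

    Returns a list of (visible_text, hanging_spaces) pairs.
    Assumes every character is exactly one unit wide.
    """
    tokens = re.findall(rf"[{BREAKABLE_SPACES}]+|[^{BREAKABLE_SPACES}]+", text)
    lines = []
    line = ""     # content already placed on the current line
    pending = ""  # spaces seen after that content, not yet committed
    for tok in tokens:
        if tok[0] in BREAKABLE_SPACES:
            pending += tok
        elif line and len(line) + len(pending) + len(tok) > max_width:
            # The word does not fit: the pending spaces hang at the end
            # of this line and the word starts the next one.
            lines.append((line, pending))
            line, pending = tok, ""
        else:
            line += pending + tok
            pending = ""
    lines.append((line, pending))
    return lines


if __name__ == "__main__":
    for visible, hanging in wrap_hanging_spaces("abcd   efg", 4):
        print(repr(visible), "hanging:", repr(hanging))
```

With a width of 4 this yields the lines "abcd" and "efg", and the three spaces are reported as hanging on the first line, so they stay available for selection and copying.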
>>
>>
>> On Sat, Mar 31, 2012 at 13:10, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
>>
>>> I asked this question for ideographic spaces at public-html-ig-jp@w3.org
>>> in January, without reaching a good conclusion at that point. I then had
>>> some discussion with fantasai, investigated a little more, and came to a
>>> different conclusion than before.
>>>
>>> In short, I support the current spec--keep around all those fixed-width
>>> spaces.
>>>
>>> Long version: fantasai helped me to make the question simpler:
>>> A. If it occurs at the beginning of a line, does it take up space?
>>> B. If it occurs at the end of a line, does it take up space?
>>> C. If there is more than one together, are they kept together, or can we
>>> break between them?
>>>
>>> By eliminating logically incorrect combinations and incorporating
>>> opinions from Japan, we have 3 options:
>>> 1. YES on the beginning, NO on the end, and keep consecutive spaces
>>> together.
>>> 2. YES on the beginning, YES on the end of line, and allow break between
>>> them.
>>> 3. Variation of 1; allow only one ideographic space at the end, and
>>> ignore the rest.
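
To make the difference between options 1 and 2 concrete, here is a rough sketch that wraps the same text both ways. It rests on several assumptions that are not in the discussion: every character is one unit wide, runs of non-space characters are treated as unbreakable, and the names `IDSP` and `wrap` are made up for illustration. Option 3 is left out because its exact rules are not spelled out here.

```python
import re

IDSP = "\u3000"  # IDEOGRAPHIC SPACE


def wrap(text, max_width, option):
    """Greedy wrap of `text` under option 1 or option 2.

    Option 1: a run of spaces stays together and takes no width at a line end.
    Option 2: each space takes width everywhere; breaks may fall between spaces.
    Assumes one unit of width per character.
    """
    if option == 1:
        pattern = rf"{IDSP}+|[^{IDSP}]+"   # whole runs of spaces
    elif option == 2:
        pattern = rf"{IDSP}|[^{IDSP}]+"    # one space per unit
    else:
        raise ValueError("only options 1 and 2 are sketched here")

    lines, line, hang = [], "", ""
    for unit in re.findall(pattern, text):
        if option == 1 and unit[0] == IDSP:
            hang += unit  # defer: these only count if more content follows
            continue
        if line and len(line) + len(hang) + len(unit) > max_width:
            # Under option 1 the deferred spaces hang at the line end; they
            # are omitted from the returned line but would stay in the text.
            lines.append(line)
            line, hang = unit, ""
        else:
            line, hang = line + hang + unit, ""
    lines.append(line)
    return lines


if __name__ == "__main__":
    sample = "ab" + IDSP * 3 + "cd"
    print(wrap(sample, 4, option=1))
    print(wrap(sample, 4, option=2))
```

For the sample text and a width of 4, option 1 gives "ab" and "cd", while option 2 keeps two ideographic spaces at the end of the first line and starts the second line with the third.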
>>>
>>> MS Word follows #1. Most traditional Japanese word processors of the
>>> 1980s and 90s followed #2. #3 comes from JLTF, whose preference is Word's
>>> behavior except that an ideographic space after an exclamation or
>>> question mark should be honored.
>>>
>>> I quickly looked at current behaviors[1]:
>>> MS Word: #1
>>> Adobe InDesign: #2
>>> IE9: #1
>>> FF11: Neither. Breaks look like IE, but the last two are different.
>>> Justification behavior is also different.
>>> Chrome18/Safari5: #2
>>>
>>> MS Word took #1 because in the 1990s many Japanese authors mixed
>>> ideographic spaces and ASCII spaces without realizing it. Often they did
>>> so intentionally, assuming two ASCII spaces were equivalent to one
>>> ideographic space, because that was the case in most traditional
>>> CUI-based software. To handle two ASCII spaces and one ideographic space
>>> in the same way, and also to support Latin typography, #1 was the best
>>> choice.
>>>
>>> Today, in the HTML world, I don't think Japanese authors have such
>>> requirements, so there's no big motivation to take #1 for CSS.
>>>
>>> The point JLTF made--an ideographic space after exclamation/question
>>> marks--makes sense, but it is too much of a special case once #1 is
>>> taken, so Word gave up implementing it. It comes for free if we go with #2.
>>>
>>> Given this, given that InDesign takes option 2, and given that all
>>> browsers behave differently today, I think option 2 makes the most sense.
>>>
>>> Note that this is filed as CSS-ISSUE-220[2].
>>>
>>> [1] http://lists.w3.org/Archives/Public/www-archive/2012Mar/0058.html
>>> [2] http://www.w3.org/Style/CSS/Tracker/issues/220
>>>
>>> Regards,
>>> Koji
>>>
>>> -----Original Message-----
>>> From: fantasai [mailto:fantasai.lists@inkedblade.net]
>>> Sent: Tuesday, January 10, 2012 10:37 AM
>>> To: www-style@w3.org; 'WWW International'
>>> Subject: [css3-text] scoping line break controls, characters that
>>> disappear at the end of lines
>>>
>>> In 2008 roc outlined some principles for how line breaking controls
>>> (i.e. 'white-space', at the time) are scoped to line-breaking opportunities:
>>>
>>> In <http://lists.w3.org/Archives/Public/www-style/2008Dec/0043.html>
>>> Robert O'Callahan wrote:
>>> >
>>> > 1) Break opportunities induced by white space are entirely governed by
>>> >    the value of the 'white-space' property on the enclosing element.
>>> >    So, spaces that are white-space:nowrap never create break
>>> >    opportunities.
>>> > 2) When a break opportunity exists between two non-white-space
>>> >    characters, e.g. between two Kanji characters, we consult the value
>>> >    of 'white-space' for the nearest common ancestor element of the two
>>> >    characters to decide if the break is allowed.
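
The two principles quoted above could be sketched roughly as follows. The `Element` class and the function names are hypothetical stand-ins for a real DOM and style system, and the sketch assumes that only 'nowrap' and 'pre' suppress wrapping.

```python
# Assumption: these are the 'white-space' values that forbid wrapping.
NO_WRAP_VALUES = {"nowrap", "pre"}


class Element:
    """Minimal stand-in for a styled element (hypothetical, not a real API)."""

    def __init__(self, name, white_space, parent=None):
        self.name = name
        self.white_space = white_space
        self.parent = parent


def ancestors(el):
    """Return el followed by its ancestors, innermost first."""
    chain = []
    while el is not None:
        chain.append(el)
        el = el.parent
    return chain


def nearest_common_ancestor(a, b):
    seen = {id(el) for el in ancestors(a)}
    for el in ancestors(b):
        if id(el) in seen:
            return el
    return None


def space_creates_break(enclosing):
    """Principle 1: a space is governed by its own enclosing element."""
    return enclosing.white_space not in NO_WRAP_VALUES


def break_allowed_between(a, b):
    """Principle 2: a and b enclose two adjacent non-white-space characters."""
    nca = nearest_common_ancestor(a, b)
    return nca is not None and nca.white_space not in NO_WRAP_VALUES


if __name__ == "__main__":
    p = Element("p", "normal")
    left = Element("span", "nowrap", parent=p)
    right = Element("span", "nowrap", parent=p)
    print(break_allowed_between(left, right))  # consults <p>
    print(break_allowed_between(left, left))   # consults the span itself
    print(space_creates_break(left))
```

In the usage example, a break between characters in two sibling nowrap spans is decided by their wrapping parent, while a break inside a single nowrap span is decided by that span.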
>>>
>>> I'm trying to encode this into the spec. My question is, are spaces
>>> (U+0020) the only characters that fall into category #1? What about the
>>> other characters in General Category Zs?
>>>   http://www.fileformat.info/info/unicode/category/Zs/list.htm
>>>
>>> In particular, U+1680 is, like U+0020, expected to disappear at the end
>>> of a line.
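
For reference, the characters in General Category Zs can be listed with Python's standard `unicodedata` module, as in the sketch below. The helper name `space_separators` is made up, and the result depends on the Unicode version bundled with the interpreter: U+180E, for example, was in Zs at the time of this thread but was reclassified as Cf in Unicode 6.3.

```python
import sys
import unicodedata


def space_separators():
    """Return all code points whose General Category is Zs."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Zs"
    ]


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp in space_separators():
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```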
>>>
>>> Which brings up another issue: which characters should disappear at the
>>> end of a line? Right now we keep around all those fixed-width spaces.
>>>
>>> ~fantasai
>>>
>>>
>>>
>>
>
>
> --
> cheers,
> -ambrose <http://gniw.ca>
>
Received on Monday, 2 April 2012 02:07:29 GMT
