- From: Mark Davis ☕ <mark@macchiato.com>
- Date: Sun, 1 Apr 2012 19:07:05 -0700
- To: Ambrose LI <ambrose.li@gmail.com>
- Cc: Koji Ishii <kojiishi@gluesoft.co.jp>, "www-style@w3.org" <www-style@w3.org>, WWW International <www-international@w3.org>, asmus@unicode.org
- Message-ID: <CAJ2xs_G-8X1axopnXMV8r+RWRiAMsr7iOmrNp6egef4q4-Xa5w@mail.gmail.com>
I don't understand; nowhere did I talk about deleting characters.

------------------------------
Mark <https://plus.google.com/114199149796022210033>

*— Il meglio è l’inimico del bene —*

On Sun, Apr 1, 2012 at 18:50, Ambrose LI <ambrose.li@gmail.com> wrote:

> Please see my comments about the honorific usage of the fullwidth space in
> Chinese. In short, if Chinese usage has any importance at all, then yes,
> fullwidth spaces must be kept as-is even at the end or beginning of the
> line. Or (perhaps this might be the preferable case) there should be some
> kind of provision to allow this to happen.
>
> In Chinese, deleting fullwidth spaces can, depending on the context,
> change the *semantics* of the text.
>
>
> 2012/4/1 Mark Davis ☕ <mark@macchiato.com>
>
>> I tend to disagree, if I understand correctly what you mean by "B. If it
>> occurs at the end of a line, does it take up space?"
>>
>> Consider the following case.
>>
>> 1. The text is "abcd<space1><space2><space3>efg"
>> 2. "abcd" fit within the margins, but
>> 3. "abcd<space1><space2><space3>e" does not.
>>
>> I would expect to see the following displayed:
>>
>> abcd
>> efg
>>
>> and not
>>
>> abcd
>>    efg
>>
>> That's the case:
>>
>> 1. even if some of the spaces were fixed width (basically, no matter
>> which of the Unicode whitespaces they were, except for the Glue
>> characters NBSP and ZWNBSP).
>> 2. even if the margins were just barely wider than "abcd", so that
>> the width up through just before the "e" was too wide to fit between the
>> margins.
>>
>> I would characterize this as "not taking up space at the end of the
>> line"; that is, no matter how many spaces at the end of line, and no matter
>> which they are (excepting glue), you wouldn't expect to see them wrap to
>> the start of the next line. So if that is what Word/IE9 are doing, it looks
>> to me that they are doing the right thing.
>>
>> (That doesn't stand in the way of being able to select/copy whitespace
>> characters that are after "abcd", of course.)
>>
>> ------------------------------
>> Mark <https://plus.google.com/114199149796022210033>
>>
>> *— Il meglio è l’inimico del bene —*
>>
>> On Sat, Mar 31, 2012 at 13:10, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:
>>
>>> I asked this question for ideographic spaces at public-html-ig-jp@w3.org
>>> in January without good conclusion at that point. I then had some
>>> discussion with fantasai, investigated a little more, and came to a
>>> different conclusion than before.
>>>
>>> In short, I support the current spec--keep around all those fixed-width
>>> spaces.
>>>
>>> Long version: fantasai helped me to make the question simpler:
>>> A. If it occurs at the beginning of a line, does it take up space?
>>> B. If it occurs at the end of a line, does it take up space?
>>> C. If there is more than one together, are they kept together, or can we
>>> break between them?
>>>
>>> By eliminating logically incorrect combinations and incorporating
>>> opinions from Japan, we have 3 options:
>>> 1. YES on the beginning, NO on the end, and keep consecutive spaces
>>> together.
>>> 2. YES on the beginning, YES on the end of line, and allow break between
>>> them.
>>> 3. Variation of 1; allow only one ideographic space at the end, and
>>> ignore the rest.
>>>
>>> MS Word behaves #1. Most traditional Japanese word processors in
>>> 1980/90s behaved #2. #3 is from JLTF, where he likes Word's behavior except
>>> that an ideographic space after an exclamation or question mark should be
>>> honored.
>>>
>>> I quickly looked at current behaviors[1]:
>>> MS Word: #1
>>> Adobe InDesign: #2
>>> IE9: #1
>>> FF11: Neither. Breaks look like IE, but the last two are different.
>>> Justification behavior is also different.
>>> Chrome18/Safari5: #2
>>>
>>> MS Word took #1 because in 1990s, many Japanese authors used ideographic
>>> spaces and ASCII spaces mixed without understanding so. Oftentimes they do
>>> so intentionally assuming two ASCII spaces are equivalent to one
>>> ideographic space, because it was so in most traditional CUI-based
>>> software. To handle two ASCII spaces and one ideographic space in the same
>>> way, and also to support Latin typography, #1 was the best choice.
>>>
>>> Today, in HTML world, I don't think Japanese authors have such
>>> requirements, so there's no big motivation to take the #1 for CSS.
>>>
>>> The point JLTF made--an ideographic space after exclamation/question
>>> marks--makes sense, but it's too special case once we took #1, so Word gave
>>> up implementing it. But it's free of cost if we go with #2.
>>>
>>> Given this, given InDesign taking option 2, and given all browsers
>>> behaving differently today, I think option 2 makes the most sense.
>>>
>>> Note that this is filed as CSS-ISSUE-220[2].
>>>
>>> [1] http://lists.w3.org/Archives/Public/www-archive/2012Mar/0058.html
>>> [2] http://www.w3.org/Style/CSS/Tracker/issues/220
>>>
>>> Regards,
>>> Koji
>>>
>>> -----Original Message-----
>>> From: fantasai [mailto:fantasai.lists@inkedblade.net]
>>> Sent: Tuesday, January 10, 2012 10:37 AM
>>> To: www-style@w3.org; 'WWW International'
>>> Subject: [css3-text] scoping line break controls, characters that
>>> disappear at the end of lines
>>>
>>> In 2008 roc outlined some principles for how line breaking controls
>>> (i.e. 'white-space', at the time) are scoped to line-breaking opportunities:
>>>
>>> In <http://lists.w3.org/Archives/Public/www-style/2008Dec/0043.html>
>>> Robert O'Callahan wrote:
>>> >
>>> > 1) Break opportunities induced by white space are entirely governed by the
>>> > value of the 'white-space' property on the enclosing element. So, spaces
>>> > that are white-space:nowrap never create break opportunities.
>>> > 2) When a break opportunity exists between two non-white-space
>>> > characters, e.g. between two Kanji characters, we consult the value of
>>> > 'white-space' for the nearest common ancestor element of the two characters
>>> > to decide if the break is allowed.
>>>
>>> I'm trying to encode this into the spec. My question is, are spaces
>>> (U+0020) the only characters that fall into category #1? What about the
>>> other characters in General Category Zs?
>>> http://www.fileformat.info/info/unicode/category/Zs/list.htm
>>>
>>> In particular, U+1680 is, like U+0020, expected to disappear at the end
>>> of a line.
>>>
>>> Which brings up another issue: which characters should disappear at the
>>> end of a line? Right now we keep around all those fixed-width spaces.
>>>
>>> ~fantasai
>>>
>>
>
> --
> cheers,
> -ambrose <http://gniw.ca>
>
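
Editorial illustration (not part of the original thread): the two behaviours under discussion can be sketched with a toy greedy line breaker. This is only a sketch under stated assumptions: every character is one column wide, only U+0020 is treated as a space, glue characters such as NBSP are ignored, and the function name `break_lines` and its `spaces_hang` flag are hypothetical names invented for this example. It is not the algorithm of the CSS spec or of any browser. With `spaces_hang=True` it roughly mimics option 1 (spaces at the end of a line do not take up space and are never wrapped); with `spaces_hang=False` it roughly mimics option 2 (every space takes up space and a break is allowed between spaces).

```python
import re


def break_lines(text: str, width: int, spaces_hang: bool) -> list[str]:
    """Toy greedy line breaker on a monospace grid (hypothetical helper)."""
    lines: list[str] = []
    current = ""
    # Split the text into alternating runs of spaces and non-spaces.
    for run in re.findall(r" +|[^ ]+", text):
        if run[0] != " ":
            # A word moves to a new line if it does not fit after what is
            # already on the current line.
            if current and len(current) + len(run) > width:
                lines.append(current)
                current = ""
            current += run
        elif spaces_hang:
            # Option 1 style: the whole run of spaces stays with the
            # preceding word, even if it sticks out past the margin.
            current += run
        else:
            # Option 2 style: each space occupies a column, and a break
            # may fall between two spaces.
            for ch in run:
                if len(current) + 1 > width:
                    lines.append(current)
                    current = ""
                current += ch
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    sample = "abcd   efg"  # "abcd<space1><space2><space3>efg"
    for hang in (True, False):
        print(f"spaces_hang={hang}")
        for line in break_lines(sample, 5, hang):
            print(repr(line))
```

With a width of 5 columns, the hanging variant keeps all three spaces on the first line, so "efg" starts flush at the margin; the non-hanging variant carries two of the spaces over to the start of the second line, which is the rendering Mark describes as unexpected.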
Received on Monday, 2 April 2012 02:07:29 UTC