
Re: Ideographic Space, word-spacing, and justification

From: Asmus Freytag <asmusf@ix.netcom.com>
Date: Fri, 31 Oct 2008 18:35:58 -0700
Message-ID: <490BB27E.7020601@ix.netcom.com>
To: Steve Deach <sdeach@adobe.com>
CC: Martin Duerst <duerst@it.aoyama.ac.jp>, "KOBAYASHI Tatsuo(FAMILY Given)" <tlk@kobysh.com>, fantasai <fantasai.lists@inkedblade.net>, WWW International <www-international@w3.org>, Paul Nelson <paulnel@winse.microsoft.com>, Michel Suignard <michel@unicode.org>

On 10/31/2008 3:01 PM, Steve Deach wrote:
> Replies interleaved below.
>
>
> On 2008.10.31 14:30, "Asmus Freytag" <asmusf@ix.netcom.com> wrote:
>
>   
>> On 10/31/2008 1:50 PM, Steve Deach wrote:
>>     
>>> Exactly what I said under WSA/WSR. In some languages, this is used for
>>> emphasis.
>>>       
>> My point was that because they conflict visually, when WSA is used,
>> letterspacing should either not be permitted, should be contraction
>> only, or should be kept below a threshold that prevents it from being
>> confused with WSA.
>>     
>
> No, WSA is the emphasis usage. It is an explicit style override on a span of
> a fixed amount (basically CSS-2.0 letterspacing). It is applied first,
> before any line-breaking decision is made. If the line then requires
> justification, traditional letterspacing/wordspacing (as is in XSL &
> proposed in CSS3) is applied in addition. (see more detail below)
>   
OK, so it looks like you didn't get what I tried to get across:

Readers t h a t expect WSA
used  for e m p h a s i s have
a    h   a    r   d     t   i   m   e
understanding  text  that also
uses letterspacing.

If you can't read the sentence above, let alone tell where the 
*emphasis* is supposed to go, you know what I mean.

To put this in the context of cultural conventions:

English readers are unlikely to encounter WSA used for  e m p h a s i s. 
For such readers, text that is formatted with large amounts of 
letterspacing added to fit narrow columns is (apparently) acceptable 
without any problem, even though in some situations, like narrow 
columns, the type color can vary drastically.

German and other readers who expect that space added between letters 
denotes emphasis will have a harder time reading text that uses (more 
than minimal) expansion of interletter space for justification. To such 
readers, the justified text may appear to be *randomly* emphasized. The 
user can't determine what caused the additional space between the 
letters. He or she can only note that some words appear lighter than 
others because of space between the letters.

Your usage rules below are culturally blind - they merely attempt to 
regulate how a user agent processes different instructions present in 
the formatting. They don't address the point I'm trying to make here, 
which is that combining WSA and letterspacing is a bad idea, and even 
using any noticeable degree of letterspacing in cultural contexts where 
other texts use WSA will be confusing.
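
One way to picture the "keep it below a threshold" option is the rough
sketch below. Everything in it is an assumption made for illustration:
the function name, the language set (only German is attested in this
thread), and the 0.02 em figure, which is a placeholder rather than a
measured limit.

```python
# Languages whose readers expect s p a c e d letters to signal emphasis.
# Assumption: only German ("de") is attested here; extend with care.
SPACED_EMPHASIS_LANGS = {"de"}


def letterspacing_cap(lang: str, default_max_em: float,
                      tight_max_em: float = 0.02) -> float:
    """Return the largest letterspacing expansion (in em) that
    justification may add for text in the given language.

    tight_max_em is a hypothetical placeholder threshold.
    """
    primary = lang.split("-")[0].lower()  # "de-AT" -> "de"
    if primary in SPACED_EMPHASIS_LANGS:
        # Never allow more expansion than the tight threshold.
        return min(default_max_em, tight_max_em)
    return default_max_em
```

A formatter following this idea would take up any remaining slack with
wordspacing or hyphenation instead of widening the letters further.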

"Traditional" electronic publishing has consistently gotten this issue 
wrong. I've encountered many badly typeset documents that suffered from 
having been typeset by software designed for Anglo-Saxon typography. 
(The effect is exacerbated whenever the software doesn't support the 
hyphenation necessary to deal with those German monster compound words, 
or when the defaults are set to disable hyphenation).

A./
>   
>>> There are "country" differences, "language" differences, "script", and
>>> "wild hare" (random designer-/instance-specific) differences in
>>> everything related to text composition (styling & layout).
>>>       
>> Agreed - it is helpful, though, to amass as much detailed input on known
>> systematic differences in typical text usage conventions. I find that
>> much more helpful than merely saying "watch out - something may depend
>> on something".
>>     
>
> --Sorry, I don't understand the referents for the first sentence. I think
> you are saying that you can't "automate" this processing, which I basically
> agree with, because you can't fully determine the aggregate of
> language/country/... combinations in a systematic manner. Since at least
> one of these parameters is "wild hare" (and perfectly resolving
> reader/designer conflicts in locale is impossible), you need explicit
> options for some of these styling decisions, because even in a
> well-defined locale there may be 2 or more viable choices, or the
> non-standard choice may be intended to attract attention (or serve some
> other designer reason).
>
> --I'll answer the second sentence:
>   In this case, all the adjustments are "additive". For any property that
> has a single value or an optimal value (in a min/opt/max triplet), the
> single/optimal value is applied before making the line-break decision; then
> the min/max values are used to determine a "window" on the break-point; then
> once the break is chosen, the min/max values are used to readjust the
> spacing to make the line justify. (It's been that way in traditional
> publishing systems for 32 years, at least.)
>
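The three steps in the quoted paragraph can be sketched roughly as
follows. This is a simplified illustration, not the algorithm of any
particular publishing system: it handles wordspacing only, treats words
as fixed-width boxes, and all names (Triplet, choose_break,
justified_space) are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Triplet:
    """A min/opt/max spacing value, all in the same unit."""
    minimum: float
    optimum: float
    maximum: float


def line_width(word_widths, space_width):
    """Width of the words set with one space_width gap between neighbours."""
    gaps = max(len(word_widths) - 1, 0)
    return sum(word_widths) + gaps * space_width


def choose_break(word_widths, space, measure):
    """Return how many leading words to put on the line, or None.

    A break is a candidate when the measure lies inside the window
    spanned by the line at minimum and at maximum spacing. Among the
    candidates, pick the one whose optimum-spaced width is closest to
    the measure.
    """
    candidates = []
    for k in range(1, len(word_widths) + 1):
        prefix = word_widths[:k]
        shrunk = line_width(prefix, space.minimum)
        stretched = line_width(prefix, space.maximum)
        if shrunk <= measure <= stretched:
            candidates.append(k)
    if not candidates:
        return None
    return min(candidates,
               key=lambda k: abs(line_width(word_widths[:k], space.optimum)
                                 - measure))


def justified_space(word_widths, space, measure):
    """Space width that makes the chosen line fill the measure,
    clamped to the [minimum, maximum] range."""
    gaps = len(word_widths) - 1
    if gaps <= 0:
        return space.optimum
    needed = (measure - sum(word_widths)) / gaps
    return min(max(needed, space.minimum), space.maximum)
```

For example, with word widths [30, 20, 25, 40], a space triplet of
(3, 5, 9) and a measure of 90, only the three-word break falls inside
the window, and the space is then widened from 5 to 7.5 to fill the
line.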
>   
>> A./
>>     
>>> (However, 20+ years ago, no one was very careful about those
>>> distinctions; so I think I used [or intended to use] script/language
>>> in the message below to indicate the distinctions were fuzzy. The same
>>> comment applies to "letter", "character", vs. "glyph"; so read my
>>> email using the traditional "fuzzy" definitions vs. the current ones.)
>>>
>>> On 2008.10.31 12:46, "Asmus Freytag" <asmusf@ix.netcom.com> wrote:
>>>
>>>       
>>>> An aside on "letterspacing":
>>>>
>>>> The use of this is language dependent! (Not just script dependent).
>>>>
>>>> In German, increased letterspace for e m p h a s i s
>>>> (like this) has traditionally been used with both Fraktur and roman
>>>> style fonts. The practice is apparently still alive and well, because
>>>> you find it used in electronic forums on the web - a rather modern use of
>>>> text. Letterspacing, unless kept below very tight thresholds, is
>>>> therefore c o n f u s i n g to readers expecting it to denote emphasis.
>>>>
>>>> Other Northern European languages may have similar issues, but I don't
>>>> have first hand knowledge of current practices.
>>>>
>>>> A./
>>>>
>>>> On 10/31/2008 12:04 PM, Steve Deach wrote:
>>>>         
>>>>> Every few years this issue comes back up. Unfortunately, I can't find the
>>>>> rather long treatise I wrote the last time.
>>>>>
>>>>> In general, I agree with Martin, that one should use styling properties as a
>>>>> replacement for most of the "layout" uses of space characters (just as one
>>>>> should use tables in place of most uses of tabs). That said, I would like to
>>>>> briefly summarize the traditional (pre-DTP) handling of spaces and spacing,
>>>>> and comment on "what I believe" to be the correct handling.
>>>>>
>>>>> Second, I agree that the handling of letterspacing and wordspacing varies by
>>>>> script and in some cases usage within a script, due to historic/cultural
>>>>> differences in preferences/aesthetics, or specific readability requirements
>>>>> for the usage, and the aesthetic desires of the designer.
>>>>>
>>>>>
>>>>>
>>>>> This is a partial reconstruction of my prior emails on this topic.
>>>>>
>>>>> My terminology:
>>>>> "Spacing": an adjustment to the distance between 2 glyphs/characters.
>>>>> "Space": a character which has a width but no visible inked representation.
>>>>> "Letterspacing": an adjustment to the intercharacter spacing used for
>>>>> line justification. [This definition differs from CSS's.]
>>>>> "Wordspacing": an adjustment to the width of an interword space, also
>>>>> used for line justification.
>>>>> "WhiteSpaceAddition/Reduction (WSA/WSR)": a uniform adjustment to
>>>>> intercharacter spacing that is applied for design purposes or
>>>>> emphasis. [This corresponds most closely to the CSS-2.0 definition
>>>>> of letterspacing. Most DTP applications call this "Tracking".]
>>>>> "Tracking": an adjustment to intercharacter spacing which varies by
>>>>> fontsize/pointsize that is used to increase readability when
>>>>> optical sizing is not provided by the font. [This traditional
>>>>> definition differs from that used in most DTP applications.]
>>>>>
>>>>>
>>>>> In setting Roman text:
>>>>> Letterspacing is not generally applied to Arabic (and other
>>>>> connected-letter scripts/languages), nor to connected-letter ("script") faces
>>>>> in Roman-derivative scripts.
>>>>> Letterspacing is not generally applied to ideographic or similar
>>>>> monospaced scripts, nor to monospaced text in Roman-derivative environments.
>>>>> Traditional applications varied widely in the algorithms used for
>>>>> weighting how much of a justification adjustment was applied to wordspacing
>>>>> vs to letterspacing. Most modern systems treat them as linear-proportional.
>>>>> Traditional publishing applications were also at odds over whether the
>>>>> letterspacing adjustment AND the wordspacing adjustment should both be
>>>>> applied to the space/NbSp characters, but most modern systems apply both.
>>>>> The Unicode NbSp (u+00a0) character should be treated the same as the
>>>>> Unicode Space (u+0020). [In traditional publishing systems, these are
>>>>> variable width in justified lines and fixed width in "aligned", tabular, and
>>>>> math uses. However, some traditional publishing systems treat all space
>>>>> characters prior to the first non-space in a line as fixed width.]
>>>>> The FigureSpace (u+2007), and PunctuationSpace (u+2008) are treated the
>>>>> same way the corresponding figure '0' and punctuation period/full stop would
>>>>> be treated in the current layout context (justified vs
>>>>> aligned/tabular/math).
>>>>> Some traditional publishing systems had a quad-space and a
>>>>> justifying-space (sometimes called a 'spaceband' rather than 'justifying
>>>>> space'). Use of the quad-space within justified text would force the fixed
>>>>> nominal-width of the normal interword space character, disabling
>>>>> justification adjustments. This encoding concept has no analogy in Unicode.
>>>>> All other space characters {EM-space, EN, EM-quad, EN-quad, 3/EM, 4/EM,
>>>>> 6/EM, Thin, & Hair} are treated as fixed width and are not adjusted for
>>>>> letterspacing nor for wordspacing. (Traditional publishing systems used
>>>>> these for alignment/layout and did not generally apply tracking nor WSA/WSR
>>>>> either.)
>>>>>
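The space-character rules quoted above amount to a small lookup table.
The sketch below is one possible reading of them, not a normative
mapping: the SpaceKind names and the space_kind helper are invented
for illustration, and the code points for the fixed-width group are the
Unicode characters that the abbreviated names in the list appear to
refer to.

```python
from enum import Enum


class SpaceKind(Enum):
    JUSTIFYING = "variable width in justified lines"
    LIKE_GLYPH = "treated like the matching digit or full stop"
    FIXED = "fixed width, never adjusted"


SPACE_KINDS = {
    "\u0020": SpaceKind.JUSTIFYING,  # SPACE
    "\u00A0": SpaceKind.JUSTIFYING,  # NO-BREAK SPACE, same as SPACE
    "\u2007": SpaceKind.LIKE_GLYPH,  # FIGURE SPACE, follows figure '0'
    "\u2008": SpaceKind.LIKE_GLYPH,  # PUNCTUATION SPACE, follows '.'
    "\u2000": SpaceKind.FIXED,       # EN QUAD
    "\u2001": SpaceKind.FIXED,       # EM QUAD
    "\u2002": SpaceKind.FIXED,       # EN SPACE
    "\u2003": SpaceKind.FIXED,       # EM SPACE
    "\u2004": SpaceKind.FIXED,       # THREE-PER-EM SPACE
    "\u2005": SpaceKind.FIXED,       # FOUR-PER-EM SPACE
    "\u2006": SpaceKind.FIXED,       # SIX-PER-EM SPACE
    "\u2009": SpaceKind.FIXED,       # THIN SPACE
    "\u200A": SpaceKind.FIXED,       # HAIR SPACE
}


def space_kind(ch: str):
    """Return the SpaceKind for a space character, or None if the
    character is not one of the spaces listed in the table."""
    return SPACE_KINDS.get(ch)
```

The table deliberately leaves out the quad-space/spaceband distinction,
since the quoted text says it has no Unicode counterpart.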
>>>>> Ideographic languages/scripts do not generally use wordspacing or
>>>>> letterspacing to adjust justification; instead they typically use rules akin
>>>>> to those described in JIS-4051 (latest). This algorithm involves trimming
>>>>> some characters to half-width, then reinserting 1/2 & 1/4-em spacing
>>>>> adjustments at selected points within the line.
>>>>> Under these rules, Ideographic-space is treated as an ideographic letter
>>>>> [generally fixed-fullwidth, but has some specific additional rules], and not
>>>>> as a roman variable space.
>>>>> It should be a styling option of whether Roman text embedded in
>>>>> Ideographic text is set using Roman algorithms or Japanese/Chinese
>>>>> algorithms. Depending on the publication and the publisher, Roman text may
>>>>> be set proportional (using Roman or Asian justification rules), halfwidth,
>>>>> or fullwidth. (Similarly, they may choose Asian or Roman word-breaking and
>>>>> hyphenation rules.)
>>>>>
>>>>> I have not covered any specifics in the handling of ancient languages that
>>>>> are generally only of academic interest; nor the handling of Arabic and
>>>>> Arabic-derivative scripts; nor Indic; nor certain other language-specific
>>>>> differences (such as adjustments to spaces on sentence boundaries in some
>>>>> uses, nor after certain punctuation characters in French and other
>>>>> languages).
>>>>>
>>>>> I have also not addressed the handling of "hanging punctuation" and "hanging
>>>>> spaces"; though there are different philosophies/algorithms for handling
>>>>> these across the various script families.
>>>>>
>>>>> -- S.Deach
>>>>> sdeach@adobe.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 2008.10.31 02:43, "Martin Duerst" <duerst@it.aoyama.ac.jp> wrote:
>>>>>
>>>>>
>>>>>           
>>>>>> Hello everybody,
>>>>>>
>>>>>> Just a bit of a wider background on full-width space.
>>>>>>
>>>>>> It should be remembered that in contrast to the usual space (U+0020),
>>>>>> which occurs all over the place in texts in most languages, the
>>>>>> full-width space doesn't occur AT ALL in typical Japanese (or Chinese)
>>>>>> texts. That's why it also barely occurs in the document written
>>>>>> by the Japanese Layout TF, as well as in JIS 4051.
>>>>>>
>>>>>> The full-width space is more used for layout than inside the actual
>>>>>> text. In this respect, what CSS should do is to mainly look at
>>>>>> Japanese typography and try to come up with properties that allow
>>>>>> to get rid of full-width spaces in the text, rather than spending
>>>>>> too much time on how to treat full-width space.
>>>>>>
>>>>>> As a typical example, I guess lead typesetting and also definitely
>>>>>> simple approaches to typesetting on the computer, such as plain
>>>>>> text or old "word-processors" (which were not very much above
>>>>>> plain text in their capabilities) use a full-width space to produce
>>>>>> a start-of-paragraph indent (which is very often one full-width
>>>>>> character wide). CSS should make sure that there is no need to
>>>>>> insert such full-width spaces, because an exact one-full-width-
>>>>>> character start-of-paragraph indent can be produced with an
>>>>>> appropriate CSS property setting.
>>>>>>
>>>>>> Another typical use of full-width space was to center text,
>>>>>> and to insert spaces into text for headlines (to a large
>>>>>> extent a crude backup for increasing text size, which wasn't
>>>>>> possible when technology was limited to one or two bit-mapped
>>>>>> font sizes). In this case, inter-character spacing property(/ies)
>>>>>> may be important for 'facsimile' layouts, but with modern
>>>>>> technology, such layout isn't much used anymore anyway.
>>>>>>
>>>>>> Regards, Martin.
>>>>>>
>>>>>> At 18:31 08/10/30, KOBAYASHI Tatsuo(FAMILY Given) wrote:
>>>>>>
>>>>>>             
>>>>>>> Hi, Erica,
>>>>>>>
>>>>>>> In Japanese Layout, "spacing issue" is one of the most difficult issues to
>>>>>>> treat.
>>>>>>> We intended to carefully eliminate concrete character names like IDEOGRAPHIC
>>>>>>> SPACE(U+3000) and SPACE(U+0020) from our requirement. Rather, we introduced
>>>>>>> three different types of abstract space concepts as follows:
>>>>>>>
>>>>>>> inter character space: usual 1/2 em fixed space.
>>>>>>> conditional space: 1/2 em fixed space to be inserted or pulled off between
>>>>>>> characters and punctuation marks.
>>>>>>> adjustable space: variable width space, behaves like usual western variable
>>>>>>> space.
>>>>>>>
>>>>>>> Note that, usual Japanese punctuation marks have 1/2 em width in our
>>>>>>> requirement, even if the character name might include "FULLWIDTH ~~~"
>>>>>>>
>>>>>>> Anyway, the decision how to deal with these spaces in CSS recommendation
>>>>>>> and in actual implementation is up to your side:-)
>>>>>>>
>>>>>>> regards,
>>>>>>> Tatsuo
>>>>>>>
>>>>>>> 2008/10/30 Steve Deach <sdeach@adobe.com>
>>>>>>>> No, in my personal opinion, it should not.
>>>>>>>> The 2 differences between normal space/nbsp vs ideographic space are:
>>>>>>>> 1.) The normal width is different, and
>>>>>>>> 2.) The normal space/nbsp is treated as justifying
>>>>>>>> (adjusted by both wordspacing and letterspacing),
>>>>>>>> whereas the Ideographic space should only be adjusted by
>>>>>>>> letterspacing (only if ideographic letters are also so adjusted).
>>>>>>>>
>>>>>>>> However, I will re-confirm this with our CJK experts, before claiming this
>>>>>>>> is an Adobe opinion.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2008.10.29 15:13, "fantasai" <fantasai.lists@inkedblade.net> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> The CSSWG would like to know whether the IDEOGRAPHIC SPACE U+3000
>>>>>>>>> should be affected by 'word-spacing', and whether it should be
>>>>>>>>> treated as a space during spaces-only justification or treated as
>>>>>>>>> a typical ideographic punctuation character.
>>>>>>>>>
>>>>>>>>> ~fantasai
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                   
>>>>>>> --
>>>>>>> KOBAYASHI Tatsuo
>>>>>>> Scholex Co., Ltd. Yokohama
>>>>>>> JUSTSYSTEM Digital Culture Research Center
>>>>>>>
>>>>>>>               
>>>>>> #-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>>>>>> #-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
>>>>>>
>>>>>>
>>>>>>             
>>>>>
>>>>>
>>>>>
>>>>>           
>
>
>
>   
Received on Saturday, 1 November 2008 01:36:48 GMT
