
Re: Ideographic Space, word-spacing, and justification

From: Steve Deach <sdeach@adobe.com>
Date: Fri, 31 Oct 2008 15:01:32 -0700
To: Asmus Freytag <asmusf@ix.netcom.com>
CC: Martin Duerst <duerst@it.aoyama.ac.jp>, "KOBAYASHI Tatsuo(FAMILY Given)" <tlk@kobysh.com>, fantasai <fantasai.lists@inkedblade.net>, WWW International <www-international@w3.org>, Paul Nelson <paulnel@winse.microsoft.com>, Michel Suignard <michel@unicode.org>
Message-ID: <C530CE4C.3EB6%sdeach@adobe.com>

Replies interleaved below.

On 2008.10.31 14:30, "Asmus Freytag" <asmusf@ix.netcom.com> wrote:

> On 10/31/2008 1:50 PM, Steve Deach wrote:
>> Exactly what I said under WSA/WSR. In some languages, this is used for
>> emphasis.
> My point was that because they conflict visually, when WSA is used,
> letterspacing should either not be permitted, should be contraction
> only, or should be kept below a threshold that prevents it from being
> confused with WSA.

No, WSA is the emphasis usage. It is an explicit style override on a span of
a fixed amount (basically CSS-2.0 letterspacing). It is applied first,
before any line-breaking decision is made. If the line then requires
justification, traditional letterspacing/wordspacing (as is in XSL &
proposed in CSS3) is applied in addition. (see more detail below)

>> There are "country" differences, "language" differences, "script", and
>> "wild hare" (random designer-/instance-specific) differences in
>> everything related to text composition (styling & layout).
> Agreed - it is helpful, though, to amass as much detailed input on known
> systematic differences in typical text usage conventions. I find that
> much more helpful than merely saying "watch out - something may depend
> on something".

--Sorry, I don't understand the referents for the first sentence. I think
you are saying that you can't "automate" this processing, which I basically
agree with, because you can't fully determine the aggregate of
language/country/... combinations in a systematic manner. Since at least
one of these parameters is "wild hare" (and perfectly resolving
reader/designer conflicts in locale is impossible), you need explicit
options for some of these styling decisions, because even in a
well-defined locale there may be 2 or more viable choices, or the
non-standard choice may be intended to attract attention (or be chosen
for some other designer reason).

--I'll answer the second sentence:
  In this case, all the adjustments are "additive". For any property that
has a single value or an optimal value (in a min/opt/max triplet), the
single/optimal value is applied before making the line-break decision; then
the min/max values are used to determine a "window" on the break-point; then
once the break is chosen, the min/max values are used to readjust the
spacing to make the line justify. (It's been that way in traditional
publishing systems for 32 years, at least.)
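
The sequence described above (optimum value first, min/max as a window on
the break point, then min/max again to justify the chosen line) can be
sketched in Python roughly as follows. This is only an illustration under
simplifying assumptions, not code from any actual publishing system: the
names `Triplet` and `break_and_justify` are hypothetical, word widths are
assumed to already include any fixed WSA/WSR, only wordspacing is adjusted,
and the break is picked as the candidate whose optimum-spaced width lies
closest to the measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """A min/opt/max spacing value (hypothetical name)."""
    minimum: float
    optimum: float
    maximum: float


def break_and_justify(word_widths, space, measure):
    """Pick a break for the first line, then justify it.

    Returns (words_on_line, justified_space_width), or None when no
    break point falls inside the min/max window.
    """
    candidates = []
    for count in range(1, len(word_widths) + 1):
        text = sum(word_widths[:count])
        gaps = count - 1
        # Too long even with every space at its minimum: stop looking.
        if text + gaps * space.minimum > measure:
            break
        # Inside the window: max-width spaces can reach the measure.
        if gaps > 0 and text + gaps * space.maximum >= measure:
            natural = text + gaps * space.optimum
            candidates.append((abs(measure - natural), count))
    if not candidates:
        return None
    _, count = min(candidates)
    gaps = count - 1
    # Readjust the spaces so the chosen line exactly fills the measure.
    width = (measure - sum(word_widths[:count])) / gaps
    return count, width
```

With word widths 30, 20, 25, 15, 40, a space triplet of 2/4/8 and a measure
of 100, only the four-word line falls inside the window, and its three
spaces are then set to 10/3 units each.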

> A./
>> (However, 20+ years ago, no one was very careful about those
>> distinctions; so I think I used [or intended to use] script/language
>> in the message below to indicate the distinctions were fuzzy. The same
>> comment applies to "letter", "character", vs. "glyph"; so read my
>> email using the traditional "fuzzy" definitions vs. the current ones.)
>> On 2008.10.31 12:46, "Asmus Freytag" <asmusf@ix.netcom.com> wrote:
>>> An aside on "letterspacing":
>>> The use of this is language dependent! (Not just script dependent).
>>> In German, the use of increased letterspace for e m p h a s i s
>>> (like this) has traditionally been used with both Fraktur and roman
>>> style fonts. The practice is apparently still alive and well, because
>>> you find it used in electronic forums on the web - a rather modern use of
>>> text. Letterspacing, unless kept below very tight thresholds, is
>>> therefore c o n f u s i n g to readers expecting it to denote emphasis.
>>> Other Northern European languages may have similar issues, but I don't
>>> have first hand knowledge of current practices.
>>> A./
>>> On 10/31/2008 12:04 PM, Steve Deach wrote:
>>>> Every few years this issue comes back up. Unfortunately, I can't find
>>>> the rather long treatise I wrote the last time.
>>>> In general, I agree with Martin, that one should use styling properties
>>>> as a replacement for most of the "layout" uses of space characters (just
>>>> as one should use tables in place of most uses of tabs). That said, I
>>>> would like to briefly summarize the traditional (pre-DTP) handling of
>>>> spaces and spacing, and comment on "what I believe" to be the correct
>>>> handling.
>>>> Second, I agree that the handling of letterspacing and wordspacing varies
>>>> by script and in some cases usage within a script, due to
>>>> historic/cultural differences in preferences/aesthetics, or specific
>>>> readability requirements for the usage, and the aesthetic desires of the
>>>> designer.
>>>> This is a partial reconstruction of my prior emails on this topic.
>>>> My terminology:
>>>> "Spacing": an adjustment to the distance between 2 glyphs/characters.
>>>> "Space": a character which has a width but no visible inked
>>>> representation.
>>>> "Letterspacing": an adjustment to the intercharacter spacing used for
>>>> line justification. [This definition differs from CSS's.]
>>>> "Wordspacing": an adjustment to the width of an interword space, also
>>>> used for line justification.
>>>> "WhiteSpaceAddition/Reduction (WSA/WSR)": a uniform adjustment to
>>>> intercharacter spacing that is applied for design purposes or
>>>> emphasis. [This corresponds most closely to the CSS-2.0 definition
>>>> of letterspacing. Most DTP applications call this "Tracking".]
>>>> "Tracking": an adjustment to intercharacter spacing which varies by
>>>> fontsize/pointsize that is used to increase readability when
>>>> optical sizing is not provided by the font. [This traditional
>>>> definition differs from that used in most DTP applications.]
>>>> In setting Roman text:
>>>> Letterspacing is not generally applied to Arabic (and other
>>>> connected-letter scripts/languages), nor to connected-letter ("script")
>>>> faces in Roman-derivative scripts.
>>>> Letterspacing is not generally applied to ideographic or similar
>>>> monospaced scripts, nor to monospaced text in Roman-derivative
>>>> environments.
>>>> Traditional applications varied widely in the algorithms used for
>>>> weighting how much of a justification adjustment was applied to
>>>> wordspacing vs to letterspacing. Most modern systems treat them as
>>>> linear-proportional.
>>>> Traditional publishing applications were also at odds over whether the
>>>> letterspacing adjustment AND the wordspacing adjustment should both be
>>>> applied to the space/NbSp characters, but most modern systems apply both.
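
One way to read "linear-proportional" is that the slack on a line is divided
between word spaces and letter gaps in proportion to how much each is
allowed to stretch. The Python sketch below illustrates that reading only;
it is an assumption, not a description of any particular system. The
function name `distribute_slack` and its parameters are hypothetical, and
the letter-gap count is assumed to include the gaps next to space
characters, so that a space receives both adjustments.

```python
def distribute_slack(slack, n_word_spaces, n_letter_gaps,
                     word_stretch, letter_stretch):
    """Split line slack between wordspacing and letterspacing.

    word_stretch and letter_stretch are the per-item stretch allowances
    (assumed inputs). Returns the adjustment per word space and per
    letter gap, both scaled by the same ratio.
    """
    total = n_word_spaces * word_stretch + n_letter_gaps * letter_stretch
    if total == 0:
        # Nothing on the line is allowed to stretch.
        return 0.0, 0.0
    ratio = slack / total
    return ratio * word_stretch, ratio * letter_stretch
```

For example, 12 units of slack over 3 word spaces (allowance 4.0 each) and
20 letter gaps (allowance 0.3 each) uses two thirds of every allowance, and
the per-item adjustments add back up to the 12 units.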
>>>> The Unicode NbSp (U+00A0) character should be treated the same as the
>>>> Unicode Space (U+0020). [In traditional publishing systems, these are
>>>> variable width in justified lines and fixed width in "aligned", tabular,
>>>> and math uses. However, some traditional publishing systems treat all
>>>> space characters prior to the first non-space in a line as fixed width.]
>>>> The FigureSpace (U+2007) and PunctuationSpace (U+2008) are treated the
>>>> same way the corresponding figure '0' and punctuation period/full stop
>>>> would be treated in the current layout context (justified vs
>>>> aligned/tabular/math).
>>>> Some traditional publishing systems had a quad-space and a
>>>> justifying-space (sometimes called a 'spaceband' rather than 'justifying
>>>> space'). Use of the quad-space within justified text would force the
>>>> fixed nominal-width of the normal interword space character, disabling
>>>> justification adjustments. This encoding concept has no analogy in
>>>> Unicode.
>>>> All other space characters {EM-space, EN, EM-quad, EN-quad, 3/EM, 4/EM,
>>>> 6/EM, Thin, & Hair} are treated as fixed width and are not adjusted for
>>>> letterspacing nor for wordspacing. (Traditional publishing systems used
>>>> these for alignment/layout and did not generally apply tracking nor
>>>> either.)
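
The treatment of the individual space characters listed above can be
summarized as a small lookup, sketched here in Python. The function name
`space_treatment` and the three labels are hypothetical. The message gives
code points only for Space, NbSp, FigureSpace and PunctuationSpace; the
code points used for the fixed-width group (U+2000 to U+2006, U+2009,
U+200A) are the standard Unicode assignments for the names listed and are
filled in as an assumption about which characters were meant.

```python
# Variable width in justified lines (Space and NbSp).
JUSTIFYING = {0x0020, 0x00A0}

# Treated like the corresponding visible character in the current context.
LIKE_GLYPH = {0x2007: "0",   # FIGURE SPACE behaves like the digit zero
              0x2008: "."}   # PUNCTUATION SPACE behaves like a full stop

# EN QUAD .. SIX-PER-EM SPACE, plus THIN SPACE and HAIR SPACE: fixed width,
# not adjusted by letterspacing or wordspacing.
FIXED = set(range(0x2000, 0x2007)) | {0x2009, 0x200A}


def space_treatment(ch):
    """Return how a single character is treated during justification."""
    cp = ord(ch)
    if cp in JUSTIFYING:
        return "justifying"
    if cp in LIKE_GLYPH:
        return "like " + repr(LIKE_GLYPH[cp])
    if cp in FIXED:
        return "fixed"
    return "not covered"
```

Anything outside these three groups is reported as "not covered" rather than
guessed at.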
>>>> Ideographic languages/scripts do not generally use wordspacing or
>>>> letterspacing to adjust justification; instead they typically use rules
>>>> akin to those described in JIS-4051 (latest). This algorithm involves
>>>> trimming some characters to half-width, then reinserting 1/2 & 1/4-em
>>>> spacing adjustments at selected points within the line.
>>>> Under these rules, Ideographic-space is treated as an ideographic letter
>>>> [generally fixed-fullwidth, but has some specific additional rules], and
>>>> not as a roman variable space.
>>>> It should be a styling option of whether Roman text embedded in
>>>> Ideographic text is set using Roman algorithms or Japanese/Chinese
>>>> algorithms. Depending on the publication and the publisher, Roman text
>>>> may be set proportional (using Roman or Asian justification rules),
>>>> halfwidth, or fullwidth. (Similarly, they may choose Asian or Roman
>>>> word-breaking and hyphenation rules.)
>>>> I have not covered any specifics in the handling of ancient languages
>>>> that are generally only of academic interest; nor the handling of Arabic
>>>> and Arabic-derivative scripts; nor Indic; nor certain other
>>>> language-specific differences (such as adjustments to spaces on sentence
>>>> boundaries in some uses, nor after certain punctuation characters in
>>>> French and other languages).
>>>> I have also not addressed the handling of "hanging punctuation" and
>>>> "hanging spaces"; though there are different philosophies/algorithms for
>>>> handling these across the various script families.
>>>> -- S.Deach
>>>> sdeach@adobe.com
>>>> On 2008.10.31 02:43, "Martin Duerst" <duerst@it.aoyama.ac.jp> wrote:
>>>>> Hello everybody,
>>>>> Just a bit of a wider background on full-width space.
>>>>> It should be remembered that in contrast to the usual space (U+0020),
>>>>> which occurs all over the place in texts in most languages, the
>>>>> full-width space doesn't occur AT ALL in typical Japanese (or Chinese)
>>>>> texts. That's why it also barely occurs in the document written
>>>>> by the Japanese Layout TF, as well as in JIS 4051.
>>>>> The full-width space is more used for layout than inside the actual
>>>>> text. In this respect, what CSS should do is to mainly look at
>>>>> Japanese typography and try to come up with properties that allow
>>>>> to get rid of full-width spaces in the text, rather than spending
>>>>> too much time on how to treat full-width space.
>>>>> As a typical example, I guess lead typesetting and also definitely
>>>>> simple approaches to typesetting on the computer, such as plain
>>>>> text or old "word-processors" (which were not very much above
>>>>> plain text in their capabilities) use a full-width space to produce
>>>>> a start-of-paragraph indent (which is very often one full-width
>>>>> character wide). CSS should make sure that there is no need to
>>>>> insert such full-width spaces, because an exact one-full-width-
>>>>> character start-of-paragraph indent can be produced with an
>>>>> appropriate CSS property setting.
>>>>> Another typical use of full-width space was to center text,
>>>>> and to insert spaces into text for headlines (to a large
>>>>> extent a crude backup for increasing text size, which wasn't
>>>>> possible when technology was limited to one or two bit-mapped
>>>>> font sizes). In this case, inter-character spacing property(/ies)
>>>>> may be important for 'facsimile' layouts, but with modern
>>>>> technology, such layout isn't much used anymore anyway.
>>>>> Regards, Martin.
>>>>> At 18:31 08/10/30, KOBAYASHI Tatsuo(FAMILY Given) wrote:
>>>>>> Hi, Erica,
>>>>>> In Japanese Layout, "spacing issue" is one of the most difficult issues
>>>>>> to treat.
>>>>>> We intended to carefully eliminate concrete character names like
>>>>>> SPACE(U+3000) and SPACE(U+0020) from our requirement. Rather, we
>>>>>> introduced three different types of abstract space concepts, as follows:
>>>>>> inter character space: usual 1/2 em fixed space.
>>>>>> conditional space: 1/2 em fixed space to be inserted or pulled off
>>>>>> between characters and punctuation marks.
>>>>>> adjustable space: variable width space, behaves like usual western
>>>>>> variable space.
>>>>>> Note that usual Japanese punctuation marks have 1/2 em width in our
>>>>>> requirement, even if the character name might include "FULLWIDTH ~~~".
>>>>>> Anyway, the decision on how to deal with these spaces in the CSS
>>>>>> recommendation and in actual implementation is up to your side:-)
>>>>>> regards,
>>>>>> Tatsuo
>>>>>> 2008/10/30 Steve Deach <sdeach@adobe.com>
>>>>>>> No, in my personal opinion, it should not.
>>>>>>> The 2 differences between normal space/nbsp vs ideographic space are:
>>>>>>> 1.) The normal width is different, and
>>>>>>> 2.) The normal space/nbsp is treated as justifying
>>>>>>> (adjusted by both wordspacing and letterspacing),
>>>>>>> whereas the Ideographic space should only be adjusted by
>>>>>>> letterspacing (only if ideographic letters are also so adjusted).
>>>>>>> However, I will re-confirm this with our CJK experts, before claiming
>>>>>>> this is an Adobe opinion.
>>>>>>> On 2008.10.29 15:13, "fantasai" <fantasai.lists@inkedblade.net>
>>>>>>> wrote:
>>>>>>>> Hello,
>>>>>>>> The CSSWG would like to know whether the IDEOGRAPHIC SPACE U+3000
>>>>>>>> should be affected by 'word-spacing', and whether it should be
>>>>>>>> treated as a space during spaces-only justification or treated as
>>>>>>>> a typical ideographic punctuation character.
>>>>>>>> ~fantasai
>>>>>> --
>>>>>> KOBAYASHI Tatsuo
>>>>>> Scholex Co., Ltd. Yokohama
>>>>>> JUSTSYSTEM Digital Culture Research Center
>>>>> #-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>>>>> #-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
Received on Friday, 31 October 2008 22:02:42 GMT
