Re: Ideographic Space, word-spacing, and justification

From: Felix Sasaki <fsasaki@w3.org>
Date: Sun, 02 Nov 2008 00:57:21 +0900
Message-ID: <490C7C61.5050805@w3.org>
To: Steve Deach <sdeach@adobe.com>
CC: Martin Duerst <duerst@it.aoyama.ac.jp>, "KOBAYASHI Tatsuo(FAMILY Given)" <tlk@kobysh.com>, fantasai <fantasai.lists@inkedblade.net>, WWW International <www-international@w3.org>, Paul Nelson <paulnel@winse.microsoft.com>, Michel Suignard <michel@unicode.org>

Steve Deach wrote:
> Yes, I will re-review the document.

Great, thank you!

> I believe Steve Zilles has been representing Adobe on the JLTF, and I will
> discuss my comments with him.

Steve has already contributed a lot to this document, and we are looking
forward to getting more input from both of you.

> One of the concerns I raised in the thread has been the "misuse" or
> "mal-appropriation" of existing/traditional terminology (often
> unintentional, and usually based on inadequate exposure to prior art). This
> happened when traditional publishing was moved to DTP and is happening again
> as DTP-level styling and layout moves to the Web.
>   The situation gets worse when dealing with translations of technical terms
> across languages: nuance is often accidentally lost or a generic term gets
> substituted for a specialized one (or the reverse, a specialized term gets
> used as generic). 
>   The situation is also aggravated when a technology is viewed as
> proprietary (often treated as trade secrets, as was the situation in the
> traditional publishing industry in the U.S. in the 1970's-1980's), because
> the only way to learn the terminology is to work for several companies in
> the field so you can sort out what is common from what is "extended" by a
> given vendor.

What you are describing is clearly a main issue in the JLTF document. We
were (are) trying to address this by

a) describing a terminology which is *currently* in use in Japanese
typography ("existing/traditional terminology" problem)
b) in the JLTF document *not* relating the terminology to web technology
("DTP-level styling and layout moves to the Web" problem), but leaving
that to a possible subsequent document
c) using translations only when the meaning is clear in English, and
otherwise keeping the Japanese term ("translation" problem)

I think we do not have the proprietary problem in the JLTF document, but
I might be wrong.


> On 2008.11.01 01:09, "Felix Sasaki" <fsasaki@w3.org> wrote:
>> Steve, all,
>> Tatsuo already pointed to the JLTF document in this thread, but I do it
>> again, with a specific aspect in mind: terminology. See
>> http://www.w3.org/TR/2008/WD-jlreq-20081015/#terminology-en
>> http://www.w3.org/TR/2008/WD-jlreq-20081015/ja/#terminology-ja
>> It would be great to get your feedback on this terminology (by Nov.
>> 15), which is also used throughout the document(s). As Tatsuo already
>> noted, the document describes an ideal of what Japanese layout should
>> look like, and not historic variants. It also aims to be technology
>> independent, so do not expect specific discussions on how something
>> should be implemented, e.g. "styling properties versus use of space
>> characters". We leave it to you guys to argue about that ... ;)
>> Felix
>> Steve Deach wrote:
>>> Every few years this issue comes back up. Unfortunately, I can't find the
>>> rather long treatise I wrote the last time.
>>> In general, I agree with Martin, that one should use styling properties as a
>>> replacement for most of the "layout" uses of space characters (just as one
>>> should use tables in place of most uses of tabs). That said, I would like to
>>> briefly summarize the traditional (pre-DTP) handling of spaces and spacing,
>>> and comment on "what I believe" to be the correct handling.
>>> Second, I agree that the handling of letterspacing and wordspacing varies by
>>> script and in some cases usage within a script, due to historic/cultural
>>> differences in preferences/aesthetics, or specific readability requirements
>>> for the usage, and the aesthetic desires of the designer.
>>> This is a partial reconstruction of my prior emails on this topic.
>>> My terminology:
>>>   "Spacing" an adjustment to the distance between 2 glyphs/characters.
>>>   "Space" a character which has a width but no visible inked representation.
>>>   "Letterspacing" an adjustment to the intercharacter spacing used for
>>>      line justification. [This definition differs from CSS's.]
>>>   "Wordspacing" an adjustment to the width of an interword space, also
>>>      used for line justification.
>>>   "WhiteSpaceAddition/Reduction (WSA/WSR)" a uniform adjustment to
>>>      intercharacter spacing that is applied for design purposes or
>>>      emphasis. [This corresponds most closely to the CSS-2.0 definition
>>>      of letterspacing. Most DTP applications call this "Tracking".]
>>>   "Tracking" an adjustment to intercharacter spacing which varies by
>>>      fontsize/pointsize that is used to increase readability when
>>>      optical sizing is not provided by the font. [This traditional
>>>      definition differs from that used in most DTP applications.]
>>> In setting Roman text:
>>>   Letterspacing is not generally applied to Arabic (and other
>>> connected-letter scripts/languages, nor to connected letter ("script") faces
>>> in Roman-derivative scripts)
>>>   Letterspacing is not generally applied to ideographic or similar
>>> monospaced scripts, nor to monospaced text in Roman-derivative environments.
>>>   Traditional applications varied widely in the algorithms used for
>>> weighting how much of a justification adjustment was applied to wordspacing
>>> vs to letterspacing. Most modern systems treat them as linear-proportional.
>>>   Traditional publishing applications were also at odds over whether the
>>> letterspacing adjustment AND the wordspacing adjustment should both be
>>> applied to the space/NbSp characters, but most modern systems apply both.
>>>   The Unicode NbSp (u+00a0) character should be treated the same as the
>>> Unicode Space (u+0020). [In traditional publishing systems, these are
>>> variable width in justified lines and fixed width in "aligned", tabular, and
>>> math uses. However, some traditional publishing systems treat all space
>>> characters prior to the first non-space in a line as fixed width.]
>>>   The FigureSpace (u+2007), and PunctuationSpace (u+2008) are treated the
>>> same way the corresponding figure '0' and punctuation period/full stop would
>>> be treated in the current layout context (justified vs
>>> aligned/tabular/math).
>>>   Some traditional publishing systems had a quad-space and a
>>> justifying-space (sometimes called a 'spaceband' rather than 'justifying
>>> space'). Use of the quad-space within justified text would force the fixed
>>> nominal-width of the normal interword space character, disabling
>>> justification adjustments. This encoding concept has no analogy in Unicode.
>>>   All other space characters {EM-space, EN, EM-quad, EN-quad, 3/EM, 4/EM,
>>> 6/EM, Thin, & Hair} are treated as fixed width and are not adjusted for
>>> letterspacing nor for wordspacing. (Traditional publishing systems used
>>> these for alignment/layout and did not generally apply tracking nor WSA/WSR
>>> either.)
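
The space-handling rules listed above for Roman text can be sketched in code. This is a minimal illustration, not taken from any of the systems discussed: the names `SpaceClass`, `SPACE_CLASSES`, and `distribute_slack`, the `word_share` parameter and its default value, and the model of adding extra advance after a character are all assumptions made for the example. It also ignores the exception for spaces at the start of a line mentioned above.

```python
from enum import Enum


class SpaceClass(Enum):
    JUSTIFYING = "justifying"  # widened by wordspacing and letterspacing
    LIKE_GLYPH = "like-glyph"  # treated like '0' or '.', so letterspacing only
    FIXED = "fixed"            # never adjusted


SPACE_CLASSES = {
    "\u0020": SpaceClass.JUSTIFYING,  # SPACE
    "\u00A0": SpaceClass.JUSTIFYING,  # NO-BREAK SPACE
    "\u2007": SpaceClass.LIKE_GLYPH,  # FIGURE SPACE
    "\u2008": SpaceClass.LIKE_GLYPH,  # PUNCTUATION SPACE
}
# EN QUAD through SIX-PER-EM SPACE, then THIN SPACE and HAIR SPACE.
for cp in [*range(0x2000, 0x2007), 0x2009, 0x200A]:
    SPACE_CLASSES[chr(cp)] = SpaceClass.FIXED


def distribute_slack(line, slack, word_share=0.75):
    """Return the extra advance given to each character of `line`.

    `slack` is the width still needed to fill the line. `word_share`
    (a hypothetical knob) is the fraction given to wordspacing; the
    rest goes to letterspacing, i.e. a fixed linear proportion.
    """
    classes = [SPACE_CLASSES.get(ch) for ch in line]
    # Wordspacing widens each justifying space.
    word_slots = [i for i, c in enumerate(classes)
                  if c is SpaceClass.JUSTIFYING]
    # Letterspacing goes into every gap that does not touch a fixed space.
    gap_slots = [i for i in range(len(line) - 1)
                 if SpaceClass.FIXED not in (classes[i], classes[i + 1])]
    extra = [0.0] * len(line)
    if not word_slots and not gap_slots:
        return extra  # nothing on this line can be adjusted
    if not word_slots:
        word_share = 0.0
    elif not gap_slots:
        word_share = 1.0
    for i in word_slots:
        extra[i] += slack * word_share / len(word_slots)
    for i in gap_slots:
        extra[i] += slack * (1.0 - word_share) / len(gap_slots)
    return extra
```

For example, `distribute_slack("ab cd", 6.0, word_share=0.5)` returns `[0.75, 0.75, 3.75, 0.75, 0.0]`: the single interword space receives its wordspacing share plus the same letterspacing as every other gap, which matches the "apply both" behaviour described above.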
>>> Ideographic languages/scripts do not generally use wordspacing or
>>> letterspacing to adjust justification; instead they typically use rules akin
>>> to those described in JIS-4051 (latest). This algorithm involves trimming
>>> some characters to half-width, then reinserting 1/2 & 1/4-em spacing
>>> adjustments at selected points within the line.
>>>   Under these rules, Ideographic-space is treated as an ideographic letter
>>> [generally fixed-fullwidth, but has some specific additional rules], and not
>>> as a roman variable space.
>>>   It should be a styling option of whether Roman text embedded in
>>> Ideographic text is set using Roman algorithms or Japanese/Chinese
>>> algorithms. Depending on the publication and the publisher, Roman text may
>>> be set proportional (using Roman or Asian justification rules), halfwidth,
>>> or fullwidth. (Similarly, they may choose Asian or Roman word-breaking and
>>> hyphenation rules.)
>>> I have not covered any specifics in the handling of ancient languages that
>>> are generally only of academic interest; nor the handling of Arabic and
>>> Arabic-derivative scripts; nor Indic; nor certain other language-specific
>>> differences (such as adjustments to spaces on sentence boundaries in some
>>> uses not after certain punctuation characters in French and other
>>> languages).
>>> I have also not addressed the handling of "hanging punctuation" and "hanging
>>> spaces"; though there are different philosophies/algorithms for handling
>>> these across the various script families.
>>> -- S.Deach
>>>    sdeach@adobe.com
>>> On 2008.10.31 02:43, "Martin Duerst" <duerst@it.aoyama.ac.jp> wrote:
>>>> Hello everybody,
>>>> Just a bit of a wider background on full-width space.
>>>> It should be remembered that in contrast to the usual space (U+0020),
>>>> which occurs all over the place in texts in most languages, the
>>>> full-width space doesn't occur AT ALL in typical Japanese (or Chinese)
>>>> texts. That's why it also barely occurs in the document written
>>>> by the Japanese Layout TF, as well as in JIS 4051.
>>>> The full-width space is more used for layout than inside the actual
>>>> text. In this respect, what CSS should do is to mainly look at
>>>> Japanese typography and try to come up with properties that allow
>>>> to get rid of full-width spaces in the text, rather than spending
>>>> too much time on how to treat full-width space.
>>>> As a typical example, I guess lead typesetting and also definitely
>>>> simple approaches to typesetting on the computer, such as plain
>>>> text or old "word-processors" (which were not very much above
>>>> plain text in their capabilities) use a full-width space to produce
>>>> a start-of-paragraph indent (which is very often one full-width
>>>> character wide). CSS should make sure that there is no need to
>>>> insert such full-width spaces, because an exact one-full-width-
>>>> character start-of-paragraph indent can be produced with an
>>>> appropriate CSS property setting.
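
As an illustration of the point above, here is a minimal sketch of a converter that drops a leading U+3000 used as a paragraph indent and expresses the indent with the CSS `text-indent` property instead. The function name `paragraph_to_html` and the choice of an inline `style` attribute are assumptions for the example, and it assumes that `1em` corresponds to one full-width character in the font in use.

```python
import html

IDEOGRAPHIC_SPACE = "\u3000"


def paragraph_to_html(paragraph: str) -> str:
    """Turn one plain-text paragraph into a <p> element.

    A single leading U+3000 is taken to be a start-of-paragraph indent
    and is replaced by a one-em CSS text-indent.
    """
    if paragraph.startswith(IDEOGRAPHIC_SPACE):
        body = paragraph[len(IDEOGRAPHIC_SPACE):]
        return f'<p style="text-indent: 1em">{html.escape(body)}</p>'
    # No layout space at the start: emit the paragraph unchanged.
    return f"<p>{html.escape(paragraph)}</p>"
```

Only the first U+3000 is removed, so text that uses several full-width spaces for centering or headline spacing is left alone.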
>>>> Another typical use of full-width space was to center text,
>>>> and to insert spaces into text for headlines (to a large
>>>> extent a crude backup for increasing text size, which wasn't
>>>> possible when technology was limited to one or two bit-mapped
>>>> font sizes). In this case, inter-character spacing property(/ies)
>>>> may be important for 'facsimile' layouts, but with modern
>>>> technology, such layout isn't much used anymore anyway.
>>>> Regards,   Martin.
>>>> At 18:31 08/10/30, KOBAYASHI Tatsuo(FAMILY Given) wrote:
>>>>> Hi, Erica,
>>>>> In Japanese Layout, "spacing issue" is one of the most difficult issues to
>>>>> treat.
>>>>> We deliberately eliminated concrete character names like IDEOGRAPHIC
>>>>> SPACE (U+3000) and SPACE (U+0020) from our requirements. Rather, we
>>>>> introduced three different types of abstract space concepts, as follows:
>>>>> inter character space: usual 1/2 em fixed space.
>>>>> conditional space: 1/2 em fixed space to be inserted or pulled off between
>>>>> characters and punctuation marks.
>>>>> adjustable space: variable width space, behaves like usual western variable
>>>>> space.
>>>>> Note that the usual Japanese punctuation marks have 1/2 em width in our
>>>>> requirements, even if the character name might include "FULLWIDTH ~~~".
>>>>> Anyway, the decision on how to deal with these spaces in the CSS
>>>>> recommendation and in actual implementations is up to your side :-)
>>>>> regards,
>>>>> Tatsuo
>>>>> 2008/10/30 Steve Deach <sdeach@adobe.com>
>>>>>> No, in my personal opinion, it should not.
>>>>>> The 2 differences between normal space/nbsp vs ideographic space are:
>>>>>> 1.) The normal width is different, and
>>>>>> 2.) The normal space/nbsp is treated as justifying
>>>>>>     (adjusted by both wordspacing and letterspacing),
>>>>>>     whereas the Ideographic space should only be adjusted by
>>>>>>     letterspacing (only if ideographic letters are also so adjusted).
>>>>>> However, I will re-confirm this with our CJK experts, before claiming this
>>>>>> is an Adobe opinion.
>>>>>> On 2008.10.29 15:13, "fantasai" <fantasai.lists@inkedblade.net>
>>>>>> wrote:
>>>>>>> Hello,
>>>>>>> The CSSWG would like to know whether the IDEOGRAPHIC SPACE U+3000
>>>>>>> should be affected by 'word-spacing', and whether it should be
>>>>>>> treated as a space during spaces-only justification or treated as
>>>>>>> a typical ideographic punctuation character.
>>>>>>> ~fantasai
>>>>> -- 
>>>>> KOBAYASHI Tatsuo
>>>>> Scholex Co., Ltd. Yokohama
>>>>> JUSTSYSTEM Digital Culture Research Center
>>>> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>>>> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
Received on Saturday, 1 November 2008 15:58:13 UTC
