Re: Ideographic Space, word-spacing, and justification

From: Steve Deach <sdeach@adobe.com>
Date: Sat, 01 Nov 2008 08:32:22 -0700
To: Felix Sasaki <fsasaki@w3.org>
CC: Martin Duerst <duerst@it.aoyama.ac.jp>, "KOBAYASHI Tatsuo(FAMILY Given)" <tlk@kobysh.com>, fantasai <fantasai.lists@inkedblade.net>, WWW International <www-international@w3.org>, Paul Nelson <paulnel@winse.microsoft.com>, Michel Suignard <michel@unicode.org>
Message-ID: <C531C496.3EF4%sdeach@adobe.com>

Yes, I will re-review the document.

I believe Steve Zilles has been representing Adobe on the JLTF, and I will
discuss my comments with him.

One of the concerns I raised in the thread has been the "misuse" or
"mal-appropriation" of existing/traditional terminology (often
unintentional, and usually based on inadequate exposure to prior art). This
happened when traditional publishing was moved to DTP and is happening again
as DTP-level styling and layout moves to the Web.
  The situation gets worse when dealing with translations of technical terms
across languages: nuance is often accidentally lost, or a generic term gets
substituted for a specialized one (or the reverse, a specialized term gets
used as generic).
  The situation is also aggravated when a technology is viewed as
proprietary (often treated as trade secrets, as was the situation in the
traditional publishing industry in the U.S. in the 1970s-1980s), because
the only way to learn the terminology is to work for several companies in
the field so you can sort out what is common from what is "extended" by a
given vendor.

On 2008.11.01 01:09, "Felix Sasaki" <fsasaki@w3.org> wrote:

> Steve, all,
> Tatsuo already pointed to the JLTF document in this thread, but I do it
> again, with a specific aspect in mind: terminology. See
> http://www.w3.org/TR/2008/WD-jlreq-20081015/#terminology-en
> http://www.w3.org/TR/2008/WD-jlreq-20081015/ja/#terminology-ja
> It would be great to get your feedback on this terminology (until Nov.
> 15) which is also used throughout the document(s). As Tatsuo already
> noted, the document describes an ideal of how Japanese layout should
> look, and not historic variants. It also aims to be technology
> independent, so do not expect specific discussions on how something
> should be implemented, e.g. "styling properties versus use of space
> characters". We leave it to you guys to argue about that ... ;)
> Felix
> Steve Deach wrote:
>> Every few years this issue comes back up. Unfortunately, I can't find the
>> rather long treatise I wrote the last time.
>> In general, I agree with Martin, that one should use styling properties as a
>> replacement for most of the "layout" uses of space characters (just as one
>> should use tables in place of most uses of tabs). That said, I would like to
>> briefly summarize the traditional (pre-DTP) handling of spaces and spacing,
>> and comment on "what I believe" to be the correct handling.
>> Second, I agree that the handling of letterspacing and wordspacing varies by
>> script and in some cases usage within a script, due to historic/cultural
>> differences in preferences/aesthetics, or specific readability requirements
>> for the usage, and the aesthetic desires of the designer.
>> This is a partial reconstruction of my prior emails on this topic.
>> My terminology:
>>   "Spacing": an adjustment to the distance between 2 glyphs/characters.
>>   "Space": a character which has a width but no visible inked representation.
>>   "Letterspacing": an adjustment to the intercharacter spacing used for
>>      line justification. [This definition differs from CSS's.]
>>   "Wordspacing": an adjustment to the width of an interword space, also
>>      used for line justification.
>>   "WhiteSpaceAddition/Reduction (WSA/WSR)": a uniform adjustment to
>>      intercharacter spacing that is applied for design purposes or
>>      emphasis. [This corresponds most closely to the CSS-2.0 definition
>>      of letterspacing. Most DTP applications call this "Tracking".]
>>   "Tracking": an adjustment to intercharacter spacing which varies by
>>      fontsize/pointsize that is used to increase readability when
>>      optical sizing is not provided by the font. [This traditional
>>      definition differs from that used in most DTP applications.]
>> In setting Roman text:
>>   Letterspacing is not generally applied to Arabic (and other
>> connected-letter scripts/languages), nor to connected-letter ("script") faces
>> in Roman-derivative scripts.
>>   Letterspacing is not generally applied to ideographic or similar
>> monospaced scripts, nor to monospaced text in Roman-derivative environments.
>>   Traditional applications varied widely in the algorithms used for
>> weighting how much of a justification adjustment was applied to wordspacing
>> vs to letterspacing. Most modern systems treat them as linear-proportional.
>>   Traditional publishing applications were also at odds over whether the
>> letterspacing adjustment AND the wordspacing adjustment should both be
>> applied to the space/NbSp characters, but most modern systems apply both.
>>   The Unicode NbSp (U+00A0) character should be treated the same as the
>> Unicode Space (U+0020). [In traditional publishing systems, these are
>> variable width in justified lines and fixed width in "aligned", tabular, and
>> math uses. However, some traditional publishing systems treat all space
>> characters prior to the first non-space in a line as fixed width.]
>>   The FigureSpace (U+2007) and PunctuationSpace (U+2008) are treated the
>> same way the corresponding figure '0' and punctuation period/full stop would
>> be treated in the current layout context (justified vs
>> aligned/tabular/math).
>>   Some traditional publishing systems had a quad-space and a
>> justifying-space (sometimes called a 'spaceband' rather than 'justifying
>> space'). Use of the quad-space within justified text would force the fixed
>> nominal-width of the normal interword space character, disabling
>> justification adjustments. This encoding concept has no analogy in Unicode.
>>   All other space characters {EM-space, EN, EM-quad, EN-quad, 3/EM, 4/EM,
>> 6/EM, Thin, & Hair} are treated as fixed width and are not adjusted for
>> letterspacing nor for wordspacing. (Traditional publishing systems used
>> these for alignment/layout and did not generally apply tracking nor WSA/WSR
>> either.)
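
A minimal Python sketch of the space classification described above for justified Roman text. The names `SpaceKind` and `classify_space` are hypothetical, and the mapping of the named fixed-width spaces to U+2000..U+2006, U+2009 and U+200A is an assumption based on the Unicode code charts rather than anything stated in the thread.

```python
from enum import Enum


class SpaceKind(Enum):
    JUSTIFYING = "justifying"  # variable width in justified lines
    LIKE_GLYPH = "like-glyph"  # treated like the digit or period it stands in for
    FIXED = "fixed"            # not adjusted by letterspacing or wordspacing


# Assumed mapping from the prose rules to Unicode code points.
SPACE_KINDS = {
    "\u0020": SpaceKind.JUSTIFYING,  # SPACE
    "\u00A0": SpaceKind.JUSTIFYING,  # NO-BREAK SPACE, same as SPACE
    "\u2007": SpaceKind.LIKE_GLYPH,  # FIGURE SPACE, handled like '0'
    "\u2008": SpaceKind.LIKE_GLYPH,  # PUNCTUATION SPACE, handled like '.'
    "\u2009": SpaceKind.FIXED,       # THIN SPACE
    "\u200A": SpaceKind.FIXED,       # HAIR SPACE
}
# EN QUAD, EM QUAD, EN SPACE, EM SPACE, 3/EM, 4/EM, 6/EM (U+2000..U+2006)
SPACE_KINDS.update({chr(cp): SpaceKind.FIXED for cp in range(0x2000, 0x2007)})


def classify_space(ch):
    """Return the SpaceKind for a space character, or None if not covered."""
    return SPACE_KINDS.get(ch)
```

A justifier could call `classify_space` on each character and skip anything reported as `FIXED` when handing out justification adjustments.
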
>> Ideographic languages/scripts do not generally use wordspacing or
>> letterspacing to adjust justification; instead they typically use rules akin
>> to those described in JIS-4051 (latest). This algorithm involves trimming
>> some characters to half-width, then reinserting 1/2 & 1/4-em spacing
>> adjustments at selected points within the line.
>>   Under these rules, Ideographic-space is treated as an ideographic letter
>> [generally fixed-fullwidth, but has some specific additional rules], and not
>> as a roman variable space.
>>   It should be a styling option of whether Roman text embedded in
>> Ideographic text is set using Roman algorithms or Japanese/Chinese
>> algorithms. Depending on the publication and the publisher, Roman text may
>> be set proportional (using Roman or Asian justification rules), halfwidth,
>> or fullwidth. (Similarly, they may choose Asian or Roman word-breaking and
>> hyphenation rules.)
>> I have not covered any specifics in the handling of ancient languages that
>> are generally only of academic interest; nor the handling of Arabic and
>> Arabic-derivative scripts; nor Indic; nor certain other language-specific
>> differences (such as adjustments to spaces on sentence boundaries in some
>> uses, or after certain punctuation characters in French and other
>> languages).
>> I have also not addressed the handling of "hanging punctuation" and "hanging
>> spaces"; though there are different philosophies/algorithms for handling
>> these across the various script families.
>> -- S.Deach
>>    sdeach@adobe.com
>> On 2008.10.31 02:43, "Martin Duerst" <duerst@it.aoyama.ac.jp> wrote:
>>> Hello everybody,
>>> Just a bit of a wider background on full-width space.
>>> It should be remembered that in contrast to the usual space (U+0020),
>>> which occurs all over the place in texts in most languages, the
>>> full-width space doesn't occur AT ALL in typical Japanese (or Chinese)
>>> texts. That's why it also barely occurs in the document written
>>> by the Japanese Layout TF, as well as in JIS 4051.
>>> The full-width space is more used for layout than inside the actual
>>> text. In this respect, what CSS should do is to mainly look at
>>> Japanese typography and try to come up with properties that allow
>>> to get rid of full-width spaces in the text, rather than spending
>>> too much time on how to treat full-width space.
>>> As a typical example, I guess lead typesetting and also definitely
>>> simple approaches to typesetting on the computer, such as plain
>>> text or old "word-processors" (which were not very much above
>>> plain text in their capabilities) use a full-width space to produce
>>> a start-of-paragraph indent (which is very often one full-width
>>> character wide). CSS should make sure that there is no need to
>>> insert such full-width spaces, because an exact one-full-width-
>>> character start-of-paragraph indent can be produced with an
>>> appropriate CSS property setting.
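
A small Python sketch of the clean-up this implies: remove a leading IDEOGRAPHIC SPACE (U+3000) that was typed as a paragraph indent, so that the indent can come from styling instead. The function name `strip_indent_space` is hypothetical, and pairing it with a stylesheet rule such as `text-indent: 1em` is an assumption about how the indent would then be produced.

```python
IDEOGRAPHIC_SPACE = "\u3000"


def strip_indent_space(paragraph):
    """Return (text, had_indent) with one leading U+3000 removed, if present.

    Only a single leading full-width space is treated as an indent; any
    further spaces are left alone because they may be intentional layout.
    """
    if paragraph.startswith(IDEOGRAPHIC_SPACE):
        return paragraph[1:], True
    return paragraph, False
```

The returned flag lets a converter decide per paragraph whether the indent rule should apply, rather than indenting every paragraph unconditionally.
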
>>> Another typical use of full-width space was to center text,
>>> and to insert spaces into text for headlines (to a large
>>> extent a crude backup for increasing text size, which wasn't
>>> possible when technology was limited to one or two bit-mapped
>>> font sizes). In this case, inter-character spacing property(/ies)
>>> may be important for 'facsimile' layouts, but with modern
>>> technology, such layout isn't much used anymore anyway.
>>> Regards,   Martin.
>>> At 18:31 08/10/30, KOBAYASHI Tatsuo(FAMILY Given) wrote:
>>>> Hi, Erica,
>>>> In Japanese Layout, "spacing issue" is one of the most difficult issues to
>>>> treat.
>>>> We intended to carefully eliminate concrete character names like IDEOGRAPHIC
>>>> SPACE (U+3000) and SPACE (U+0020) from our requirement. Rather, we introduced
>>>> three different types of abstract space concepts as follows:
>>>> inter character space: usual 1/2 em fixed space.
>>>> conditional space: 1/2 em fixed space to be inserted or pulled off between
>>>> characters and punctuation marks.
>>>> adjustable space: variable width space, behaves like usual western variable
>>>> space.
>>>> Note that usual Japanese punctuation marks have 1/2 em width in our
>>>> requirement, even if the character name might include "FULLWIDTH ~~~".
>>>> Anyway, the decision on how to deal with these spaces in the CSS
>>>> recommendation and in actual implementation is up to your side:-)
>>>> regards,
>>>> Tatsuo
>>>> 2008/10/30 Steve Deach <sdeach@adobe.com>
>>>>> No, in my personal opinion, it should not.
>>>>> The 2 differences between normal space/nbsp vs ideographic space are:
>>>>> 1.) The normal width is different, and
>>>>> 2.) The normal space/nbsp is treated as justifying
>>>>>     (adjusted by both wordspacing and letterspacing),
>>>>>     whereas the Ideographic space should only be adjusted by
>>>>>     letterspacing (only if ideographic letters are also so adjusted).
>>>>> However, I will re-confirm this with our CJK experts, before claiming this
>>>>> is an Adobe opinion.
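
A rough Python sketch of the distinction drawn above: SPACE and NBSP receive both the wordspacing and the letterspacing share of the justification slack, while IDEOGRAPHIC SPACE receives only letterspacing, and only when ideographic letters are adjusted too. The function names, the `word_share=0.8` split, and the code-point ranges used to detect ideographic characters are all illustrative assumptions, not values from any specification.

```python
IDEOGRAPHIC_SPACE = "\u3000"
JUSTIFYING_SPACES = {"\u0020", "\u00A0"}


def is_ideographic(ch):
    # Rough stand-in: kana, CJK unified ideographs, and U+3000 itself.
    # A real implementation would consult Unicode script data instead.
    cp = ord(ch)
    return (ch == IDEOGRAPHIC_SPACE
            or 0x3040 <= cp <= 0x30FF
            or 0x4E00 <= cp <= 0x9FFF)


def distribute_slack(text, slack, word_share=0.8, adjust_ideographs=False):
    """Return the extra advance added after each character of `text`."""
    # Wordspacing goes only to SPACE/NBSP, never to U+3000.
    word_idx = [i for i, ch in enumerate(text) if ch in JUSTIFYING_SPACES]
    # Letterspacing goes after every character except the last one;
    # ideographic characters take part only if adjust_ideographs is set.
    letter_idx = [
        i for i, ch in enumerate(text[:-1])
        if adjust_ideographs or not is_ideographic(ch)
    ]

    # Linear-proportional split; fall back to whichever kind is available.
    if word_idx and letter_idx:
        word_total = slack * word_share
    elif word_idx:
        word_total = slack
    else:
        word_total = 0.0
    letter_total = slack - word_total if letter_idx else 0.0

    extra = [0.0] * len(text)
    for i in word_idx:
        extra[i] += word_total / len(word_idx)
    for i in letter_idx:
        extra[i] += letter_total / len(letter_idx)
    return extra
```

With `adjust_ideographs=False`, a line containing only ideographs and U+3000 receives no adjustment at all from this function, which matches the view that such lines are justified by other rules.
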
>>>>> On 2008.10.29 15:13, "fantasai" <fantasai.lists@inkedblade.net> wrote:
>>>>>> Hello,
>>>>>> The CSSWG would like to know whether the IDEOGRAPHIC SPACE U+3000
>>>>>> should be affected by 'word-spacing', and whether it should be
>>>>>> treated as a space during spaces-only justification or treated as
>>>>>> a typical ideographic punctuation character.
>>>>>> ~fantasai
>>>> -- 
>>>> KOBAYASHI Tatsuo
>>>> Scholex Co., Ltd. Yokohama
>>>> JUSTSYSTEM Digital Culture Research Center
>>> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>>> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
Received on Saturday, 1 November 2008 15:33:45 UTC