
Re: Unicode character names

From: Martin Duerst <duerst@w3.org>
Date: Mon, 27 Aug 2001 10:04:45 +0900
Message-Id: <4.2.0.58.J.20010827095223.039dd8f0@133.27.195.38>
To: Susan Lesch <lesch@w3.org>, www-international@w3.org
Cc: fyergeau@alis.com

Hello Susan,

PERIOD is capitalized because both the ISO 10646/Unicode
character names and the Unicode 1.0 names are usually capitalized.

Period is no better or worse than any of the other listed aliases
(dot, decimal point); it's just not unnecessarily listed twice.

PERIOD is outdated as the official character name; it was the
official character name in Unicode 1.0, but since Unicode 1.1,
and in ISO 10646, it has always been FULL STOP. Period is not outdated
as a plain name/alias for that character.

As for the guidelines: I would give the following points, in order
of importance:

- When specifying characters, specifications must do so by referring
   to Unicode/ISO 10646.
- For such references, the codepoint value (usually in the form U+hhhh),
   where applicable the official name, and where helpful, one or more
   aliases should be used.
   [I don't want to say that names always have to be used, because they
    are a waste of space for CJK ideographs and don't make sense
    for long lists. Also, the choice of aliases should be left to the
    editor: sometimes aliases are close to indispensable (solidus -> slash),
    sometimes they are closer to curiosities (# -> octothorpe).
    The XML spec doesn't use U+hhhh, so this form should not be
    mandatory either.]
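
As a rough illustration of the second point, the form "U+hhhh OFFICIAL NAME
(aliases)" could be generated along the following lines. This is only a
sketch: the function name describe() and its aliases parameter are made up
for this example, the official name comes from Python's standard unicodedata
module, and the aliases are whatever the editor chooses to supply (the
module does not provide informal aliases such as "period").

```python
import unicodedata


def describe(char, aliases=()):
    """Format a reference as 'U+hhhh OFFICIAL NAME (alias, ...)'.

    'describe' and 'aliases' are hypothetical names for this sketch;
    the aliases are chosen by the editor, not looked up anywhere.
    """
    # Code point in the usual U+hhhh form (at least four hex digits).
    parts = [f"U+{ord(char):04X}"]

    # Current official name; the empty default covers code points that
    # have no name in the database (for example, control characters).
    name = unicodedata.name(char, "")
    if name:
        parts.append(name)

    # Aliases are optional and appended only where the editor finds them helpful.
    if aliases:
        parts.append("(" + ", ".join(aliases) + ")")

    return " ".join(parts)


print(describe(".", aliases=("period", "dot")))  # U+002E FULL STOP (period, dot)
print(describe("/", aliases=("slash",)))         # U+002F SOLIDUS (slash)
```

Leaving out the aliases argument gives just the code point and the official
name, which is probably what one would want for long lists.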

Regards,   Martin.

At 18:39 01/08/25 -0700, Susan Lesch wrote:
>Hello,
>
>I am preparing a W3C publications guide, and would like to link to your
>response to the question below. I plan to recommend that editors of W3C
>specifications refer to characters by their correct Unicode names.
>
> From _The Unicode Standard, Version 3.0_ (sorry that is the latest hard
>copy available to me at this time), page 101:
>
>   "The Unicode 1.0 character name is an informative property of the
>   characters defined in Version 1.0 of the Unicode Standard. The
>   names of Unicode characters were changed in the process of merging
>   the standard with ISO/IEC 10646. The Version 1.0 character names
>   can be obtained from the CD-ROM accompanying the standard or from
>   the ftp site. See also Appendix D, Changes from Unicode Version
>   2.0. Where the Version 1.0 character name provides additional
>   useful information, it is listed in Chapter 14, Code Charts. For
>   example, U+00B6 PILCROW SIGN has its Version 1.0 name, PARAGRAPH
>   SIGN, listed for clarity."
>
>To select an example with more variables from the ftp site at
>ftp://ftp.unicode.org/Public/3.1-Update/NamesList-3.1.0.txt
>
>002E    FULL STOP
>         = PERIOD
>         = dot, decimal point
>         * may be rendered as a raised decimal point in old style numbers
>         x (arabic full stop - 06D4)
>         x (ideographic full stop - 3002)
>
>Thanks to the I18n Working Group, I learned that PERIOD is the Unicode
>1.0 name, and "dot" and "decimal point" are acceptable aliases.
>
>My question is this. Is PERIOD outdated? Is it correct to refer to this
>character (.) as "full stop, dot, or decimal point, and NOT period"?
>(Or is PERIOD capitalized to show it is the best alias?)
>
>Thank you,
>--
>Susan Lesch - mailto:lesch@w3.org  tel:+1.858.483.4819
>World Wide Web Consortium (W3C) - http://www.w3.org/
>
Received on Sunday, 26 August 2001 22:12:56 GMT
