Re: Unicode character names

Just noticed a typo in my message ("It's primary function " => "Its primary function") -- MD
—————
  ----- Original Message ----- 
  From: Mark Davis 
  To: www-international@w3.org ; Susan Lesch 
  Cc: fyergeau@alis.com 
  Sent: Saturday, August 25, 2001 22:12
  Subject: Re: Unicode character names


  The formal character name is designed to serve as an identification across different character encoding standards, especially within ISO standards. Its primary function is to be a unique identifier across these standards, and within the Unicode Standard. It may or may not be the best description of the function and usage of the character in the Standard. In many cases it is, and in some it is not. [The NamesList alone is insufficient to determine the complete semantics of a character; many characters have complex semantics and usage, and thus have detailed descriptions in a relevant script chapter or elsewhere in the standard.]
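
As a small illustration of the formal name acting as a unique identifier, the sketch below uses Python's standard unicodedata module (not something mentioned in this thread, and tied to whatever Unicode version the interpreter ships with) to map a character to its formal name and back again. The helper name round_trips is made up for this example.

```python
import unicodedata


def round_trips(ch: str) -> bool:
    """Return True if the formal name of ch identifies ch and nothing else."""
    name = unicodedata.name(ch)          # character -> formal name
    return unicodedata.lookup(name) == ch  # formal name -> character


# U+002E and U+00B6 are the two characters discussed in this thread.
for ch in (".", "\u00b6"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} round-trips: {round_trips(ch)}")
```

The round trip only shows that the formal name is a stable key; it says nothing about how the character is used, which is the point made above.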

  The Unicode 1.0 name was the formal character name in the first version of Unicode. As the quote says, it was changed in the merger with ISO 10646, and can now be considered an alias -- but it does not have a privileged status among the aliases. Thus U+002E could be referred to as a period, as a full stop, as a dot, as a decimal point (in some countries), and so on. All are reasonable aliases, and FULL STOP is also the formal name. There may also be aliases that are not yet mentioned in the standard. For example:

  eroteme \Er"o*teme\, n. [Gr. ? question.] A mark indicating a question; a note of interrogation. 

  Source: Webster's Revised Unabridged Dictionary, © 1996, 1998 MICRA, Inc.

  Note: The convention in the Unicode Standard within body text is for aliases to be in italic, and the formal name in small caps.
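
To make the formal-name versus alias distinction concrete, here is a rough sketch that splits the NamesList entry for U+002E (as quoted in the original question in this thread) into its formal name, aliases, notes and cross references. The Entry class and parse_entry function are hypothetical names for this example. It assumes the space-separated layout seen in the email excerpt; the published NamesList file is tab-delimited and has more line types than the three handled here.

```python
from dataclasses import dataclass, field
from typing import List

# The entry exactly as quoted in the email (spaces, not the file's tabs).
ENTRY = """\
002E FULL STOP
= PERIOD
= dot, decimal point
* may be rendered as a raised decimal point in old style numbers
x (arabic full stop - 06D4)
x (ideographic full stop - 3002)
"""


@dataclass
class Entry:
    code_point: int
    formal_name: str
    aliases: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    cross_refs: List[int] = field(default_factory=list)


def parse_entry(text: str) -> Entry:
    """Parse one NamesList-style entry (simplified, see assumptions above)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    code, name = lines[0].split(maxsplit=1)
    entry = Entry(int(code, 16), name)
    for line in lines[1:]:
        marker, _, rest = line.partition(" ")
        rest = rest.strip()
        if marker == "=":    # alias line, possibly comma-separated
            entry.aliases.extend(a.strip() for a in rest.split(","))
        elif marker == "*":  # informative note
            entry.notes.append(rest)
        elif marker == "x":  # cross reference: "(name - HEX)"
            hex_part = rest.strip("()").rsplit("-", 1)[1].strip()
            entry.cross_refs.append(int(hex_part, 16))
    return entry


entry = parse_entry(ENTRY)
print(entry.formal_name, entry.aliases)
```

Note that the parser treats PERIOD, dot and decimal point alike: nothing in the data gives the Unicode 1.0 name a special status beyond its capitalization.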

  I hope this helps,

  Mark
  —————

  Γνῶθι σαυτόν — Θαλῆς
  [http://www.macchiato.com]
  ----- Original Message ----- 
  From: "Susan Lesch" <lesch@w3.org>
  To: <www-international@w3.org>
  Cc: <fyergeau@alis.com>
  Sent: Saturday, August 25, 2001 18:39
  Subject: Unicode character names


  > Hello,
  > 
  > I am preparing a W3C publications guide, and would like to link to your
  > response to the question below. I plan to recommend that editors of W3C
  > specifications refer to characters by their correct Unicode names.
  > 
  > From _The Unicode Standard, Version 3.0_ (sorry that is the latest hard
  > copy available to me at this time), page 101:
  > 
  >    "The Unicode 1.0 character name is an informative property of the
  >    characters defined in Version 1.0 of the Unicode Standard. The
  >    names of Unicode characters were changed in the process of merging
  >    the standard with ISO/IEC 10646. The Version 1.0 character names
  >    can be obtained from the CD-ROM accompanying the standard or from
  >    the ftp site. See also Appendix D, Changes from Unicode Version
  >    2.0. Where the Version 1.0 character name provides additional
  >    useful information, it is listed in Chapter 14, Code Charts. For
  >    example, U+00B6 PILCROW SIGN has its Version 1.0 name, PARAGRAPH
  >    SIGN, listed for clarity."
  > 
  > To select an example with more variables from the ftp site at
  > ftp://ftp.unicode.org/Public/3.1-Update/NamesList-3.1.0.txt
  > 
  > 002E FULL STOP
  > = PERIOD
  > = dot, decimal point
  > * may be rendered as a raised decimal point in old style numbers
  > x (arabic full stop - 06D4)
  > x (ideographic full stop - 3002)
  > 
  > Thanks to the I18n Working Group, I learned that PERIOD is the Unicode
  > 1.0 name, and "dot" and "decimal point" are acceptable aliases.
  > 
  > My question is this. Is PERIOD outdated? Is it correct to refer to this
  > character (.) as "full stop, dot, or decimal point, and NOT period"?
  > (Or is PERIOD capitalized to show it is the best alias?)
  > 
  > Thank you,
  > -- 
  > Susan Lesch - mailto:lesch@w3.org  tel:+1.858.483.4819
  > World Wide Web Consortium (W3C) - http://www.w3.org/
  > 
  > 

Received on Sunday, 26 August 2001 12:08:11 UTC