A million Unicode standards?

There sure are a lot of "Unicode"s out there. Almost makes the name meaningless!
Reminds me of the quote: "The nice thing about standards is that there are
so many to choose from." Anyone wish to comment? I just came across the
following messages on AOL's MUT forum:


Subj:  requirements/performance
Date:  96-01-25 23:53:55 EST
From:  AFA Susan

AFL GeneS, or anyone,

Re: "ATM 3.9 simply adds Kanji support and is otherwise unchanged from ATM
3.8.3 (but uses more RAM as a result of the added resources)."

Do you happen to know how much more RAM and storage a 2-byte character set
requires than 1-byte? How much more it takes to display? How about a 4-byte
set? I sent a similar question to two people at Claris recently on a
recommendation from MDV and received no answer. What I am interested in
knowing is the requirements for implementing Unicode 2.0, which I have been
told uses a variable-length encoding scheme, as well as those of an old
scheme I once studied for a true 4-byte character repertoire, one I can't
seem to forget as a possibility for the future. Would TrueType or some
other outline method
have any less memory or storage requirements or performance hit than ATM
and PostScript? Thanks.
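[Ed.: some rough numbers for the storage half of the question. This is a
sketch in modern terms; the encodings available in 1996 differed, but the
arithmetic is the same: fixed 1-byte sets cost 1 byte per character,
2-byte sets cost 2, 4-byte sets cost 4, and variable-length schemes fall
in between depending on the text.]

```python
# Encoded byte counts for the same short strings under common Unicode
# encodings, using Python's built-in codecs (illustrative only).
samples = {
    "Latin text": "Hello, world",   # 12 characters
    "Japanese":   "こんにちは世界",  # 7 characters
}
for name, text in samples.items():
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(f"{name:10s} {enc:10s} {len(text.encode(enc)):3d} bytes")
```

Latin text stays at 1 byte per character in UTF-8 but doubles in a 2-byte
encoding; CJK text costs 2 bytes per character in UTF-16 and 4 in UTF-32.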

---

Subj:  Re:requirements/performance
Date:  96-01-26 05:06:15 EST
From:  AFL GeneS

>  Would TrueType or some other outline method have any less memory or
> storage requirements or performance hit than ATM and PostScript?

The only other outline method available to you is TrueType, which is a
system-level feature. ATM has no performance hit, however.

The added RAM bite of ATM 3.9 over 3.8.3 is probably a couple of hundred K
or thereabouts.

---

Subj:  Re:requirements/performance
Date:  96-01-26 10:58:09 EST
From:  AFC Sushi

Susan,

I'm missing something here...

To reveal my ignorance, what is the "4-byte character repertoire"? Which
language/script uses it?

---

Subj:  Re:requirements/performance
Date:  96-01-26 18:54:05 EST
From:  AFA Susan

AFC Sushi,

Thank you for asking and I hope this explains. I already revealed my
ignorance! :-) The scheme I refer to is only one of perhaps many launched
during the 1980s. I am not at liberty to name the developer, but it is
certainly fine to explain here that even that long ago, forward-looking
people hoped to avoid the kind of problem that today forces Apple and the
Unicode Consortium to revise the 1.0 standard, which allows for only 65,536
code positions or characters. I don't know of any defined 4-byte set of
characters in use, only a straightforward way to allow for 2^32 = 4
billion+ (4,294,967,296) possible glyphs.
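[Ed.: the code-space arithmetic behind these figures can be checked
directly; 2 bytes give 2^16 positions, 4 bytes give 2^32, and UTF-16's
surrogate mechanism adds a 20-bit supplementary space.]

```python
# Code-space sizes for the schemes under discussion.
two_byte = 2 ** 16         # Unicode 1.x: 65,536 code positions
four_byte = 2 ** 32        # a true 4-byte repertoire: 4,294,967,296
surrogate_space = 2 ** 20  # UTF-16 supplementary positions: 1,048,576

print(two_byte, surrogate_space, four_byte)
# → 65536 1048576 4294967296
```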

As I read it, the UTF-16 proposal possibly about to become Unicode 2.0
allows for about 1 million (1,048,576) additional code positions beyond the
original 65,536. There are still 60 known unsupported scripts listed in the
working documents. You very likely know better than I (who only recently
learned that Japanese Kanji and Chinese can share characters) what the
possibilities may be, including non-writing characters, like the seemingly
endless curious and often useful characters in the old CompuGraphic dingbat
fonts.
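[Ed.: the 1,048,576 figure comes from UTF-16's surrogate mechanism, which
splits each code point above U+FFFF into a pair of 16-bit units. A minimal
sketch of that split, using a character that was only assigned to Unicode
after this thread was written:]

```python
# Encode a supplementary code point (U+10000..U+10FFFF) as a UTF-16
# high/low surrogate pair: 10 bits each, over a 20-bit offset.
def to_surrogates(cp):
    assert 0x10000 <= cp <= 0x10FFFF
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)    # high surrogate: top 10 bits
    low = 0xDC00 + (v & 0x3FF)   # low surrogate: bottom 10 bits
    return high, low

# Example: U+1D11E (musical symbol G clef, assigned later than 1996)
print([hex(u) for u in to_surrogates(0x1D11E)])
# → ['0xd834', '0xdd1e']
```

Two bytes per surrogate means every supplementary character costs 4 bytes,
the same as a true 4-byte repertoire would, but only for the rare ones.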

I would be in your debt for your opinions and guesses here, and anyone
else's. What RAM, storage, and performance hit would a 4-byte repertoire
demand, and can you think of a use for all those characters now? I ask
everybody! I even asked Eric Schmidt, Sun's chief technology officer, in a
Keyword: BW Business Week conference last month. But I frankly still have
no idea whether it is reasonable to continue along the lines of the
thinking from the 80s. :-)

---


__________________________________________________________________________
    Walter Ian Kaye <boo@best.com>     Programmer - Excel, AppleScript,
          Mountain View, CA                         ProTERM, FoxPro, HTML
 http://www.natural-innovations.com/     Musician - Guitarist, Songwriter

Received on Saturday, 27 January 1996 05:39:18 UTC