Re: ERB discussions and decisions
David G. Durand wrote:
> We are talking about how a system should represent a character that is not
> in Unicode for transmission on the wire. Mathematical symbols are probably
> the best non-scholarly example raised so far in the discussion.
Another good example might be place and personal names of Japanese and
these often cannot be written using characters in computer character sets
and must be spelled out in some alternative form. Imagine not even being
able to give your name using a computer!
For CJK characters it has been proposed (Prof. Eiji Matsuoka) that
the standard national character book (e.g. in Japan the Daiwa Kanten)
be used: these are collections with all known characters (or, at least,
glyphs) and national variants.
So the character is identified by a string, either in the comment string or
in the SDATA entity text value, giving the source book and an index into
it, e.g. "Daiwa Kanten, character 40000", or "Daiwa Kanten, 3rd ed.,
p24, char 2"
I guess. Other suggested methods are the telephone method of saying a
well-known character that the character looks like, then giving the
differences, or building the character up from radicals. (And there is also the
idea of actually sending a rough sample of the glyph.) These let the person
at the receiving end reconstruct the character. (As I have said before, I
think a preferable method is that whoever makes a document undertakes to
put some kind of usable glyph on the web too, when some kind of WWW font
service mechanism is established.)
I am not objecting to nickname identifiers, especially for
specialists sending material to each other. That is fine for documents
with occasional strange characters, and for logos and symbols.
But for document series that have tens of thousands of non-Unicode characters
(e.g. the EBTI projects going on in CJK and Thailand) either the entity names
or the SDATA text values have to have something that can be directly used
to index large character tables: numbers are directly useful for this.
Mapping the ISO entity sets to system values is tedious enough, let
alone document-specific character entity sets.
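
To illustrate why a number in the identifier helps, here is a minimal
sketch in Python, not anything proposed in this thread. It assumes a
hypothetical text-value convention of the form "<source book>, character
<number>", modelled on the "Daiwa Kanten, character 40000" example above.
The names parse_char_ref, lookup_glyph and GLYPH_TABLE, and the table
contents, are invented for illustration.

```python
import re
from typing import Optional, Tuple

# Assumed convention (hypothetical): "<source book>, character <number>".
# Other forms, such as "Daiwa Kanten, 3rd ed., p24, char 2", are not handled.
_CHAR_REF = re.compile(r"^\s*(?P<source>.+?),\s*character\s+(?P<number>\d+)\s*$")


def parse_char_ref(text: str) -> Optional[Tuple[str, int]]:
    """Split a text value into (source book, character number), or None."""
    match = _CHAR_REF.match(text)
    if match is None:
        # A pure nickname carries no number that could be used as an index.
        return None
    return match.group("source"), int(match.group("number"))


# Placeholder character table keyed by (source book, number).
# A real table would hold tens of thousands of entries.
GLYPH_TABLE = {
    ("Daiwa Kanten", 40000): "glyph-40000.png",
}


def lookup_glyph(text: str) -> Optional[str]:
    """Return the table entry for a numbered reference, if there is one."""
    key = parse_char_ref(text)
    if key is None:
        return None
    return GLYPH_TABLE.get(key)
```

With a numbered reference the receiving system can go straight from the
text value to a table entry; with a nickname it needs a hand-built
mapping for every name.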
Anyway, I think you ask whether there are ever cases where (unnumbered)
nicknames are not preferable to identifiers with numbers. I think there are
many such cases, certainly in documents from outside the Western hemisphere.
Rick Jelliffe email: email@example.com
Allette Systems (Australia) email: firstname.lastname@example.org
Level 10, 91 York Street www: http://www.allette.com.au
Sydney 2000 NSW Australia phone: +61 2 262 4777
fax: +61 2 262 4774