Eric's ACS characters (Was: Re: Cougar DTD extra character entities)

Well, after some further digging I found the rather obviously named
May 21 1996 file, <ftp://unicode.org/pub/MappingTables/Unicode-2.0.12.txt> 
(also available compressed).

There's also an obsolete Unicode -> SGML mapping table,
<ftp://unicode.org/pub/MappingTables/Obsolete/uni2sgml.txt>
Could anyone comment on the continued applicability of this file? (I
suppose that's a question best addressed to Unicode Inc.)

My SGML character entity list came from
<http://www.infocom.net/%7Ebbs/iso8879.txt> -- is there a canonical
site for this document? Like www.iso.ch, perhaps? :-/


Relevant excerpts from uni2sgml.txt (all these Unicode characters
appear essentially the same in Unicode-2.0.12.txt):

UNIC	6862.2	SGML	Unicode character name

2500		boxh	FORMS LIGHT HORIZONTAL
2502		boxv	FORMS LIGHT VERTICAL
250C		boxdr	FORMS LIGHT DOWN AND RIGHT
2510		boxdl	FORMS LIGHT DOWN AND LEFT
2514		boxur	FORMS LIGHT UP AND RIGHT
2518		boxul	FORMS LIGHT UP AND LEFT
251C		boxvr	FORMS LIGHT VERTICAL AND RIGHT
2524		boxvl	FORMS LIGHT VERTICAL AND LEFT
252C		boxhd	FORMS LIGHT DOWN AND HORIZONTAL
2534		boxhu	FORMS LIGHT UP AND HORIZONTAL
253C		boxvh	FORMS LIGHT VERTICAL AND HORIZONTAL

22C4		diam	DIAMOND OPERATOR
2662		diams	WHITE DIAMOND SUIT
2726		lozf	BLACK FOUR POINTED STAR
2727		loz	WHITE FOUR POINTED STAR

2022		bull	BULLET

2190		larr	LEFT ARROW
2191		uarr	UP ARROW
2192		rarr	RIGHT ARROW
2193		darr	DOWN ARROW

25AA		squf	BLACK SMALL SQUARE
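
To make the mapping concrete, here is a minimal Python sketch (mine, not
part of uni2sgml.txt) that rewrites the entity names excerpted above as
numeric character references. The names ENTITY_TO_CODEPOINT and
expand_entities are hypothetical, and the table covers only the rows
quoted in this message, not the full ISO 8879 sets.

```python
import re

# SGML entity name -> Unicode code point, copied from the excerpt above.
# Only the rows quoted in this message; not the complete entity sets.
ENTITY_TO_CODEPOINT = {
    "boxh": 0x2500, "boxv": 0x2502,
    "boxdr": 0x250C, "boxdl": 0x2510,
    "boxur": 0x2514, "boxul": 0x2518,
    "boxvr": 0x251C, "boxvl": 0x2524,
    "boxhd": 0x252C, "boxhu": 0x2534, "boxvh": 0x253C,
    "diam": 0x22C4, "diams": 0x2662,
    "lozf": 0x2726, "loz": 0x2727,
    "bull": 0x2022,
    "larr": 0x2190, "uarr": 0x2191, "rarr": 0x2192, "darr": 0x2193,
    "squf": 0x25AA,
}

# Matches a general entity reference such as "&bull;".
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_entities(text):
    """Replace known entity references with decimal numeric references.

    References whose names are not in the table are left untouched.
    """
    def replace(match):
        codepoint = ENTITY_TO_CODEPOINT.get(match.group(1))
        if codepoint is None:
            return match.group(0)
        return "&#%d;" % codepoint

    return _ENTITY_REF.sub(replace, text)


if __name__ == "__main__":
    print(expand_entities("&bull; first item &rarr; next"))
```

Unknown names pass through unchanged, so the sketch is safe to run over
text that mixes these entities with ones it does not know about.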


On 11 Jul 1996, Eric S. Raymond <esr@snark.thyrsus.com> wrote:
> Are these existing entities or proposals?

They're existing character entities in SGML. With the exception of
&bull; and the arrows, I don't think anyone's proposed their use in
HTML (except as part of a proposal to adopt all SGML character
entities).

The future use of Unicode as the document character set for HTML may
make the choice between an SGML character entity and a numeric character
reference moot. However, without a list of such entities a browser that
isn't using Unicode won't know which numeric references beyond &#255;
it should try to render.
Actually, that's not quite true. Rather, the _programmers_ of certain
browsers won't feel in the least inclined to search through Unicode
for characters they can attempt in the absence of full Unicode
interpretation and font support, e.g. all the 'extra' characters in
Microsoft Codepage 1252 (a.k.a. Windows Latin 1) that various
blackguards advocate in certain newsgroups.
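
As a rough illustration of how short that search would be for one
platform, here is a Python sketch that lists the Unicode code points
beyond 255 reachable through Codepage 1252. It relies on Python's
built-in "cp1252" codec, which reflects a later revision of the code
page than the one current as I write, so the exact set may differ; the
function name cp1252_extras is hypothetical.

```python
def cp1252_extras():
    """Map Unicode code points to the CP1252 bytes in 0x80-0x9F that encode them.

    Bytes in that range that the codec leaves unassigned are skipped.
    """
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            char = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # unassigned position in Python's cp1252 table
        extras[ord(char)] = byte
    return extras


if __name__ == "__main__":
    for codepoint, byte in sorted(cp1252_extras().items()):
        # A browser limited to this code page could attempt &#codepoint;
        # by emitting the corresponding byte.
        print("&#%d; (U+%04X) -> byte 0x%02X" % (codepoint, codepoint, byte))
```

A browser limited to that code page could consult such a table when it
meets a numeric reference beyond &#255; and fall back to something else
for anything not listed.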

>>> diamond                 ACS_DIAMOND
>
> 25CA should do fine.
[...]
> I guess loz corresponds to 25CA?

Apparently not: uni2sgml.txt maps loz to 2727 (see above).

>>> checker board (stipple) ACS_CKBOARD

Is the stippledness intrinsic? There are some shaded boxes:

2591		blk14	LIGHT SHADE
2592		blk12	MEDIUM SHADE
2593		blk34	DARK SHADE

Cheers,

I.
--
Avocation: advocating advocaat to avocets.

Received on Thursday, 11 July 1996 16:06:04 UTC