- From: Ian Young <imy@wcl-rs.bham.ac.uk>
- Date: Thu, 11 Jul 1996 21:04:56 +0100
- To: www-html@w3.org
Well, after some further digging I found the rather obviously named
May 21 1996 file, <ftp://unicode.org/pub/MappingTables/Unicode-2.0.12.txt>
(also available compressed).

There's also an obsolete Unicode -> SGML mapping table,
<ftp://unicode.org/pub/MappingTables/Obsolete/uni2sgml.txt>
Could anyone comment on the continued applicability of this file?
(I suppose that's a question best addressed to Unicode Inc.)

My SGML character entity list came from
<http://www.infocom.net/%7Ebbs/iso8879.txt> -- is there a canonical
site for this document? Like www.iso.ch, perhaps? :-/

Relevant excerpts from uni2sgml.txt (all these Unicode characters
appear essentially the same in Unicode-2.0.12.txt):

UNIC 6862.2 SGML Unicode character name
2500  boxh   FORMS LIGHT HORIZONTAL
2502  boxv   FORMS LIGHT VERTICAL
250C  boxdr  FORMS LIGHT DOWN AND RIGHT
2510  boxdl  FORMS LIGHT DOWN AND LEFT
2514  boxur  FORMS LIGHT UP AND RIGHT
2518  boxul  FORMS LIGHT UP AND LEFT
251C  boxvr  FORMS LIGHT VERTICAL AND RIGHT
2524  boxvl  FORMS LIGHT VERTICAL AND LEFT
252C  boxhd  FORMS LIGHT DOWN AND HORIZONTAL
2534  boxhu  FORMS LIGHT UP AND HORIZONTAL
253C  boxvh  FORMS LIGHT VERTICAL AND HORIZONTAL
22C4  diam   DIAMOND OPERATOR
2662  diams  WHITE DIAMOND SUIT
2726  lozf   BLACK FOUR POINTED STAR
2727  loz    WHITE FOUR POINTED STAR
2022  bull   BULLET
2190  larr   LEFT ARROW
2191  uarr   UP ARROW
2192  rarr   RIGHT ARROW
2193  darr   DOWN ARROW
25AA  squf   BLACK SMALL SQUARE

On 11 Jul 1996, Eric S. Raymond <esr@snark.thyrsus.com> wrote:
> Are these existing entities or proposals?

They're existing character entities in SGML. With the exception of
&bull; and the arrows, I don't think anyone's proposed their use in
HTML (except as part of a proposal to adopt all SGML character
entities).

The future use of Unicode as the document character set for HTML may
make the choice between an SGML character entity and a numeric
character reference moot.
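
As a rough illustration of that choice (a sketch only, not part of either mapping file), the excerpt above can be treated as a lookup table from SGML entity names to Unicode code points, and entity references in a document rewritten as decimal numeric character references. The names SGML_TO_UNICODE, numeric_reference and replace_entities are made up for this example; only the entity names and code points come from the excerpt.

```python
import re

# Entity names and code points copied from the uni2sgml.txt excerpt above.
SGML_TO_UNICODE = {
    "boxh": 0x2500, "boxv": 0x2502,
    "boxdr": 0x250C, "boxdl": 0x2510,
    "boxur": 0x2514, "boxul": 0x2518,
    "boxvr": 0x251C, "boxvl": 0x2524,
    "boxhd": 0x252C, "boxhu": 0x2534,
    "boxvh": 0x253C,
    "diam": 0x22C4, "diams": 0x2662,
    "lozf": 0x2726, "loz": 0x2727,
    "bull": 0x2022,
    "larr": 0x2190, "uarr": 0x2191,
    "rarr": 0x2192, "darr": 0x2193,
    "squf": 0x25AA,
}

# Matches a named entity reference such as "&rarr;" (hypothetical helper pattern).
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def numeric_reference(name):
    """Return the decimal numeric character reference for a listed entity name."""
    return "&#{};".format(SGML_TO_UNICODE[name])


def replace_entities(text):
    """Rewrite listed entity references as numeric ones; leave all others alone."""
    def swap(match):
        name = match.group(1)
        if name in SGML_TO_UNICODE:
            return numeric_reference(name)
        return match.group(0)
    return _ENTITY.sub(swap, text)
```

For example, replace_entities("a &rarr; b") rewrites the arrow as a numeric reference, while an entity name that is not in the table passes through unchanged.
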
However, without a list of such entities a browser that isn't using
Unicode won't know which numeric references beyond &#255; it should
try to render.

Actually, that's not quite true. Rather, the _programmers_ of certain
browsers won't feel in the least inclined to search through Unicode
for characters they can attempt in the absence of full Unicode
interpretation and font support, e.g. all the 'extra' characters in
Microsoft Codepage 1252 (a.k.a. Windows Latin 1) that various
blackguards advocate in certain newsgroups.

>>> diamond ACS_DIAMOND
>
> 25CA should do fine.
[...]
> I guess loz corresponds to 25CA?

Apparently not (see above).

>>> checker board (stipple) ACS_CKBOARD

Is the stippledness intrinsic? There are some shaded boxes

2591  blk14  LIGHT SHADE
2592  blk12  MEDIUM SHADE
2593  blk34  DARK SHADE

Cheers,
I.
--
Avocation: advocating advocaat to avocets.
Received on Thursday, 11 July 1996 16:06:04 UTC