- From: Jim Taylor <JHTaylor@videodiscovery.com>
- Date: Thu, 25 Jul 1996 12:21:55 -0800
- To: www-html@w3.org
>>> Chung-Chieh Shan <t-chungs@microsoft.com> - 7/25/96 12:05 AM >>>
>I am interested in the list of character entities that are/will be
>included in HTML 3.2. In particular, I am working on computerization of
>several Taiwanese languages, the romanization of which requires
>diacritics to be placed over letters such as "m" and "n". Since there
>are already entities like &acute; and &grave; defined in
>ftp://ftp.ifi.uio.no/pub/SGML/ENTITIES/ISOdia, I suppose the only
>question is whether these entities will be included in HTML 3.2 (I'm
>actually not absolutely sure that they haven't been included in previous
>versions; I'd be very happy if they have), and -- if they will --
>whether any specific rendering behavior is to be specified by HTML. If
>it is HTML's responsibility to specify rendering behavior for these
>entities, I think the logical way to proceed is to follow Unicode's
>placement of non-spacing marks, i.e., use m&acute; (rather than
>&acute;m) for m with acute above, and so on.

Entities for these diacriticals have not been in any HTML standard, and
are not in the experimental Cougar document [1]. However, the spacing
forms of these characters are included in the ISO 8859-1 repertoire, so
you can use the characters directly, which should work in any browser
that correctly supports 8859-1. If you want non-spacing diacriticals you
could use numeric character references (from Unicode), but most browsers
won't support them.

  acute: character 180 (´)
  acute, non-spacing:
  grave: character 96 (`)
  grave, non-spacing:

Unicode also includes precomposed characters such as M with acute accent
(Ḿ), but it's not likely you'll find many browsers that support those
either.

You could propose that the SGML entities for diacriticals (ISO
8879:1986//ENTITIES Diacritical Marks//EN) [2] be added to HTML, but most
of these are already included in the 8859-1 set and supported by decent
browsers. I.e., why write &grave; when you can write `?

<!ENTITY acute SDATA "[acute ]"--=acute accent-->
<!ENTITY breve SDATA "[breve ]"--=breve-->
<!ENTITY caron SDATA "[caron ]"--=caron-->
<!ENTITY cedil SDATA "[cedil ]"--=cedilla-->
<!ENTITY circ SDATA "[circ ]"--=circumflex accent-->
<!ENTITY dblac SDATA "[dblac ]"--=double acute accent-->
<!ENTITY die SDATA "[die ]"--=dieresis-->
<!ENTITY dot SDATA "[dot ]"--=dot above-->
<!ENTITY grave SDATA "[grave ]"--=grave accent-->
<!ENTITY macr SDATA "[macr ]"--=macron-->
<!ENTITY ogon SDATA "[ogon ]"--=ogonek-->
<!ENTITY ring SDATA "[ring ]"--=ring-->
<!ENTITY tilde SDATA "[tilde ]"--=tilde-->
<!ENTITY uml SDATA "[uml ]"--=umlaut mark-->

-----
[1] http://www.w3.org/pub/WWW/MarkUp/Cougar/HTML.dtd
[2] ftp://ftp.ifi.uio.no/pub/SGML/ENTITIES/ISOdia
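
As a rough illustration of the numeric character reference approach mentioned above, here is a minimal Python sketch. It assumes the non-spacing marks meant are Unicode's COMBINING ACUTE ACCENT (U+0301, decimal 769) and COMBINING GRAVE ACCENT (U+0300, decimal 768); those code points, and the helper names with_mark and compose, are illustrative assumptions and not part of any HTML specification. The base letter comes first and the mark follows it, matching the Unicode ordering suggested in the quoted message.

```python
import html
import unicodedata

# Assumed code points for the non-spacing marks (illustrative only).
COMBINING_GRAVE = 0x0300  # COMBINING GRAVE ACCENT, decimal 768
COMBINING_ACUTE = 0x0301  # COMBINING ACUTE ACCENT, decimal 769


def with_mark(base: str, mark: int) -> str:
    """Return the base letter followed by a decimal numeric character
    reference for the combining mark (hypothetical helper)."""
    return f"{base}&#{mark};"


def compose(base: str, mark: int) -> str:
    """Return the NFC form of base + mark: a single precomposed character
    if Unicode defines one, otherwise the unchanged two-character sequence."""
    return unicodedata.normalize("NFC", base + chr(mark))


if __name__ == "__main__":
    for letter in ("m", "n"):
        for mark in (COMBINING_ACUTE, COMBINING_GRAVE):
            fragment = with_mark(letter, mark)
            decoded = html.unescape(fragment)  # what a conforming parser would yield
            composed = compose(letter, mark)
            print(fragment, [hex(ord(c)) for c in decoded],
                  [hex(ord(c)) for c in composed])
```

Not every letter and mark pair has a precomposed character (m with grave does not, for example), which is one reason the combining sequence may be needed for romanizations like the ones described in the quoted message.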
Received on Thursday, 25 July 1996 15:22:48 UTC