
Re: Dynamic Font Size

From: Chris Maden <crism@oreilly.com>
Date: Tue, 23 Mar 1999 10:49:07 -0500 (EST)
Message-Id: <199903231549.KAA02828@ruby.ora.com>
To: www-html@w3.org
[Alan G. Isaac]
> I'm looking at
>   http://www.w3.org/TR/REC-html40/sgml/entities.html
> Why is it &emsp; and &ensp; but   &mdash; and &ndash; ?

Good question.  Unfortunately, it's not really subject to debate;
those entity names are from the ISO standard entity sets.  Some of the
names are twisted because they were subject to a six-character name
restriction (not for any really good reason), but I've never really
been clear on why mdash and ndash are named that way.

> Why are  &mdash; and &ndash; equivalent to &#8212; and &#8211;
> instead of &#151; and &#152; ?

Because they should work on operating systems that don't come from

> Why are no character entity references below &#160; listed on this
> page?

Same reason.  The 8-bit ISO character sets (ISO 8859-*; 8859-1 is the
western European one) reserve 128-159 as control characters.  Windows
uses different character sets (CP 125*; CP 1252 is the western European
one).  Since it doesn't need the upper control characters, it uses that
range for characters missing from the corresponding ISO sets, like oe
ligatures, en and em dashes, ellipses, and s and z caron.
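
To make the difference concrete, here is a minimal Python sketch that
decodes a few single bytes both ways.  The helper name decode_both and
the choice of sample bytes are mine, purely for illustration; the codec
names "latin-1" and "cp1252" are the ones Python ships.

```python
import unicodedata

def decode_both(byte_value):
    """Decode one byte as ISO 8859-1 and as CP 1252.

    Returns (iso_char, win_char); win_char is None when CP 1252
    leaves that byte unassigned.
    """
    raw = bytes([byte_value])
    iso = raw.decode("latin-1")
    try:
        win = raw.decode("cp1252")
    except UnicodeDecodeError:
        win = None
    return iso, win

# Sample bytes from the upper control range that CP 1252 reuses.
for b in (133, 138, 140, 142, 150, 151):
    iso, win = decode_both(b)
    # Category "Cc" means "control character".
    print(b, unicodedata.category(iso), unicodedata.name(win))
```

Each of these bytes comes out as a control character under ISO 8859-1
and as a printable character (ellipsis, S caron, OE ligature, Z caron,
en dash, em dash) under CP 1252.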

A numeric character reference (&#...;) is a reference to the document
character set, not the encoding; in HTML 4, the document character set
is Unicode (regardless of what bytes are actually used to store and
transmit the characters).  Unicode has a control character at code
point 151, but an em dash at code point 8212.
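
A rough Python sketch of that distinction follows.  The function
resolve_numeric_reference is a hypothetical helper of mine that handles
only decimal references; it is not how any particular browser or
library does it, just the rule above taken literally: the number names
a Unicode code point, and the bytes used for storage are a separate
matter.

```python
import re
import unicodedata

def resolve_numeric_reference(ref):
    """Map a decimal reference such as '&#8212;' to its Unicode character."""
    match = re.fullmatch(r"&#([0-9]+);", ref)
    if match is None:
        raise ValueError("not a decimal numeric character reference: " + ref)
    return chr(int(match.group(1)))

em_dash = resolve_numeric_reference("&#8212;")
control = resolve_numeric_reference("&#151;")

print(unicodedata.name(em_dash))      # the character at code point 8212
print(unicodedata.category(control))  # "Cc" marks a control character

# The same code point is stored as different bytes in different encodings.
print(em_dash.encode("utf-8"), em_dash.encode("cp1252"))
```

The reference number stays 8212 whichever encoding the file is saved
in; only the bytes on disk change.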

<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
Received on Friday, 26 March 1999 15:10:43 UTC
