RE: accented characters, etc.

John Delacour wrote:

> Without going so far as using Unicode, you can also declare some
> other character set such as iso-8859-2 and use decimal or hexadecimal
> character entities, eg. &#131;  &#x83;

Hmmm.  That &#131; falls in the same category as using &#151; for em dashes --
avoid it.  Those are Windows codepage characters and undefined in Markup:

	<?xml version="1.0"?>
	<doc>
	  <p>&#131; is an undefined entity.</p>
	</doc>

nsgmls -wxml gives you a warning of "reference to non-SGML character", and it
displays as a square placeholder box in IE5 (even with other declared encodings
like iso-8859-x instead of the default UTF-8).
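
If you have already written such references, here is a rough sketch of one way
to repair them.  It assumes the number was meant as a Windows-1252 byte value,
and the helper name fix_windows_ref is made up for illustration.  It leans on
Python's built-in cp1252 codec to find the real Unicode code point:

```python
from typing import Optional


def fix_windows_ref(code_point: int) -> Optional[str]:
    """Return a numeric character reference that is safe to use in markup.

    Assumption: a value in 128-159 was meant as a Windows-1252 byte.
    Returns None when Windows-1252 leaves that byte undefined.
    """
    if 128 <= code_point <= 159:
        try:
            # Decode the single byte to learn which character was intended.
            char = bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            # A few bytes (129, for one) have no Windows-1252 mapping.
            return None
        return f"&#{ord(char)};"
    # Anything outside the problem range is passed through unchanged.
    return f"&#{code_point};"


print(fix_windows_ref(131))  # the florin sign
print(fix_windows_ref(151))  # the em dash
```

Under that assumption, 131 comes out as &#402; and 151 as &#8212;, which are
the Unicode code points for the same two glyphs.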

Avoid anything in the 128 - 159 range, and -- for completeness -- the 0 - 31
range (except for tabs (9), linefeeds (10), and carriage returns (13)).
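
As a quick way to check a file against that rule, something like the following
might do.  It is only a sketch: the function names are my own, and the regular
expression covers decimal and hexadecimal numeric references but not named
entities.

```python
import re

# Matches &#131; (decimal) and &#x83; (hexadecimal) style references.
CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Tab, linefeed, and carriage return are the permitted control characters.
ALLOWED_CONTROLS = {9, 10, 13}


def is_unsafe(code_point: int) -> bool:
    """True for 0-31 (minus tab, LF, CR) and for 128-159."""
    if code_point in ALLOWED_CONTROLS:
        return False
    return 0 <= code_point <= 31 or 128 <= code_point <= 159


def find_unsafe_code_points(text: str) -> list:
    """List offending code points: numeric references first, then raw characters."""
    hits = []
    for match in CHAR_REF.finditer(text):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if is_unsafe(code_point):
            hits.append(code_point)
    # Literal characters in the same ranges are just as troublesome.
    hits.extend(ord(ch) for ch in text if is_unsafe(ord(ch)))
    return hits


sample = "<p>&#131; and &#x97; are out, &#233; and a tab\tare fine.</p>"
print(find_unsafe_code_points(sample))
```

An empty list means nothing in the text falls in either range.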

/Jelks

Received on Thursday, 2 December 1999 23:11:29 UTC