RE: Use of Symbols from Rowland Shaw on 2002-01-14 (www-html@w3.org from January 2002)

From: Rowland Shaw <Rowland.Shaw@crystaldecisions.com>
Date: Mon, 14 Jan 2002 03:28:38 -0800
To: "'Christian Wolfgang Hujer'" <Christian.Hujer@itcqis.com>, Geoff McNeil <g.mcneil1@ntlworld.com>, www-html@w3.org
Message-ID: <963A03BCAFF059488BAFF33AE5C8709707FF3A@IPSENT04>

From: http://www.w3.org/TR/html401/sgml/entities.html
<!ENTITY pound CDATA "&#163;" -- pound sign U+00A3 ISOnum -->

So:
You can use &pound; instead of &#163; for readability in HTML 4.01. The
charset *shouldn't* matter then (although you'd still need the underlying
font to have the character defined).
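
A quick way to check that the named and numeric forms refer to the same
character is Python's standard html module. This is only an illustrative
sketch; the variable names are made up for the example, and it says nothing
about how any particular browser or validator resolves entities.

```python
import html
from html.entities import name2codepoint

# The HTML 4.01 entity table maps the name "pound" to code point 163 (U+00A3).
pound_codepoint = name2codepoint["pound"]

# Both the named entity and the numeric character reference decode to
# the same single character, the pound sign.
named = html.unescape("&pound;")
numeric = html.unescape("&#163;")

print(pound_codepoint, named == numeric)
```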


PS: Sorry for having an annoying mailer for replying :o)

-----Original Message-----
From: Christian Wolfgang Hujer [mailto:Christian.Hujer@itcqis.com]
Sent: 14 January 2002 10:53
To: Geoff McNeil; www-html@w3.org
Subject: RE: Use of Symbols


Hello Geoff,

> -----Original Message-----
> From: www-html-request@w3.org [mailto:www-html-request@w3.org] On Behalf Of Geoff McNeil
> Sent: Friday, January 11, 2002 5:26 PM
> To: www-html@w3.org
> Subject: Use of Symbols
>
>
> On trying to validate my document I keep getting an error due to my use of
> the £ sign.  Is there a code I should be using or another way of displaying
> this image on my documents?


What charset did you use to encode your document, and what charset have you
declared for it? If it is not iso-8859-1 or a similar charset in both
cases, the well-formedness check of the validator will fail.
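
To see why the actual and the declared charset have to agree, here is a
rough Python sketch of what happens to the bytes of a pound sign when they
are read under the wrong charset. It only illustrates the byte-level
mismatch; the variable names are invented for the example, and it is not
how the validator itself is implemented.

```python
pound = "\u00a3"  # the pound sign, U+00A3

latin1_bytes = pound.encode("iso-8859-1")  # one byte: 0xA3
utf8_bytes = pound.encode("utf-8")         # two bytes: 0xC2 0xA3

# Case 1: file saved as iso-8859-1 but declared (and read) as UTF-8.
# A lone 0xA3 byte is not valid UTF-8, so decoding fails outright.
try:
    latin1_bytes.decode("utf-8")
    latin1_read_as_utf8_ok = True
except UnicodeDecodeError:
    latin1_read_as_utf8_ok = False

# Case 2: file saved as UTF-8 but declared (and read) as iso-8859-1.
# Decoding succeeds, but yields two characters instead of one.
utf8_read_as_latin1 = utf8_bytes.decode("iso-8859-1")

print(latin1_read_as_utf8_ok, len(utf8_read_as_latin1))
```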

Usually the pound sign should not occur unencoded in an XML or HTML document;
it should be written as a numeric character reference like this: &#163;

It is good advice to always encode all non-ASCII characters (ASCII is a
7-bit encoding covering code points 0-127, which it shares with most ISO
charsets and Unicode) using character references, at least when the language
of the document is written mainly in a script based on the Latin alphabet.
If the document's language does not use a Latin-based script, UTF-8 encoded
Unicode is a good alternative.
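
As a minimal sketch of that advice, the following Python helper rewrites
every non-ASCII character as a numeric character reference, using the
standard "xmlcharrefreplace" error handler. The function name to_ascii_html
is hypothetical, and the helper does not escape markup characters such as
& or <; it only handles characters outside the ASCII range.

```python
def to_ascii_html(text: str) -> str:
    """Return text with each non-ASCII character replaced by &#N;."""
    # Encoding to ASCII with "xmlcharrefreplace" substitutes a decimal
    # numeric character reference for anything above code point 127.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_ascii_html("Price: \u00a35"))  # prints "Price: &#163;5"
```

ASCII-only input is returned unchanged, so the helper can be applied to a
whole document body without touching plain Latin text.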

The charset you use must be declared like this (iso-8859-1 used for these
examples):

[examples snipped]

Take care to avoid the "cp..." and "...windows..." charsets: these are not
just legacy charsets, they are proprietary charsets and may not be
understood by every browser.
iso-8859-*, though still often in use, is already considered to be a legacy
charset. The future definitely is UTF-8 and UTF-16.

Received on Monday, 14 January 2002 06:29:11 UTC