Re: Windows and Mac character encoding questions

From: John Cowan <cowan@ccil.org>
Date: Tue, 6 Apr 2004 15:10:16 -0400
To: Chris Lilley <chris@w3.org>
Cc: Frank Ellermann <nobody@xyzzy.claranet.de>, www-international@w3.org
Message-ID: <20040406191014.GA29076@ccil.org>

Chris Lilley scripsit:

> On Tuesday, April 6, 2004, 11:23:43 AM, Frank wrote:
> 
> FE> Mark Davis wrote:
> 
>  >> In practice, however, the bytes 0x80..0x9F in iso-8859-1 are
>  >> so rarely used
> 
> FE> ...they're even illegal in XML 1.0, aren't they ?  Therefore...
> 
> The characters at those positions in the UCS are illegal. Bytes with
> those values in a given encoding are not illegal.

In fact, the characters are not illegal in XML 1.0 either.  IMHO they
should have been, but the point was overlooked (or not understood)
when XML 1.0 was written.

In XML 1.1, the characters U+0080 through U+009F may appear only as
character references, not literally, with the exception of U+0085 (NEL),
which may appear literally and is normalized to U+000A as a line end.
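
As a rough illustration of the difference, here is a minimal Python
sketch of which code points each version accepts literally. It is not
part of any parser; the function names are hypothetical, and the ranges
are transcribed from the Char production of XML 1.0 and the Char and
RestrictedChar productions of XML 1.1.

```python
# Sketch only: classifies code points by the XML 1.0 / XML 1.1 grammar.
# All function names here are illustrative, not from any library.

def in_char_xml10(cp: int) -> bool:
    """XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def in_char_xml11(cp: int) -> bool:
    """XML 1.1 Char production (includes characters that are restricted)."""
    return (0x1 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_restricted_xml11(cp: int) -> bool:
    """XML 1.1 RestrictedChar: legal only via a character reference."""
    return (0x1 <= cp <= 0x8
            or 0xB <= cp <= 0xC
            or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84
            or 0x86 <= cp <= 0x9F)

def literal_ok_xml10(cp: int) -> bool:
    """True if the code point may appear literally in an XML 1.0 document."""
    return in_char_xml10(cp)

def literal_ok_xml11(cp: int) -> bool:
    """True if the code point may appear literally in an XML 1.1 document."""
    return in_char_xml11(cp) and not is_restricted_xml11(cp)

if __name__ == "__main__":
    # Report the C1 range endpoints and U+0085 under both versions.
    for cp in (0x80, 0x85, 0x9F):
        print(f"U+{cp:04X}: XML 1.0 literal={literal_ok_xml10(cp)}, "
              f"XML 1.1 literal={literal_ok_xml11(cp)}")
```

Note that the sketch covers only which characters may appear literally;
it does not model the line-end normalization that turns a literal U+0085
into U+000A.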

-- 
Some people open all the Windows;       John Cowan
wise wives welcome the spring           jcowan@reutershealth.com
by moving the Unix.                     http://www.reutershealth.com
  --ad for Unix Book Units (U.K.)       http://www.ccil.org/~cowan
        (see http://cm.bell-labs.com/cm/cs/who/dmr/unix3image.gif)
Received on Tuesday, 6 April 2004 15:13:50 GMT
