
Re: Non-character NCRs treated differently from literal non-characters

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 10 Jun 2009 04:49:55 +0000 (UTC)
To: Henri Sivonen <hsivonen@iki.fi>
Cc: "public-html@w3.org WG" <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0906100449070.1648@hixie.dreamhostps.com>

On Mon, 18 May 2009, Henri Sivonen wrote:
>
> Literal non-characters don't turn into REPLACEMENT CHARACTER

As far as I can tell, they do:

# Bytes or sequences of bytes in the original byte stream that could not 
# be converted to Unicode characters must be converted to U+FFFD 
# REPLACEMENT CHARACTER code points.

(Unless I'm misunderstanding what you mean?)
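
To make the quoted rule concrete, here is a minimal sketch. It assumes UTF-8 input and uses Python's built-in decoder as a stand-in for a browser's decoder; `decode_lenient` is a hypothetical name. It is illustrative only, not the spec's algorithm. Bytes that cannot be converted become U+FFFD. This particular decoder passes a well-formed encoding of a non-character such as U+FFFE through unchanged, which may be the case the original question has in mind.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode UTF-8, mapping undecodable bytes to U+FFFD.

    Assumption: Python's errors="replace" handler approximates the
    "must be converted to U+FFFD" rule quoted above.
    """
    return raw.decode("utf-8", errors="replace")


# 0xFF can never appear in well-formed UTF-8, so it is replaced.
invalid = decode_lenient(b"ab\xffcd")

# U+FFFE is a non-character, but its UTF-8 encoding is well formed,
# so this decoder returns it as-is rather than replacing it.
literal_nonchar = decode_lenient("\ufffe".encode("utf-8"))
```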


> But why, then, do non-character NCRs turn into REPLACEMENT CHARACTER?

For consistency.
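
A rough sketch of the NCR behaviour under discussion, assuming the only rule of interest is "a numeric character reference to a non-character yields U+FFFD". The helper names are hypothetical and the sketch ignores every other special case a real tokenizer handles; it expects a code point in the range 0 to 0x10FFFF.

```python
REPLACEMENT = "\ufffd"


def is_noncharacter(cp: int) -> bool:
    """True for the 66 Unicode non-characters.

    These are U+FDD0..U+FDEF plus the last two code points of every
    plane (U+xxFFFE and U+xxFFFF).
    """
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def resolve_ncr(cp: int) -> str:
    """Map the code point of an NCR such as &#xFFFE; to a character.

    Assumption: non-characters become U+FFFD, everything else is
    returned unchanged.
    """
    if is_noncharacter(cp):
        return REPLACEMENT
    return chr(cp)
```

For example, `resolve_ncr(0xFFFE)` gives U+FFFD, while `resolve_ncr(0x263A)` gives U+263A unchanged.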

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 10 June 2009 04:50:30 UTC
