Re: Non-character NCRs treated differently from literal non-characters from Ian Hickson on 2009-06-10 (public-html@w3.org from June 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 10 Jun 2009 04:49:55 +0000 (UTC)
To: Henri Sivonen <hsivonen@iki.fi>
Cc: "public-html@w3.org WG" <public-html@w3.org>
Message-ID: <Pine.LNX.4.62.0906100449070.1648@hixie.dreamhostps.com>

On Mon, 18 May 2009, Henri Sivonen wrote:
>
> Literal non-characters don't turn into REPLACEMENT CHARACTER

As far as I can tell, they do:

# Bytes or sequences of bytes in the original byte stream that could not 
# be converted to Unicode characters must be converted to U+FFFD 
# REPLACEMENT CHARACTER code points.

(Unless I'm misunderstanding what you mean?)
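
The quoted requirement can be illustrated outside of HTML. A minimal sketch, assuming Python's built-in UTF-8 decoder as a stand-in for the byte-stream decoding step (it is not the HTML parser, and the helper name decode_with_replacement is hypothetical):

```python
def decode_with_replacement(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes, substituting U+FFFD for anything undecodable.

    Illustration only: this uses Python's codec error handling, which
    is an assumption standing in for the decoding step the quoted
    text describes.
    """
    # errors="replace" makes the decoder emit U+FFFD REPLACEMENT CHARACTER
    # instead of raising UnicodeDecodeError on invalid input.
    return raw.decode(encoding, errors="replace")


# 0xFF can never appear in well-formed UTF-8, so it cannot be converted.
text = decode_with_replacement(b"ab\xffcd")
print(text == "ab\ufffdcd")
```

This only covers bytes that fail to decode at all; whether a validly encoded non-character falls under the same rule is the point being discussed in this thread.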

> But why, then, do non-character NCRs turn into REPLACEMENT CHARACTER?

For consistency.
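
For readers who want the NCR case spelled out, here is a rough sketch of the behaviour described above, assuming the Unicode definition of non-characters (U+FDD0..U+FDEF plus every code point whose low 16 bits are FFFE or FFFF). The function names are hypothetical, and this is not the full character reference algorithm from the spec:

```python
REPLACEMENT_CHARACTER = "\ufffd"


def is_noncharacter(code_point: int) -> bool:
    """Return True for the 66 Unicode non-characters."""
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True
    # The last two code points of every plane (U+xxFFFE, U+xxFFFF).
    return code_point <= 0x10FFFF and (code_point & 0xFFFE) == 0xFFFE


def resolve_ncr(code_point: int) -> str:
    """Map a numeric character reference value to its character.

    Non-character values become U+FFFD, as described in this thread.
    Out-of-range values are treated the same way here only to keep
    the sketch total; that is an assumption, not taken from the thread.
    """
    if code_point < 0 or code_point > 0x10FFFF:
        return REPLACEMENT_CHARACTER
    if is_noncharacter(code_point):
        return REPLACEMENT_CHARACTER
    return chr(code_point)


print(resolve_ncr(0x41) == "A")
print(resolve_ncr(0xFFFF) == REPLACEMENT_CHARACTER)
```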

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 10 June 2009 04:50:30 UTC