Re: Non-character NCRs treated differently from literal non-characters

On Mon, 18 May 2009, Henri Sivonen wrote:
>
> Literal non-characters don't turn into REPLACEMENT CHARACTER

As far as I can tell, they do, per the spec:

# Bytes or sequences of bytes in the original byte stream that could not 
# be converted to Unicode characters must be converted to U+FFFD 
# REPLACEMENT CHARACTER code points.

(Unless I'm misunderstanding what you mean?)


> But why, then, do non-character NCRs turn into REPLACEMENT CHARACTER?

For consistency with how literal non-characters are handled.
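
To make the two cases concrete, here is a minimal Python sketch. It is an
illustration under stated assumptions, not the spec's algorithm: Python's
UTF-8 decoder stands in for a parser's byte-stream decoder, and the names
decode_stream, is_noncharacter and resolve_ncr are hypothetical helpers
made up for this example.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_stream(raw: bytes) -> str:
    """Decode bytes as UTF-8, turning undecodable sequences into U+FFFD.

    Assumption: Python's errors="replace" behaviour is used as a stand-in
    for the byte-stream rule quoted above.
    """
    return raw.decode("utf-8", errors="replace")


def is_noncharacter(cp: int) -> bool:
    """True for Unicode noncharacters: U+FDD0..U+FDEF, plus the last two
    code points of every plane (U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFF)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def resolve_ncr(cp: int) -> str:
    """Hypothetical NCR handling: a noncharacter reference yields U+FFFD.

    Assumes cp is already a valid Unicode scalar value; surrogates and
    out-of-range numbers are not handled in this sketch.
    """
    return REPLACEMENT if is_noncharacter(cp) else chr(cp)


# An invalid byte in the stream is replaced during decoding.
decoded = decode_stream(b"a\xffb")

# A noncharacter written as an NCR (e.g. &#xFFFE;) is replaced as well.
from_ncr = resolve_ncr(0xFFFE)

# Note: Python's own decoder treats the bytes EF BF BE (a literal U+FFFE)
# as convertible and passes them through unchanged. Whether an HTML
# parser should do the same is the question under discussion here.
literal = decode_stream(b"\xef\xbf\xbe")
```

With both paths mapping to U+FFFD, a document gives the same result
whether the non-character appears literally or as an NCR.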

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 10 June 2009 04:50:30 UTC