Re: numeric character references and Unicode surrogate pairs: part of my review of 8 The HTML syntax

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 20 Aug 2007 15:59:02 +0200
Message-ID: <46C99E26.6030405@gmx.de>
To: public-html WG <public-html@w3.org>

Cameron McCormack wrote:
> Robert Burns:
>>> I believe this is not consistent with existing browser behavior. That
>>> is, while surrogate pairs expressed as pairs of numeric character
>>> references are not supposed to resolve to the non-BMP character,
>>> browsers do it anyway.
> Anne van Kesteren:
>> Do you have any tests to demonstrate that?
> Here’s one:
>   data:text/html,%26%23xD800%3B%26%23xDC00%3B
> Shows as a single U+10000 character in Firefox and Opera 9.23,
> at least.

Out of curiosity: how are you testing this? Over here FF displays one
question mark, and if I copy/paste that into Notepad, I seem to get two
UCS-2 characters...
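
For reference, here is a rough sketch of the arithmetic behind the test case above. It only decodes the data: URL payload and shows how the two referenced values would combine into one code point if they were treated as a UTF-16 surrogate pair. It says nothing about what any particular browser does internally; the helper name combine_surrogates is made up for illustration.

```python
import re
from urllib.parse import unquote


def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point.

    Hypothetical helper for illustration; not taken from any browser.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Percent-decode the payload of the data: URL from the test case.
payload = unquote("%26%23xD800%3B%26%23xDC00%3B")

# Pull out the hexadecimal numeric character references.
refs = [int(h, 16) for h in re.findall(r"&#x([0-9A-Fa-f]+);", payload)]

# Treat the two values as a surrogate pair and combine them.
code_point = combine_surrogates(refs[0], refs[1])

# Encode the resulting character back to UTF-16 (big-endian, no BOM)
# to see the two 16-bit code units it is made of.
utf16_units = chr(code_point).encode("utf-16-be")

print(payload)                  # the decoded markup
print(f"U+{code_point:04X}")    # the combined code point
print(utf16_units.hex(" ", 2))  # the two UTF-16 code units
```

Seeing two 16-bit units after pasting into an editor that stores text as UTF-16 would be consistent with either reading (one non-BMP character, or two separate surrogates), so the paste alone does not settle the question.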

Best regards, Julian
Received on Monday, 20 August 2007 13:59:22 UTC

