- From: <bugzilla@jessica.w3.org>
- Date: Fri, 28 Mar 2014 11:52:34 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=24104
--- Comment #2 from Anne <annevk@annevk.nl> ---
I tested this:
<meta charset=windows-1252>
<form action=http://software.hixie.ch/utilities/cgi/test-tools/echo>
<input name=a> <script> document.querySelector("input").value = "\ud801"
</script>
<input type=submit>
</form>
Gecko submits U+FFFD; Chrome gives back U+D801 (each encoded as per the <form>
error mode, as windows-1252 can express neither).
Now if I set the encoding to utf-8, both Gecko and Chrome emit U+FFFD (as utf-8
bytes, percent-encoded).
utf-16 gives the same result as utf-8, as expected.
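To illustrate the utf-8 result, here is a minimal sketch using the standard TextEncoder API, which replaces a lone surrogate with U+FFFD before encoding. The helper name percentEncodeUtf8 is made up for this example, and it percent-encodes every byte, which is a simplification of what a real form submission does.

```javascript
// Sketch only: shows a lone surrogate becoming U+FFFD in utf-8 output.
// TextEncoder is a standard global in browsers and in recent Node.js.
// percentEncodeUtf8 is a hypothetical helper, not part of any browser API.
function percentEncodeUtf8(value) {
  // encode() converts the string to scalar values first, so a lone
  // surrogate such as U+D801 is replaced with U+FFFD (bytes EF BF BD).
  const bytes = new TextEncoder().encode(value);
  // Percent-encode every byte (real form encoding leaves some bytes as-is).
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeUtf8("\ud801")); // %EF%BF%BD
```

The output is the same as for a literal U+FFFD, matching what both browsers emitted in the utf-8 test.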
So each encoder's handler either needs to catch the surrogate range and return
error with U+FFFD (Gecko), or not (Chrome). I suspect Gecko's behavior is
slightly saner. I'll fix utf-8 and utf-16 to do this right away. I'm not sure
whom to consult about how we should change the rest.
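The two options can be sketched roughly as follows. This is an illustration under assumptions, not either browser's implementation: isEncodable is a hypothetical stand-in for a single-byte encoder that accepts only ASCII, encodeWithFormErrorMode is a made-up name, and encodeURIComponent only approximates form percent-encoding.

```javascript
// Hypothetical stand-in for a single-byte encoder: only ASCII is encodable.
// Neither U+FFFD nor U+D801 is encodable, as in the windows-1252 test.
function isEncodable(codePoint) {
  return codePoint < 0x80;
}

// Hypothetical helper. replaceLoneSurrogates = true sketches the Gecko-like
// option (catch the surrogate range, report U+FFFD); false sketches the
// Chrome-like option (report the surrogate itself).
function encodeWithFormErrorMode(value, replaceLoneSurrogates) {
  let out = "";
  for (let i = 0; i < value.length; i++) {
    let cp = value.charCodeAt(i);
    const next = i + 1 < value.length ? value.charCodeAt(i + 1) : 0;
    const isLead = cp >= 0xd800 && cp <= 0xdbff;
    const nextIsTrail = next >= 0xdc00 && next <= 0xdfff;
    if (isLead && nextIsTrail) {
      // Valid surrogate pair: combine into one code point.
      cp = value.codePointAt(i);
      i++;
    } else if (cp >= 0xd800 && cp <= 0xdfff && replaceLoneSurrogates) {
      // Lone surrogate: treat it as U+FFFD.
      cp = 0xfffd;
    }
    // <form> error mode: unencodable code points become "&#" decimal ";".
    out += isEncodable(cp) ? String.fromCharCode(cp) : "&#" + cp + ";";
  }
  // out is pure ASCII here, so encodeURIComponent cannot throw.
  return encodeURIComponent(out);
}

console.log(encodeWithFormErrorMode("\ud801", true));  // %26%2365533%3B
console.log(encodeWithFormErrorMode("\ud801", false)); // %26%2355297%3B
```

With the flag set, the lone surrogate is reported as 65533 (U+FFFD); without it, as 55297 (U+D801).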
Received on Friday, 28 March 2014 11:52:36 UTC