- From: Simon Pieters <simonp@opera.com>
- Date: Tue, 28 Feb 2012 13:57:03 +0100
- To: "Jonas Sicking" <jonas@sicking.cc>
- Cc: "Arun Ranganathan" <aranganathan@mozilla.com>, "Glenn Maynard" <glenn@zewt.org>, "Eric U" <ericu@google.com>, public-webapps@w3.org
On Tue, 28 Feb 2012 13:05:37 +0100, Jonas Sicking <jonas@sicking.cc> wrote:

>> If we can't U+FFFD unpaired surrogates on paste, I agree it makes sense
>> to U+FFFD them in APIs. If the only way to get them is a JS escape, then
>> an exception seems OK.
>
> People use JS strings to handle binary data. This is something that
> has worked since the dawn of JS and is something that I believe is
> defined to work in recent ECMAScript specs.
>
> I don't think that we can start restricting that and try to enforce
> that JS-strings always contain valid UTF16.

Right.

> So I think our only option is to make all APIs which does UTF16->UTF8
> conversion explicitly define how to deal with invalid surrogates.

Sure, I don't suggest we leave it undefined.

> My preference would be to deal with them by encoding them to U+FFFD for
> the same reason that we let the HTML parser do error recovery rather
> than XML-style draconian error handling.

I'm not really opposed to making APIs use U+FFFD instead of exception, but
I'm not entirely convinced, either. If people use binary data in strings
and want to use them in these APIs, U+FFFDing lone surrogates is going to
"silently" scramble their data. Why is this better than throwing an
exception?

-- 
Simon Pieters
Opera Software
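
For readers following the thread, the two options being weighed can be sketched roughly as below. This is only an illustration: the function names (sanitize, replaceLoneSurrogates, rejectLoneSurrogates) are hypothetical and come from no spec or browser API, and the choice of RangeError is an assumption made for the example. It covers only the step that decides what happens to a lone surrogate before any UTF16->UTF8 conversion.

```javascript
// Hypothetical helper: walks the UTF-16 code units of str and calls
// onLoneSurrogate(index) for every surrogate that is not part of a
// valid high+low pair. Everything else is copied through unchanged.
function sanitize(str, onLoneSurrogate) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    const c = str.charCodeAt(i);
    const isHigh = c >= 0xD800 && c <= 0xDBFF;
    const isLow = c >= 0xDC00 && c <= 0xDFFF;
    if (isHigh && i + 1 < str.length) {
      const next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        // Well-formed surrogate pair: keep both code units.
        out += str[i] + str[i + 1];
        i++;
        continue;
      }
    }
    if (isHigh || isLow) {
      out += onLoneSurrogate(i);
    } else {
      out += str[i];
    }
  }
  return out;
}

// Option 1 (error recovery): replace each lone surrogate with U+FFFD.
function replaceLoneSurrogates(str) {
  return sanitize(str, () => "\uFFFD");
}

// Option 2 (draconian): throw on the first lone surrogate.
// RangeError is an arbitrary choice for this sketch.
function rejectLoneSurrogates(str) {
  return sanitize(str, (i) => {
    throw new RangeError("lone surrogate at index " + i);
  });
}
```

With the first option, replaceLoneSurrogates("\uD800") and replaceLoneSurrogates("\uDC00") both return "\uFFFD", so two different "binary" inputs become indistinguishable without any error being reported. That is the silent scrambling the message asks about. With the second option the caller at least learns that the string could not be converted faithfully.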
Received on Tuesday, 28 February 2012 12:57:42 UTC