On Tue, 28 Feb 2012 13:05:37 +0100, Jonas Sicking <jonas@sicking.cc> wrote:

>> If we can't U+FFFD unpaired surrogates on paste, I agree it makes sense
>> to U+FFFD them in APIs. If the only way to get them is a JS escape, then
>> an exception seems OK.
>
> People use JS strings to handle binary data. This is something that
> has worked since the dawn of JS and is something that I believe is
> defined to work in recent ECMAScript specs.
>
> I don't think that we can start restricting that and try to enforce
> that JS strings always contain valid UTF16.

Right.

> So I think our only option is to make all APIs that do UTF16->UTF8
> conversion explicitly define how to deal with invalid surrogates.

Sure, I don't suggest we leave it undefined.

> My preference would be to deal with them by encoding them to U+FFFD for
> the same reason that we let the HTML parser do error recovery rather
> than XML-style draconian error handling.

I'm not really opposed to making APIs use U+FFFD instead of an exception,
but I'm not entirely convinced, either. If people keep binary data in
strings and pass those strings to these APIs, replacing lone surrogates
with U+FFFD is going to "silently" scramble their data. Why is this better
than throwing an exception?

-- 
Simon Pieters
Opera Software

Received on Tuesday, 28 February 2012 12:57:42 GMT
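
To make the trade-off concrete, here is a minimal sketch in TypeScript of the two policies under discussion, applied to a string before it would be converted to UTF-8. The helper name `sanitizeUtf16` and the `LoneSurrogatePolicy` type are hypothetical and exist only for illustration; they are not taken from any spec or browser API, and a real API might define the behavior differently.

```typescript
// Hypothetical helper (not from any spec): applies one of the two
// policies discussed above to lone surrogates in a JS string.
type LoneSurrogatePolicy = "replace" | "throw";

function sanitizeUtf16(input: string, policy: LoneSurrogatePolicy): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;

    // A high surrogate immediately followed by a low surrogate is a
    // valid pair: copy both code units through unchanged.
    if (isHigh && i + 1 < input.length) {
      const next = input.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += input.charAt(i) + input.charAt(i + 1);
        i++;
        continue;
      }
    }

    // Any other surrogate code unit is unpaired.
    if (isHigh || isLow) {
      if (policy === "throw") {
        throw new RangeError("lone surrogate at index " + i);
      }
      out += "\ufffd"; // the original code unit is lost here
      continue;
    }

    out += input.charAt(i);
  }
  return out;
}

// "Binary" data stored in a string: 0xD800 is a lone high surrogate.
const binaryLike = "\ud800" + "A";
const replaced = sanitizeUtf16(binaryLike, "replace"); // "\ufffd" + "A"
```

With the "replace" policy the call succeeds, but the 0xD800 code unit cannot be recovered from the result, which is the "silent" scrambling in question. With the "throw" policy the same input fails loudly at the API boundary instead.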