[Bug 16157] WebSocket shouldn't throw SyntaxError on unpaired surrogates

https://www.w3.org/Bugs/Public/show_bug.cgi?id=16157

--- Comment #4 from Jonas Sicking <jonas@sicking.cc> 2012-03-16 07:33:16 UTC ---
How is this different from the "draconian" error handling that XML parsers are
required to do, and which many people, you included, have argued strongly
against?

The problem with throwing for unpaired surrogates is that easy-to-make,
data-dependent mistakes produce fatal results. For example, if you want to
send string data in smaller chunks, a very easy "mistake" to make would be to
simply chop up the JS string into 10k-sized chunks and send each separately.
This will generally work great; however, in languages which produce a lot of
surrogates this will fail 50%-67% of the time.
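
To make the failure mode concrete, here is a rough sketch of the naive
chunking described above, next to one possible surrogate-aware variant. The
function names (naiveChunks, safeChunks, hasLoneSurrogate) are made up for
illustration and are not part of any API; the chunk size is tiny only to keep
the example readable.

```typescript
// Hypothetical helper: true if s contains a surrogate without its partner.
function hasLoneSurrogate(s: string): boolean {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // High surrogate: the next code unit must be a low surrogate.
      const next = s.charCodeAt(i + 1); // NaN past the end of the string
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the low half of a valid pair
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      return true; // low surrogate with no preceding high surrogate
    }
  }
  return false;
}

// The easy "mistake": slice by UTF-16 code units, ignoring pairs.
function naiveChunks(s: string, size: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < s.length; i += size) {
    out.push(s.slice(i, i + size));
  }
  return out;
}

// One possible fix: if a chunk would end between the two halves of a pair,
// extend it by one code unit. Chunks may therefore be size + 1 units long.
function safeChunks(s: string, size: number): string[] {
  const out: string[] = [];
  let i = 0;
  while (i < s.length) {
    let end = Math.min(i + size, s.length);
    const last = s.charCodeAt(end - 1);
    const next = s.charCodeAt(end); // NaN if end === s.length
    if (last >= 0xd800 && last <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      end += 1;
    }
    out.push(s.slice(i, end));
    i = end;
  }
  return out;
}

// "a", U+1F600 (one surrogate pair), "b": four UTF-16 code units in total.
const sample = "a\uD83D\uDE00b";
console.log(naiveChunks(sample, 2).map(hasLoneSurrogate)); // [ true, true ]
console.log(safeChunks(sample, 2).map(hasLoneSurrogate)); // [ false, false ]
```

Whether a given input hits the bad split depends entirely on where the pairs
happen to fall relative to the chunk boundary, which is the data dependence
in question.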

If we could make it throw consistently then I agree it would be a more
reasonable strategy. But I can't think of a way to make this not very data
dependent, which means that it's likely to not fail on developers' machines,
but fail in the real world.

And yes, putting in a replacement character also results in destroyed data.
However, in the example stated above, having one destroyed character every 10k
of data should be a low enough error rate that the message is still
understandable to a human, just as the layout errors produced by a missing
end tag likely still produce a page understandable to humans.
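
As a sketch of what the replacement-character approach amounts to, the
following substitutes U+FFFD for each unpaired surrogate and leaves valid
pairs alone. The function name replaceLoneSurrogates is hypothetical, and
this is only an illustration of the idea, not a description of what any
implementation of send() actually does.

```typescript
// Hypothetical helper: replace every unpaired surrogate with U+FFFD.
function replaceLoneSurrogates(s: string): string {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      const next = s.charCodeAt(i + 1); // NaN past the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        out += s.slice(i, i + 2); // valid pair: copy both halves
        i++;
      } else {
        out += "\uFFFD"; // high surrogate with no low half
      }
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      out += "\uFFFD"; // low surrogate with no preceding high half
    } else {
      out += s.charAt(i);
    }
  }
  return out;
}

// The two halves of a pair that was split across a chunk boundary each
// lose one character; the rest of each chunk is untouched.
console.log(replaceLoneSurrogates("a\uD83D") === "a\uFFFD"); // true
console.log(replaceLoneSurrogates("\uDE00b") === "\uFFFDb"); // true
```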

Received on Friday, 16 March 2012 07:33:23 UTC