Re: "send data using the Web Socket" and UCS-2

On Tue, Jun 2, 2009 at 1:23 PM, Jeff Walden <jwalden@mit.edu> wrote:
> The specification should say what happens when WebSocket.postMessage(data)
> is called where data is not structurally correct UTF-16 -- lone surrogates,
> backwards surrogates, and any similar structural errors I might have
> forgotten.  The IETF protocol specification implicitly assumes the data can
> be encoded as UTF-8 when this may not be the case.  I mentioned similar
> issues in #whatwg recently for DOM APIs in general[0], possibly to be
> handled by WebIDL.  Since this instance of that problem explicitly requires
> interpretation of a bogus DOMString and can't be described as storing a
> sequence of opaque 16-bit numbers for later retrieval, I think it's worth
> raising this concern specially, and I would like precise behavior specified
> before this proceeds to a finalized state.

Yes. I don't see how we could handle this in WebIDL, other than by
defining that all DOMStrings must be structurally correct UTF-16.
However, that would be prohibitively expensive, since we would have to
add such checks in many, many places.
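
To make the kind of check concrete, here is a rough sketch in Python
of validating a sequence of 16-bit code units. It is only an
illustration: the function name and the use of a plain list of
integers to stand in for a DOMString are assumptions made for this
example, not anything taken from WebIDL or the Web Socket drafts.

```python
def find_utf16_structural_error(units):
    """Return the index of the first ill-formed code unit, or -1 if none.

    `units` is a sequence of 16-bit integers, a hypothetical stand-in
    for the contents of a DOMString.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: must be followed directly by a low surrogate.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return i  # lone high surrogate
        if 0xDC00 <= u <= 0xDFFF:
            # Low surrogate not consumed as part of a pair:
            # either lone or in a backwards pair.
            return i
        i += 1
    return -1
```

A scan like this is linear in the length of the string, so a single
call is cheap; the concern is having to run it at every API entry
point that accepts a DOMString.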

/ Jonas

Received on Thursday, 4 June 2009 07:37:48 UTC