- From: Joshua Bell <jsbell@chromium.org>
- Date: Mon, 13 Aug 2012 09:08:24 -0700
- To: Jonas Sicking <jonas@sicking.cc>
- Cc: whatwg@lists.whatwg.org
Sorry if this is a dupe; I replied to this from my phone and an incorrect address, and my earlier reply isn't showing in the archives.

On Fri, Aug 10, 2012 at 9:16 PM, Jonas Sicking <jonas@sicking.cc> wrote:

> The spec now contains the following text:
>
> "NOTE: Because only UTF encodings are supported, and because of the
> algorithm used to convert a DOMString to a sequence of Unicode
> characters, no input can cause the encoding process to emit an encoder
> error."
>
> This is not correct. A DOMString is not a sequence of Unicode
> characters, it's a UTF16 encoded string (this is per EcmaScript). Thus
> it can contain unpaired surrogates and so the encoding process can
> result in encoder errors.
>
> As I've suggested earlier, I think we should deal with this by simply
> emitting Unicode replacement characters for these encoder errors (i.e.
> for unpaired surrogates).

Already accounted for. Note the phrase:

> and because of the algorithm used to convert a DOMString to a sequence of
> Unicode characters

This refers to the normative text that generates a sequence of Unicode code points from a DOMString by reference to the algorithm in WebIDL [1], which handles unpaired surrogates etc.

This informative text should say "Unicode code points" rather than "Unicode characters", though. I'm fixing that now, and have added a reference to [1] in the note itself.

[1] http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode
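For readers who want to see why the note holds, here is a rough sketch of that conversion as I read it. It is not the normative text: the function names `obtain_unicode` and `encode_utf8` are made up for illustration, and the DOMString is modelled as a plain list of 16-bit code units. Valid surrogate pairs are combined, and every unpaired surrogate becomes U+FFFD, so the encoder never sees a code point it cannot encode.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def obtain_unicode(code_units):
    """Convert 16-bit code units to Unicode code points (illustrative sketch)."""
    out = []
    i, n = 0, len(code_units)
    while i < n:
        c = code_units[i]
        if c < 0xD800 or c > 0xDFFF:
            # Not a surrogate: the code unit is the code point.
            out.append(c)
        elif c >= 0xDC00:
            # Low surrogate with no preceding high surrogate.
            out.append(REPLACEMENT)
        elif i + 1 < n and 0xDC00 <= code_units[i + 1] <= 0xDFFF:
            # High surrogate followed by a low surrogate: combine the pair.
            a = c & 0x3FF
            b = code_units[i + 1] & 0x3FF
            out.append(0x10000 + (a << 10) + b)
            i += 1
        else:
            # High surrogate at end of input or not followed by a low one.
            out.append(REPLACEMENT)
        i += 1
    return out


def encode_utf8(code_units):
    """Encode to UTF-8; cannot fail because no surrogates remain."""
    return "".join(chr(cp) for cp in obtain_unicode(code_units)).encode("utf-8")
```

With this in place, a lone surrogate such as `[0xD800]` encodes as the three UTF-8 bytes of U+FFFD rather than raising an error, which is the behaviour Jonas suggested.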
Received on Monday, 13 August 2012 16:08:55 UTC