- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Wed, 11 Jan 2012 19:22:56 -0500
- To: public-webapps@w3.org
On 1/11/12 6:03 PM, Charles Pritchard wrote:
> Is there any instance in practice where DOMString as exposed to the
> scripting environment is not implemented as a unicode string?

I don't know what you mean by that.

The point is, it's trivial to construct JS strings that contain arbitrary sequences of 16-bit units (using fromCharCode or \u escapes). Nothing anywhere in JS or the DOM per se enforces that strings are valid UTF-16 (which is the way an actual Unicode string would be encoded as a JS string).

> I realize that internally, DOMString may be implemented as a 16 bit
> integer + length;

Not just internally. The JS spec and the DOM spec both explicitly say that this is what strings are: an array of 16-bit integers.

> Browsers do the same thing with WindowBase64, though it's specified as
> DOMString, in practice (as the notes say), it's unicode.
> http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob

If you look at the actual processing model, you take the input array of 16-bit integers, throw if any is not in the set { 0x2B, 0x2F, 0x3D } union [0x30,0x39] union [0x41,0x5A] union [0x61,0x7A], and then treat the rest as ASCII data (which at that point it is). The spec describes this in terms of "Unicode", but that's just because any JS string that satisfies the above constraint can be considered a "Unicode" string if one wishes.

> Web Storage, also, only works with unicode.

I'm not familiar with the relevant part of Web Storage. Can you cite the relevant part, please?

-Boris
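As an illustration of the point about arbitrary 16-bit units, here is a minimal sketch in plain JavaScript. It is not from the thread; the use of document.title is just one arbitrary DOMString-taking API chosen for the example.

```js
// JS strings are sequences of 16-bit units, so it is easy to build one that
// is not valid UTF-16.
const lone = String.fromCharCode(0xD800);      // unpaired high surrogate
const alsoLone = "\uDC00";                     // unpaired low surrogate via a \u escape

console.log(lone.length);                      // 1 -- a single 16-bit unit
console.log(lone.charCodeAt(0).toString(16));  // "d800"

// Nothing in JS or the DOM rejects these; any API taking a DOMString accepts them.
document.title = lone + alsoLone;              // a reversed surrogate pair, still accepted
```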
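And a rough sketch of the character check described for atob() above. The function name checkAtobInput is made up for illustration, and the real spec algorithm has further details (padding position, the exact exception object) that this ignores; only the allowed-character set matches the description above.

```js
// Hypothetical helper: reject any input 16-bit unit outside the base64
// alphabet before the string is treated as ASCII data.
function checkAtobInput(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    const ok =
      c === 0x2B || c === 0x2F || c === 0x3D ||  // '+', '/', '='
      (c >= 0x30 && c <= 0x39) ||                // '0'-'9'
      (c >= 0x41 && c <= 0x5A) ||                // 'A'-'Z'
      (c >= 0x61 && c <= 0x7A);                  // 'a'-'z'
    // The spec calls for an InvalidCharacterError here.
    if (!ok) throw new Error("InvalidCharacterError");
  }
  // Every unit is now in the ASCII range, so the string can be treated as
  // ASCII data and base64-decoded.
}
```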
Received on Thursday, 12 January 2012 00:23:29 UTC