- From: Brendan Eich <brendan@mozilla.com>
- Date: Mon, 20 Feb 2012 12:32:38 -0800
- To: Allen Wirfs-Brock <allen@wirfs-brock.com>
- CC: Gavin Barraclough <barraclough@apple.com>, public-script-coord@w3.org, Anne van Kesteren <annevk@opera.com>, mranney@voxer.com, es-discuss discussion <es-discuss@mozilla.org>
Allen Wirfs-Brock wrote:
>
> On Feb 20, 2012, at 10:52 AM, Brendan Eich wrote:
>
>> Allen Wirfs-Brock wrote:
>> ...
>>> Another way to express what I see as the problem with what you are
>>> proposing about imposing such string semantics:
>>>
>>> Could the revised ECMAScript be used to implement a language that
>>> had similar but not identical semantic rules to those you are
>>> suggesting for ES strings. My sense is that if we went down the path
>>> you are suggesting, such an implementation would have to use binary
>>> data arrays for all of its internal string processing and could not
>>> use ES string functions to process them.
>>
>> If you mean a metacircular evaluator, I don't think so. Can you show
>> a counterexample?
>>
>> If you mean a UTF-transcoder, then yes: binary data / typed arrays
>> are required. That's the right answer.
>
> Not necessarily, metacircular...it could be support for any language
> that imposes different semantic rules on string elements.

In that case, binary data / typed arrays, definitely.

> You are essentially saying that a compiler targeting ES for a language
> X that includes a string data type that does not conform to your
> rules (for example, by allowing occurrences of surrogate code points
> within string data)

First, as a point of order: yes, JS strings as full Unicode do not want stray surrogate pair-halves. Does anyone disagree?

Second, binary data / typed arrays stand ready for any such not-full-Unicode use-cases.

> could not use ES strings as the target representation of its string
> data type. It also could not use the built-in ES string functions in
> the implementation of language X's built-in functions.

Not if this hypothetical source language being compiled to JS wants other than full Unicode, no. Why is this a problem, even hypothetically? Such a use-case has binary data and typed arrays standing ready, and if it really could use String.prototype.* methods I would be greatly surprised.
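For readers following along, here is a minimal sketch of the distinction being argued. It is not from the thread: `hasLoneSurrogate` and `guestUnits` are hypothetical names chosen for illustration, and the code assumes today's JS strings (sequences of 16-bit code units). It shows how a stray surrogate pair-half can be detected in a string, and how a guest language that allows arbitrary 16-bit elements could keep them in a typed array instead.

```javascript
// Hypothetical helper (not part of any proposal in this thread):
// returns true if the string contains a surrogate code unit that is
// not part of a well-formed high/low pair.
function hasLoneSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // A high surrogate must be immediately followed by a low surrogate.
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low half of a well-formed pair
        continue;
      }
      return true;
    }
    if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return true; // a low surrogate with no preceding high surrogate
    }
  }
  return false;
}

// A guest language whose strings may hold any 16-bit value can store
// its elements in a typed array, which places no Unicode constraints
// on the contents.
const guestUnits = new Uint16Array([0x0041, 0xD800, 0x0042]);
```

Under these assumptions, a well-formed pair such as "\uD83D\uDE00" passes the check, while "\uD800" on its own does not; the typed array holds the same stray value without complaint.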
> It could not leverage any optimizations that an ES engine may apply to
> strings and string functions.

Emscripten already compiles LLVM source languages (C, C++, and Objective-C at least) to JS and does a very good job (getting better day by day). The utility of string functions today (including uint16 indexing and length) is immaterial. Typed arrays are quite important, though.

> Also, values of X's string type can not be directly passed in foreign
> calls to ES functions. Etc.

Emscripten does have a runtime that maps browser functionality exposed to JS to the guest language. It does not AFAIK need to encode surrogate pairs in JS strings by hand, let alone make pair-halves.

/be
Received on Monday, 20 February 2012 20:33:05 UTC