- From: Mark Callow <callow_mark@hicorp.co.jp>
- Date: Thu, 22 Mar 2012 11:12:25 +0900
On 22/03/2012 04:42, Anne van Kesteren wrote:
> ...
>
> As for the API, how about:
>
> enc = new Encoder("euc-kr")
> string1 = enc.encode(bytes1)
> string2 = enc.encode(bytes2)
> string3 = enc.eof() // might return empty string if all is fine
>
> And similarly you would have
>
> dec = new Decoder("shift_jis")
> bytes = dec.decode(string)
>
> Or alternatively you could have a single object that exposes both
> encode() and decode() and tracks state for both:
>
> enc = new Encoding("gb18030")
> bytes1 = enc.decode(string1)
> string2 = enc.encode(bytes2)

This has encode and decode reversed from my understanding. I regard the string (wide-char) as the canonical form and the bytes as the encoded form. This view is reflected in the widely used terminology "charset encodings" which refers to the likes of euc-kr and shift_jis.

Regards

-Mark
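
For illustration only, here is a minimal sketch of the direction argued for above: encoding takes a string and yields bytes, and decoding takes bytes and yields a string. It assumes a runtime that provides the `TextEncoder` and `TextDecoder` globals, which are not the `Encoder`/`Decoder`/`Encoding` objects proposed in the quoted message. UTF-8 is used as a stand-in because `TextEncoder` only produces UTF-8, so euc-kr and shift_jis are not shown.

```javascript
// Sketch under assumptions: TextEncoder/TextDecoder stand in for the
// Encoder/Decoder objects discussed in this thread; UTF-8 replaces euc-kr.
const text = "한국어";                                      // canonical form: a string

// encode: string -> bytes (a Uint8Array holding the encoded form)
const bytes = new TextEncoder().encode(text);

// decode: bytes -> string (back to the canonical form)
const roundTrip = new TextDecoder("utf-8").decode(bytes);

console.log(bytes.length);       // 9 (three Hangul syllables, 3 bytes each in UTF-8)
console.log(roundTrip === text); // true
```

With this naming, "encoded" always describes the byte sequence, which matches the usual sense of "charset encoding".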
Received on Wednesday, 21 March 2012 19:12:25 UTC