Re: [whatwg/encoding] Allow conformant implementations to support non-UTF-8 encodings (#144)

Two points to consider:

1) UTF-8 is registered as an alternate encoding model that supports the standard return to the other ISO-IR encoding model charsets. So it appears Unicode already requires that these ALL still be supported, both for decoding content sourced from web servers and for encoding when a browser saves decoded text extracted from that content locally. In China, encoding to GB-18030 must also be supported for such saves. While a variety of C0 sets can be mapped into the 0x00 to 0x1F code space, all the registered ones support ESC, as the current edition of ISO-2022 requires, and that includes the standard return sequence.
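To make the "standard return" in point 1 concrete, here is a minimal sketch, assuming the ISO-IR 196 registration: ESC % G switches from ISO-2022 to UTF-8, and ESC % @ returns. The function name `split_segments` and the mode labels are my own, purely for illustration; a real decoder would also have to track designated character sets and shift states inside the ISO-2022 segments.

```python
# Sketch only: split a byte stream on the ISO-2022 "other coding system" escapes.
ESC_TO_UTF8 = b"\x1b%G"  # ESC 2/5 4/7: switch to UTF-8 (ISO-IR 196, with standard return)
ESC_RETURN = b"\x1b%@"   # ESC 2/5 4/0: standard return to ISO-2022


def split_segments(data: bytes):
    """Return (mode, payload) pairs; mode is 'iso2022' or 'utf8' (hypothetical labels)."""
    segments = []
    mode = "iso2022"  # assume the stream starts in ISO-2022 mode
    pos = 0
    while pos < len(data):
        # Only the escape that leaves the current mode is meaningful here.
        marker = ESC_TO_UTF8 if mode == "iso2022" else ESC_RETURN
        end = data.find(marker, pos)
        if end == -1:
            end = len(data)  # no further switch: the rest belongs to this mode
        if end > pos:
            segments.append((mode, data[pos:end]))
        if end == len(data):
            break
        mode = "utf8" if mode == "iso2022" else "iso2022"
        pos = end + len(marker)
    return segments


sample = b"abc" + ESC_TO_UTF8 + "é".encode("utf-8") + ESC_RETURN + b"def"
for mode, payload in split_segments(sample):
    print(mode, payload)
```

The point is only that a stream can leave ISO-2022 for UTF-8 and come back, which is why the other registered sets stay reachable.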

2) At the time UCS-4 replaced UCS-2 as the encoding model for Unicode, it was known that this range might also prove inadequate for future purposes, at which point Unicode would be obsolete. It was hoped this inadequacy would become evident later rather than sooner, so the change went ahead anyway. However, some recent websites, some with millions of users, have provided facilities for creating, and using in text, what are logically user-defined emoji or CJK Han-type glyph code points, with the net result that the time is NOW. Sure, it will twitch about for a few more years in its death throes, but it is terminal. Eventually people will start proposing these as extensions to their script's lexicon, and if Unicode doesn't assign them a code point for interchange, some other registry will be created that will, and it will probably subsume all of what Unicode has done so far.

So, to me, betting all your money on Unicode being the encoding of choice for future web browsers is a non-starter. For similar reasons GB-18030 is not in the race either. ISO-2022 has its own forms of myopia, but range isn't one of them, so it's still in the running: a 4-byte 94^n encoding has 70+ times the range of code points UCS-4 supports (94^4 = 78,074,896 versus 1,114,112). There may be a UCS-6 or UCS-8, but I wouldn't expect these to be backwards compatible. Other candidates still under development may also emerge from the bowels of Adobe's or IBM's development labs, or some other company's.
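The 70+ figure can be checked with a few lines of arithmetic. This is a rough sketch, assuming the comparison is against Unicode's current codespace of U+0000 to U+10FFFF; the names below are mine. Measured against the original 31-bit UCS-4 space instead, the ratio comes out well under 1, so the comparison depends on which limit you take.

```python
# Sketch only: compare the size of a 94^n graphic character set with Unicode's codespace.
UNICODE_CODE_POINTS = 0x10FFFF + 1  # 1,114,112 code points (17 planes)
UCS4_31BIT_SPACE = 2 ** 31          # the original 31-bit UCS-4 space, for contrast


def iso2022_94n_size(n: int) -> int:
    """Number of positions in an n-byte 94^n set (94 graphic values per byte)."""
    return 94 ** n


four_byte = iso2022_94n_size(4)
ratio = four_byte / UNICODE_CODE_POINTS
print(four_byte)       # 78074896
print(f"{ratio:.1f}")  # 70.1
print(four_byte < UCS4_31BIT_SPACE)  # True
```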

-- 
View it on GitHub:
https://github.com/whatwg/encoding/issues/144#issuecomment-399118966

Received on Thursday, 21 June 2018 14:16:22 UTC