- From: Poul-Henning Kamp <phk@phk.freebsd.dk>
- Date: Sun, 28 May 2023 07:28:11 +0000
- To: Martin J. Dürst <duerst@it.aoyama.ac.jp>
- cc: Mark Nottingham <mnot@mnot.net>, Roy Fielding <fielding@gbiv.com>, Tommy Pauly <tpauly@apple.com>, HTTP Working Group <ietf-http-wg@w3.org>
--------
Martin J. Dürst writes:

Adding base64 encoding to the table:

>                             Legacy  UTF-8  proposed  expansion  base64  b64expansion
> ASCII                       1       1      1         1          1.33    1.33
> Latin+Accents, e.g. Polish  1       ~1.5   ~2        2          2       2
> Arabic/Cyrillic/...         1       2      6         6          2.66    2.66
> Indic scripts,...           1       3      9         9          4       4
> Chinese/Japanese/...        2       3      9         4.5        4       2
>
> So some text in an Indic or South Asian Script gets expanded by a factor
> of 9 when compared to a legacy singlebyte encoding.

Base64 does not penalize non-western languages nearly as much.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
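
For anyone who wants to check the per-character figures in the table, here is a rough Python sketch. It assumes the "proposed" column means percent-encoding of the UTF-8 bytes (three characters per non-ASCII byte), which matches the numbers but is not stated in this message. The helper name `sizes`, the sample strings, and the choice of KOI8-R and Shift_JIS as the "legacy" encodings are illustrative picks only.

```python
import base64
from urllib.parse import quote


def sizes(text, legacy_codec):
    """Return (legacy, utf8, percent, b64) sizes in bytes for text.

    'percent' is an assumed reading of the "proposed" column:
    percent-encoding of the UTF-8 bytes.
    """
    utf8 = text.encode("utf-8")
    legacy = text.encode(legacy_codec)
    percent = quote(text, safe="")      # %XX for every non-unreserved byte
    b64 = base64.b64encode(utf8)        # 4 output bytes per 3 input bytes
    return len(legacy), len(utf8), len(percent), len(b64)


# Six characters each, so the UTF-8 length is a multiple of 3 and
# base64 padding does not distort the ratios.
samples = {
    "ASCII": ("hello1", "ascii"),
    "Cyrillic": ("привет", "koi8_r"),
    "Japanese": ("日本語の例文", "shift_jis"),
}

for name, (text, codec) in samples.items():
    legacy, utf8, percent, b64 = sizes(text, codec)
    print(f"{name:9} legacy={legacy:2} utf8={utf8:2} "
          f"percent={percent:2} (x{percent / legacy:.2f}) "
          f"base64={b64:2} (x{b64 / legacy:.2f})")
```

On these samples the ratios to the legacy size come out as in the table: about 1.33 for ASCII under base64, 6 versus 2.67 for Cyrillic, and 4.5 versus 2 for Japanese. Real text mixes ASCII with other characters, so actual ratios will vary.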
Received on Sunday, 28 May 2023 07:28:24 UTC