Re: [Json] BOMs

On Tue, Nov 19, 2013 at 4:31 AM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

> * Tatu Saloranta wrote:
> >Dominant Java implementations support UTF-16 with BOM; either directly or
> >through Java's Reader implementations that handle BOMs.
> >String concatenation case seems irrelevant, since BOMs are not included in
> >in-memory representation anyway, as opposed to byte stream serialization.
>
> HTTP implementations cannot correctly determine whether an entity body
> is text in a single character encoding and if so what that encoding is,
> accordingly the dominant API deals in byte[] arrays, not text Strings;
> furthermore, many programming languages default to byte[] arrays for
> string literals. That often combines into forms of
>
>   byte[] json = sprintf('{"x": %s, "y": %s}', GET(...), GET(...));
>
> which works fine if all three byte[] arrays are UTF-8 encoded and use
> no Unicode signature, which is the case 99% of the time.
>

My point was just that although many scripting languages may not handle
BOMs properly, the same is not true on all platforms. Proper JSON APIs on
the JVM accept both String and byte[] input; byte[] is preferred since it
is more efficient, and encoding auto-detection on it is reliable, assuming
that -- as per the JSON specification -- the only single-byte encoding used
is UTF-8.
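
To make the auto-detection point concrete, here is a rough sketch of how a
byte[]-based parser could pick the encoding before decoding. It is not the
code of any particular library; the class and method names
(JsonEncodingSniffer, detect, decode) are made up for illustration. It
checks for a BOM first and otherwise falls back to the zero-byte pattern
heuristic from RFC 4627 section 3, which assumes the first character of the
document is ASCII. The UTF-32 charsets are looked up by name and are assumed
to be available on the JVM in use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: guesses the Unicode encoding of raw JSON bytes. */
public final class JsonEncodingSniffer {
    private JsonEncodingSniffer() {}

    /** Result of sniffing: the charset to use and how many BOM bytes to skip. */
    public static final class Detected {
        public final Charset charset;
        public final int bomLength;

        Detected(Charset charset, int bomLength) {
            this.charset = charset;
            this.bomLength = bomLength;
        }
    }

    public static Detected detect(byte[] in) {
        int b0 = at(in, 0), b1 = at(in, 1), b2 = at(in, 2), b3 = at(in, 3);

        // 1. Explicit BOM. The 4-byte UTF-32 marks are checked before the
        //    2-byte UTF-16 marks, since FF FE is a prefix of FF FE 00 00.
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) {
            return new Detected(StandardCharsets.UTF_8, 3);
        }
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0xFE && b3 == 0xFF) {
            return new Detected(Charset.forName("UTF-32BE"), 4);
        }
        if (b0 == 0xFF && b1 == 0xFE && b2 == 0x00 && b3 == 0x00) {
            return new Detected(Charset.forName("UTF-32LE"), 4);
        }
        if (b0 == 0xFE && b1 == 0xFF) {
            return new Detected(StandardCharsets.UTF_16BE, 2);
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return new Detected(StandardCharsets.UTF_16LE, 2);
        }

        // 2. No BOM: look at where the zero bytes fall (RFC 4627, section 3).
        if (in.length >= 4) {
            if (b0 == 0 && b1 == 0 && b2 == 0) {
                return new Detected(Charset.forName("UTF-32BE"), 0);
            }
            if (b1 == 0 && b2 == 0 && b3 == 0) {
                return new Detected(Charset.forName("UTF-32LE"), 0);
            }
        }
        if (in.length >= 2) {
            if (b0 == 0 && b1 != 0) {
                return new Detected(StandardCharsets.UTF_16BE, 0);
            }
            if (b0 != 0 && b1 == 0) {
                return new Detected(StandardCharsets.UTF_16LE, 0);
            }
        }

        // 3. Anything else is treated as UTF-8, the only 8-bit encoding allowed.
        return new Detected(StandardCharsets.UTF_8, 0);
    }

    /** Decodes the input to a String, skipping the BOM if one was found. */
    public static String decode(byte[] in) {
        Detected d = detect(in);
        return new String(in, d.bomLength, in.length - d.bomLength, d.charset);
    }

    /** Unsigned byte at index i, or -1 when the input is shorter than that. */
    private static int at(byte[] in, int i) {
        return i < in.length ? in[i] & 0xFF : -1;
    }
}
```

With something like this in front of the parser, the BOM never reaches the
in-memory String, which is why the concatenation problem only shows up when
raw byte[] arrays are spliced together without decoding.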

-+ Tatu +-

Received on Tuesday, 19 November 2013 18:31:16 UTC