Re: JSON: characters below \u0020 from Steven Pemberton on 2016-11-30 (public-xformsusers@w3.org from November 2016)

From: Steven Pemberton <steven.pemberton@cwi.nl>
Date: Wed, 30 Nov 2016 09:36:26 +0100
To: "public-xformsusers@w3.org" <public-xformsusers@w3.org>, "Erik Bruchez" <erik@bruchez.org>
Message-ID: <op.yrqfi0c4smjzpq@steven-aspire-s7>

You're right, it's not clear. When I wrote this, I meant 9, A and D to be  
included; the phrase "characters and escapes that have no equivalent XML  
character" was intended to mean that you don't have to specially encode  
9, A and D.

I propose changing it to read "and most characters of the form \uxxxx less  
than \u0020". Does that cover it sufficiently? Or should we list 9, A and  
D explicitly?

Steven

On Tue, 29 Nov 2016 20:22:49 +0100, Erik Bruchez <erik@bruchez.org> wrote:

> All,
>
> Currently, the spec says:
>
>    "characters and escapes that have no equivalent XML character (\b,  
> \f, and characters of the form \uxxxx less than \u0020) are transformed  
> by adding \uE000 to them."
>
> The sentence contradicts itself because in XML, the following characters  
> below \u0020 are supported:
>
> - \u0009
> - \u000A
> - \u000D
>
> So we should clarify this, and I suggest that we allow keeping the 3  
> characters above. Consider this piece of JSON:
>
>    {
>      "firstName": "John",
>      "lastName": "Smith",
>      "address": "1000 Main Street\nNew York, NY"
>    }
>
> The `\n` in "address" translates to a newline `\u000A`. If we translate  
> it to `\uE00A`, it becomes unnecessarily inconvenient to handle the  
> newline on the XML side.
>
> Conversely, when converting back from XML to JSON, a `\u000A` in the XML  
> must translate into `\n` in the resulting JSON.
>
> For reference this was raised by a user. Details here:
>
>    https://github.com/orbeon/orbeon-forms/issues/3012
>
> Feedback welcome.
>
> -Erik

Received on Wednesday, 30 November 2016 08:37:10 UTC