- From: John Boyer <boyerj@ca.ibm.com>
- Date: Sun, 5 Jun 2011 15:46:45 -0700
- To: "Forms WG" <public-forms@w3.org>
- Message-ID: <OFD2FA27F3.56D8B266-ON882578A4.00744D64-882578A6.007D20FF@ca.ibm.com>
Hi everyone,
It would help to get some urgency and focus on discussing this on the list
so we can get some decisions on our next call. Although this isn't really
"my" issue, there seems to be a sudden spike in interest in implementing
conversions and getting the word out on "how it should be done", and so
I've gotten involved just to make sure our interests are represented...
but that means we have to have a solid, clear, complete solution...
I presented our approach to an internal IBM task force, amended by the
many issues I raised in the thread below, and a few more issues popped up
based on their feedback.
10) We need to be able to handle json rooted by an array, not just an
object, e.g. [ 10, 20 ].
11) We need to be able to handle anonymously named objects and arrays at
any level, not just the root level, e.g. [[1,2], [3,4]]
Here's what this could look like:
<data name="" type="array">
<data name="" array="true" type="array">
<data array="true" name="" type="number">1</data>
<data array="true" name="" type="number">2</data>
</data>
<data name="" array="true" type="array">
<data array="true" name="" type="number">3</data>
<data array="true" name="" type="number">4</data>
</data>
</data>
12) We need to be able to distinguish various kinds of emptiness, not just
null, and we need to be able to distinguish the emptiness indication from
the type. For example, you can have emptiness meaning null (which we
currently suggest representing with type=null), but how do we represent
emptiness meaning empty object or empty array?
{ a:[] } or { a: {}}
Seems we need something more like empty="null | object | array | string |
number | boolean" to explain what emptiness means if the element is empty.
Then, it would be possible to subsequently assign a type to indicate what
it should become if it becomes non-empty. Some XML applications will have
a secondary source of information that describes this, and we don't want a
type assignment (e.g. type="null") getting in the way.
13) It could be worthwhile to use a hex-encoding escape mechanism for
illegal chars in names. Maybe use minus instead of underscore as the
separator, because minus is used less frequently in names. Then you'll
have to escape the escaping character itself. The benefit is that if you
have json names that are similar except for some illegal characters, the
tag names will still be unique, even though this is not strictly necessary
because the real name is preserved in a name attribute.
14) What about those characters that are illegal in XML? Can we devise an
escaping mechanism to preserve them in names and content?
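For point 13, one possible reading of the hex escape can be sketched in
Python. Everything here is an assumption made for illustration: the
function names encode_name and decode_name, the -HEX- form of the escape,
the ASCII-only set of characters left unescaped, and the leading
underscore used when the result would not otherwise start with a letter.

```python
import re

ESC = "-"  # assumed escape delimiter, per the "minus instead of underscore" idea


def encode_name(name):
    """Turn an arbitrary JSON name into an ASCII tag name (sketch only)."""
    out = []
    for i, ch in enumerate(name):
        if i == 0:
            # Only a plain letter is kept unescaped in first position.
            legal = re.match(r"[A-Za-z]", ch) is not None
        else:
            # The minus is deliberately absent, so it gets escaped too.
            legal = re.match(r"[A-Za-z0-9_.]", ch) is not None
        out.append(ch if legal else f"{ESC}{ord(ch):X}{ESC}")
    encoded = "".join(out)
    if not re.match(r"[A-Za-z]", encoded):
        # A tag name cannot start with "-", so add a marker underscore.
        encoded = "_" + encoded
    return encoded


def decode_name(tag):
    """Reverse encode_name."""
    if tag.startswith("_"):
        tag = tag[1:]  # a real leading underscore was escaped as -5F-
    return re.sub(r"-([0-9A-F]+)-", lambda m: chr(int(m.group(1), 16)), tag)
```

With this scheme, names that differ only in an illegal character, such as
"a b" and "a/b", get different tag names, which is the uniqueness property
mentioned in point 13.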
==================================
Here are some key points that are different than our current wiki content:
i) Use type="object|array|number|boolean|string"
ii) Use empty="null|object|array|number|boolean|string" with string the
default to help indicate the meaning of empty content for an element
iii) Use <data name=""> for anonymous (unnamed) values anywhere, including
root
iv) Use non-empty name attribute to record real names for json names that
don't match NCName
v) Use non-empty name attribute generally to indicate quotes should appear
around the json name
vi) Use an attribute to mark each array element, not just the start, so it
will be clear for any element whether it is part of an array.
and here's what some more examples would translate to based on these:
{a:[]} becomes
<data name="" type="object">
<a array="true" empty="array"></a>
</data>
{a:[""]} becomes
<data name="" type="object">
<a array="true"></a>
</data>
{ a : [[1,2],[3,4]] } becomes
<data name="" type="object">
<a array="true" type="array">
<data array="true" name="" type="number">1</data>
<data array="true" name="" type="number">2</data>
</a>
<a array="true" type="array">
<data array="true" name="" type="number">3</data>
<data array="true" name="" type="number">4</data>
</a>
</data>
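To sanity-check key points i) to vi) against the three examples above,
here is a rough Python sketch of the JSON => XML direction. It is only one
reading of the proposal, not an agreed algorithm: the function names
(fill, json_to_xml), the ASCII-only approximation of NCName, and the
underscore substitution for illegal name characters are assumptions, and
the behaviour for cases the examples do not cover (e.g. an empty array
nested inside an array) is guesswork. Python's json module only accepts
quoted names, so point v) cannot be exercised here.

```python
import json
import re
import xml.etree.ElementTree as ET

# Rough ASCII-only approximation of NCName; an assumption for this sketch.
_NAME_OK = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*\Z")


def _child(parent, key, in_array=False):
    """Create a child for a JSON member name, or an anonymous <data name="">."""
    if key is None:
        elem = ET.SubElement(parent, "data", {"name": ""})
    elif _NAME_OK.match(key):
        elem = ET.SubElement(parent, key)
    else:
        # Illegal characters become underscores; the real name is preserved.
        safe = re.sub(r"[^A-Za-z0-9_.-]", "_", key)
        if not re.match(r"[A-Za-z_]", safe):
            safe = "_" + safe
        elem = ET.SubElement(parent, safe, {"name": key})
    if in_array:
        elem.set("array", "true")  # point vi: mark every array element
    return elem


def fill(elem, value):
    """Set type/empty attributes and content on elem for a JSON value."""
    if isinstance(value, dict):
        elem.set("type", "object")
        if not value:
            elem.set("empty", "object")
        for key, member in value.items():
            if isinstance(member, list):
                if not member:
                    _child(elem, key, in_array=True).set("empty", "array")
                for item in member:
                    fill(_child(elem, key, in_array=True), item)
            else:
                fill(_child(elem, key), member)
    elif isinstance(value, list):
        elem.set("type", "array")
        if not value:
            elem.set("empty", "array")
        for item in value:
            fill(_child(elem, None, in_array=True), item)
    elif value is None:
        elem.set("empty", "null")
    elif isinstance(value, bool):  # checked before numbers: bool is an int
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif isinstance(value, (int, float)):
        elem.set("type", "number")
        elem.text = json.dumps(value)
    else:
        elem.text = value  # string is the default type and default emptiness


def json_to_xml(text):
    """Convert JSON text to an element tree rooted at <data name="">."""
    root = ET.Element("data", {"name": ""})
    fill(root, json.loads(text))
    return root
```

Calling ET.tostring(json_to_xml('{"a": [[1,2],[3,4]]}'), encoding="unicode")
should give the same element structure as the third example, although the
attribute order may differ.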
John M. Boyer, Ph.D.
Distinguished Engineer, IBM Forms and Smarter Web Applications
IBM Canada Software Lab, Victoria
E-Mail: boyerj@ca.ibm.com
Blog: http://www.ibm.com/developerworks/blogs/page/JohnBoyer
Blog RSS feed:
http://www.ibm.com/developerworks/blogs/rss/JohnBoyer?flavor=rssdw
From: John Boyer/CanWest/IBM@IBMCA
To: John Boyer/CanWest/IBM@IBMCA
Cc: "Nick Van den Bleeken" <Nick.Van.den.Bleeken@inventivegroup.com>,
"Forms WG" <public-forms@w3.org>, "Steven Pemberton"
<Steven.Pemberton@cwi.nl>
Date: 06/03/2011 12:25 AM
Subject: Re: JSON Instances
Sent by: public-forms-request@w3.org
P.P.S.
9) I do not think you can transliterate \b and \f because their codepoints
are 0x08 and 0x0C, which are not allowed by XML Char.
More generally, it sounds like all code points between 00 and 1F are out
of bounds except 09, 0A and 0D.
Does anyone know if JSON allows 00?
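The Char production from XML 1.0 can be checked mechanically. The short
Python sketch below (the helper name is_xml_char is made up for
illustration) lists which code points below 0x20 are allowed, and also
shows that at least Python's json parser accepts the \u0000 escape inside
a string, so the 00 case does seem to need an answer.

```python
import json


def is_xml_char(cp):
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Control characters below 0x20 that XML 1.0 allows.
allowed_controls = [cp for cp in range(0x20) if is_xml_char(cp)]

# A JSON string holding \b, \f and an escaped U+0000.
parsed = json.loads('"\\b\\f\\u0000"')

# Characters from that string that cannot appear in XML 1.0 content.
illegal = [hex(ord(ch)) for ch in parsed if not is_xml_char(ord(ch))]
```

Here allowed_controls contains only 0x09, 0x0A and 0x0D, and all three
characters of the parsed string end up in illegal.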
John M. Boyer, Ph.D.
From: John Boyer/CanWest/IBM@IBMCA
To: John Boyer/CanWest/IBM@IBMCA
Cc: "Nick Van den Bleeken"
<Nick.Van.den.Bleeken@inventivegroup.com>, "Forms WG"
<public-forms@w3.org>, "Steven Pemberton" <Steven.Pemberton@cwi.nl>
Date: 06/02/2011 09:27 PM
Subject: Re: JSON Instances
Sent by: public-forms-request@w3.org
P.S.
7) Why should the root element for the XML corresponding to the JSON data
be <json>? Why not <data>?
8) The second bullet point that contains the encoding instructions for the
variable "name" should be further decomposed into a second level bullet
point list.
Thanks,
JB
From: John Boyer/CanWest/IBM@IBMCA
To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
Cc: "Nick Van den Bleeken"
<Nick.Van.den.Bleeken@inventivegroup.com>, "Forms WG"
<public-forms@w3.org>
Date: 06/02/2011 04:08 PM
Subject: Re: JSON Instances
Sent by: public-forms-request@w3.org
Even if the problem were only about numbers and booleans, using a bind
would make it a lot harder to roundtrip back to json, compared with
decorating the data itself.
I also agree that the array case pretty much quashes the idea.
More generally, it would be preferable if our JSON => XML => back to JSON
conversion strategy didn't rely on anything outside of XML. If our
conversion relied on another XForms construct, like bind, then people
outside of XForms could not reuse it.
I have several other questions and suggestions related to round-tripping
the JSON:
1) Add some way to tell whether the name part of the JSON should have
quote marks. Right now it is clear that you need quote marks if the name
includes non-NMCHAR characters. In this case, you get a name attribute on
the XML tag. So maybe a way to always signal use of quote marks is to put
a name attribute, like this:
{"size": 50} ==> <json><size name="size" type="number">50</size></json>
==> {"size": 50}
The converter becomes really easy too. Use name attr if given and put the
attr value in quotes, otherwise use the element name, not in quotes.
Finally, I think if you take this approach, then there would not be much
point in debating whether we should use something better than just
underscores for the illegal chars, right?
2) I think type="null" is a bit underpowered. I think you really mean
type="object" because you're just trying to distinguish that the empty
content means null rather than the string "".
By the way, I recommend against using xsi:nil because it has to correspond
to something being nillable="true" in a schema, and it must be manually
changed to xsi:nil="false" if the element becomes non-empty.
3) You ask whether the type attr should be replaced with xsi:type. I'd
recommend against. It seems better to separate the issue of converting
JSON => XML from the issue of improving the XForms processing of the
resultant XML. It would always be possible for an XForms author to add an
XForms bind whose nodeset uses an xpath predicate to select nodes with a
particular type assignment and then assign a type MIP to those nodes to
attach a particular schema datatype, e.g.
<xf:bind nodeset="/descendant::*[@type='number']" type="xsd:double"/>
By the way, it does look like javascript number and xsd:double use the
same 64-bit IEEE definition, so better to leave this flexible in case the
form author wants to be more restrictive, e.g. restrict to integer inputs.
Finally, use of xsi:type would then require us to add the ugly xmlns:xsi
namespace declaration to the json element.
4) Attaching starts="array" seems underpowered. Suppose I have a
particular node and I need to know whether it is part of an array? Why
not attach array="true" to each element from an array? Or would there be
any value in setting the attribute array equal to the element name? Would
there be a benefit to authors of being able to say
nodeset="*[@array='size']" in order to grab all the nodes in the size array
separately from array elements that might be at the same hierarchic level?
One might think you could achieve the same effect with
nodeset="size[@array='true']", so maybe the boolean is enough.
5) Is it just a wiki problem that is producing ?? for the translation of
escaped characters? If so, I suggest using a hex notation, e.g. \b to
0x08, \f to 0x0C, \n to 0x0A, \r to 0x0D, and \t to 0x09.
6) Can you update the wiki to indicate what illegal XML characters you
might be talking about? Seems it will be hard to decide what to do about
the characters without having the research to indicate what they are.
Maybe there are just a few?
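Points 1) and 3) can be tried out with Python's xml.etree.ElementTree, as
a rough sketch. The helper json_name and the sample document (including
the count and label elements) are made up for illustration. Note that the
attribute test is spelled @type; a bare type inside the predicate refers
to a child element named type instead.

```python
import json
import xml.etree.ElementTree as ET


def json_name(elem):
    """Point 1: quote the name only when a name attribute is present."""
    name = elem.get("name")
    if name is not None:
        return json.dumps(name)  # adds the quotes and any needed escaping
    return elem.tag  # no name attribute: emit the tag name unquoted


doc = ET.fromstring(
    '<json>'
    '<size name="size" type="number">50</size>'
    '<count type="number">2</count>'
    '<label>fifty</label>'
    '</json>'
)

names = [json_name(child) for child in doc]

# Point 3: select the nodes that carry a particular type assignment.
by_attribute = [e.tag for e in doc.findall(".//*[@type='number']")]

# Without the @, the predicate looks for a child element called type.
by_child = [e.tag for e in doc.findall(".//*[type='number']")]
```

Only size comes back quoted in names, by_attribute holds size and count,
and by_child is empty. An unquoted name such as count is fine in a
JavaScript object literal but is not strict JSON, so a converter may want
to quote everything on output anyway.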
Thank you,
John M. Boyer, Ph.D.
From: "Steven Pemberton" <Steven.Pemberton@cwi.nl>
To: "Steven Pemberton" <Steven.Pemberton@cwi.nl>, "Nick Van den
Bleeken" <Nick.Van.den.Bleeken@inventivegroup.com>
Cc: "Forms WG" <public-forms@w3.org>
Date: 06/01/2011 07:04 AM
Subject: Re: JSON Instances
Sent by: public-forms-request@w3.org
The reason they are there is to allow serialization to roundtrip the data.
That might work for numbers and boolean, but I don't see how it would work
for arrays. (But I may be wrong).
Steven
On Wed, 01 Jun 2011 11:10:41 +0200, Nick Van den Bleeken
<Nick.Van.den.Bleeken@inventivegroup.com> wrote:
> Steven,
>
> Couldn't we use auto generated binds that attach the type information to
> the nodes for that?
>
> Regards,
>
> Nick van den Bleeken
>
>
> On 30 May 2011, at 14:41, "Steven Pemberton" <Steven.Pemberton@cwi.nl>
> wrote:
>
>> I should note a slight difference with what we had earlier agreed that
>> dawned on me while firming it up, that got in the way of round-tripping.
>>
>> In the transformation of
>> {"size": 50} and {"size": "50"}
>> you can't tell the difference if you transform both to
>> <json><size>50</size></json>
>>
>> So I've used the type attribute to (arbitrarily) mark the numeric case:
>>
>> <json><size type="number">50</size></json>
>>
>> Similarly with the boolean and null cases.
>>
>> Steven
>>
>>
>> On Fri, 27 May 2011 16:51:50 +0200, Steven Pemberton
>> <Steven.Pemberton@cwi.nl> wrote:
>>
>>> I have rewritten the JSON section, according to my action item.
>>>
>>> http://www.w3.org/MarkUp/Forms/wiki/Json
>>>
>>> Comments gladly received.
>>>
>>> Steven
Received on Sunday, 5 June 2011 22:47:12 UTC