AW: AW: EXI for JSON structure

From: Peintner, Daniel (ext) <daniel.peintner.ext@siemens.com>
Date: Wed, 27 Apr 2016 11:31:49 +0000
To: "Stephen D. Williams" <sdw@lig.net>, "public-exi@w3.org" <public-exi@w3.org>
Message-ID: <D94F68A44EB1954A91DE4AE9659C5A980FE918A9@DEFTHW99EH1MSX.ww902.siemens.net>
Hi Stephen,

Thanks for your input.
I agree that the proposed map change provides some advantages.

That said, it does also have some downsides like
* XML schema validation does not work as well, and
* the generic EXI streams are less compact

I believe we need further investigations.

Looking at JSON-to-XML converters in the wild [1, 2], many solutions transform a map key into an element, just as proposed. Also, most of these solutions do not take into account JSON keys containing characters that do not form a valid NCName, which is required for element names (e.g., spaces in keys), and they fail in such use cases.
Note: EXI does not have this issue.
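
To make the NCName concern concrete, here is a minimal sketch that flags JSON keys which would not work as element names. It is an assumption-laden simplification: the pattern is ASCII-only and therefore rejects some non-ASCII names that the real NCName production allows, and the function name `is_ascii_ncname` is hypothetical.

```python
import re

# Simplified, ASCII-only approximation of the XML NCName production:
# a letter or underscore, followed by letters, digits, "_", "." or "-".
# The real production also admits many non-ASCII characters (not modelled here).
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ascii_ncname(key: str) -> bool:
    """Return True if the key is usable as an element name under this approximation."""
    return _ASCII_NCNAME.fullmatch(key) is not None


keys = ["firstname", "fo ba", "2nd", "First Name!", "a:b"]
problem_keys = [k for k in keys if not is_ascii_ncname(k)]
print(problem_keys)
```

Keys with spaces, a leading digit, punctuation such as "!" or a colon are the ones a key-to-element converter would have to reject or escape.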

Thanks,

-- Daniel

[1] http://www.freeformatter.com/json-to-xml-converter.htm

[2] http://www.utilities-online.info/xmltojson/



________________________________

From: Stephen D. Williams [sdw@lig.net]
Sent: Friday, 22 April 2016 19:47
To: public-exi@w3.org
Subject: Re: AW: EXI for JSON structure

This is also better because the xpath / path is then essentially the same for both JSON and XML:

map.firstname
map.lastname
map.age

Rather than map#firstname or map.string#firstname vs. map.firstname.
BTW, when you have something like 'key', it probably ought to be 'id' to match # usage.
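
A small sketch of the path argument, using Python's standard xml.etree.ElementTree. The helper names are hypothetical and the EXI for JSON namespace is left out to keep the paths short. With the key carried in an attribute, the lookup needs a predicate; with the key as element name, the path step is the key itself, as it is for JSON.

```python
import xml.etree.ElementTree as ET

record = {"firstname": "Daniel", "lastname": "Peintner", "age": 36}

# Layout with the key in a "key" attribute (namespace omitted for brevity).
key_attribute_xml = ET.fromstring(
    '<map><string key="firstname">Daniel</string>'
    '<string key="lastname">Peintner</string>'
    '<number key="age">36</number></map>'
)
# Layout with the key as element name (namespace omitted for brevity).
key_element_xml = ET.fromstring(
    "<map><firstname><string>Daniel</string></firstname>"
    "<lastname><string>Peintner</string></lastname>"
    "<age><number>36</number></age></map>"
)


def lookup_by_attribute(root, key):
    # The key is data here, so the path needs a predicate on the attribute.
    return root.find(f"*[@key='{key}']").text


def lookup_by_element(root, key):
    # The key is the element name; the typed value is its single child.
    return root.find(key)[0].text


print(record["firstname"])                             # JSON: plain key access
print(lookup_by_element(key_element_xml, "firstname"))      # XML: same step name
print(lookup_by_attribute(key_attribute_xml, "firstname"))  # XML: predicate needed
```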

As for IDs, JavaScript has two circumstances: valid JavaScript identifiers (map.firstname) and strings used as an array index (map["First Name!"]).  The former should conform to XML limitations.  The latter doesn't have to.  Perhaps URL encoding or something similar could be used to transform the latter, as an equivalent to string quoting.
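
One possible shape for the "or similar" option, as a hedged sketch rather than anything proposed in this thread: plain percent-encoding does not directly yield XML names because "%" is not a name character, so the example below uses an `_xHHHH_` escape instead (loosely modelled on the SQL/XML identifier convention). The function names are hypothetical and only ASCII letters, digits, "_", "." and "-" are passed through unescaped.

```python
import re
import string

_START = set(string.ascii_letters + "_")      # allowed as first character
_REST = _START | set(string.digits + ".-")    # allowed in later positions
_ESCAPE = re.compile(r"_x([0-9A-F]{4,6})_")


def escape_key(key: str) -> str:
    """Map an arbitrary JSON key to an ASCII name, reversibly."""
    out = []
    for i, ch in enumerate(key):
        allowed = _START if i == 0 else _REST
        # A literal "_x" could be mistaken for an escape, so escape that "_" too.
        ambiguous = ch == "_" and key[i + 1 : i + 2] == "x"
        if ch in allowed and not ambiguous:
            out.append(ch)
        else:
            out.append(f"_x{ord(ch):04X}_")
    return "".join(out)


def unescape_key(name: str) -> str:
    """Invert escape_key by decoding every _xHHHH_ sequence."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)


escaped = escape_key("First Name!")
print(escaped)
print(unescape_key(escaped))
```

An empty key still has no valid name under this scheme and would need separate handling.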

sdw

On 4/22/16 6:36 AM, Peintner, Daniel (ext) wrote:

All,

I have been experimenting a bit with how we could improve the EXI for JSON format so that dedicated schemas can be used. We have already had some proposals. Please find attached another solution that I think is worth looking at.

The general concept remains the same. The only change applies to the <map/> element.

An example map used to look like this:

<map xmlns="http://www.w3.org/2015/EXI/json">
    <string key="firstname">Daniel</string>
    <string key="lastname">Peintner</string>
    <number key="age">36</number>
    <string key="car">Audi</string>
</map>


while a proposal could be to change it to this:


<map xmlns="http://www.w3.org/2015/EXI/json">
    <firstname>
        <string>Daniel</string>
    </firstname>
    <lastname>
        <string>Peintner</string>
    </lastname>
    <age>
        <number>36</number>
    </age>
    <car>
        <string>Audi</string>
    </car>
</map>

Any element in a map is a key (e.g., <firstname>) while the value is the content (e.g., <string>Daniel</string>)
Doing so allows us to actually create dedicated XML schemas which in many use-cases are very beneficial.
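
The two layouts can be sketched side by side with Python's standard xml.etree.ElementTree. This is only an illustration under stated assumptions: the function names are hypothetical, the EXI for JSON namespace is omitted, only strings, numbers and maps are handled, and keys are assumed to be valid XML names.

```python
import xml.etree.ElementTree as ET


def _new_value_element(value):
    """Create <string>, <number> or <map> for a JSON value (other types omitted)."""
    if isinstance(value, bool):
        raise TypeError("booleans are left out of this sketch")
    if isinstance(value, str):
        el = ET.Element("string")
        el.text = value
    elif isinstance(value, (int, float)):
        el = ET.Element("number")
        el.text = str(value)
    elif isinstance(value, dict):
        el = ET.Element("map")
    else:
        raise TypeError(f"unsupported JSON value: {value!r}")
    return el


def key_as_attribute(value, key=None):
    """Previous layout: the key travels in a 'key' attribute."""
    el = _new_value_element(value)
    if key is not None:
        el.set("key", key)
    if isinstance(value, dict):
        for k, v in value.items():
            el.append(key_as_attribute(v, k))
    return el


def key_as_element(value):
    """Proposed layout: each key becomes a wrapper element around the value."""
    el = _new_value_element(value)
    if isinstance(value, dict):
        for k, v in value.items():
            wrapper = ET.SubElement(el, k)  # assumes k is a valid XML name
            wrapper.append(key_as_element(v))
    return el


person = {"firstname": "Daniel", "lastname": "Peintner", "age": 36, "car": "Audi"}
old_xml = ET.tostring(key_as_attribute(person), encoding="unicode")
new_xml = ET.tostring(key_as_element(person), encoding="unicode")
print(old_xml)
print(new_xml)
```

In the second form the element names themselves (firstname, lastname, age, car) are what a dedicated schema can declare.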

Please find attached the generic schema (schema-for-json-V2.xsd) and a dedicated XML schema (e.g., example0.xsd, or example1.xsd) for the actual instances.

Pros:
* allows for dedicated XML Schema
  --> much more efficient while the XML instance remains the same

Cons:
* generic schema is less compact due to <any>
* schema is less restrictive due to the <any> element in the generic EXI for JSON schema
  --> an EXI-encoded document may be invalid even in STRICT mode
      (e.g., firstname might not contain a valid inner element such as <string>)
* for unusual key names the "internal" XML instance may not be valid anymore
  <strange key><number>36</number></strange key>
  --> Transformation to JSON is still fine!


Any thoughts or comments?

Note: I also propose to slightly change the <array> element so that an array may only contain entries of the same kind (see schema changes for this proposal).

Thanks,

-- Daniel



-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Wednesday, 10 February 2016 10:42
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: EXI for JSON structure

Hi Taki, all,

I think we have consensus about the ultimate goal. We just need to find a "good" way to make "EXI for JSON" schema-aware.


Changing

<j:map key="link">
  <j:string key="manager">Boss</j:string>
  <j:string key="subordinates">worker</j:string>
</j:map>

to

<j:map link="">
  <j:string manager="">Boss</j:string>
  <j:string subordinates="">worker</j:string>
</j:map>

assumes that JSON keys are valid attribute names or that "EXI for JSON" documents never get transformed to XML. On the one hand, this might seem acceptable; on the other hand, I am not sure.
Any thoughts?

Thanks,

-- Daniel





________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Tuesday, 9 February 2016 03:34
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: EXI for JSON structure

Hi Daniel,

I agree that we ultimately should define JSON schema to EXI grammar mappings.

If JSON schema to EXI grammar mapping is defined, I think one important aspect to consider is how to make it feasible to reuse as much existing implementation infrastructure as possible.

In defining the mapping, it would be beneficial to existing implementations if the mapping can be implemented by simply converting JSON schema to similarly structured XML schema (or to an in-memory schema model). However, not all implementations need to follow this path.

Say we have the following JSON structure where both name and title are optional.

{
  "name": "Taro",
  "title": "poet"
}

The corresponding pseudo-XML schema (XML schema-like with some violation of XML schema rules) is:

<xs:element name="map">
  <xs:complexType>
    <xs:element name="string" minOccurs="0">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="key">
              <xs:simpleType>
                <xs:restriction base="xs:string">
                  <xs:enumeration value="name"/>
                </xs:restriction>
              </xs:simpleType>
            </xs:attribute>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
    <xs:element name="string" minOccurs="0">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="key">
              <xs:simpleType>
                <xs:restriction base="xs:string">
                  <xs:enumeration value="title"/>
                </xs:restriction>
              </xs:simpleType>
            </xs:attribute>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
  </xs:complexType>
</xs:element>

Even if a processor successfully read this pseudo schema, the two
AT("string") have different types, so the proto-grammar cannot be processed by the rules defined in EXI 1.0.

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America



-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Thursday, February 04, 2016 4:42 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: EXI for JSON structure

Hi Taki,

thank you for sharing your experiments with us.

EXI for JSON uses a very generic XML schema that allows representing any JSON document. That said, I do see your use case of applying a more detailed schema!

The JSON snippet you referred to was

         "link" : {
             "manager" : "Boss",
             "subordinates" : "worker"
          }

where you said that the "link" property is a map that consists of either or both of "manager" and "subordinates".

The JSON snippet would be converted to XML as follows:

    <j:map key="link">
        <j:string key="manager">Boss</j:string>
        <j:string key="subordinates">worker</j:string>
    </j:map>

However, building an XML schema that deals with the properties you shared does not seem to work.
The approach you proposed provides some benefit but does not handle all issues:

* It is not schema-valid and causes "cos-nonambig" issues.
  (We do not know how XML schema processors deal with it)

* It assumes that keys are valid XML names.
  e.g., a key "fo ba" with spaces might cause issues as an attribute name,
  at least when serialized to XML

* It still does not express properties of JSON such as a map being unordered, meaning
  that manager and subordinates can appear in any order
  (I think this is much more complicated if you deal with more possible entries)

* It tries to solve an issue in XML schema that might be solved in JSON schema.
  The use case is JSON, and at best people might have a JSON schema document which
  should be used to generate EXI grammars

I think in a schema-informed case what we really want to see is something like this (expressed in EXI grammars):

link0:
  SE("manager") link1
  SE("subordinates") link2
  EE

link1:
  SE("subordinates") link2
  EE

link2:
  EE

manager0:
  CH manager1

manager1:
  EE

subordinates0:
  CH subordinates1

subordinates1:
  EE
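
To illustrate why such a grammar is attractive, the sketch below walks the link0/link1/link2 states and adds up event-code bits. It assumes the usual bit-packed rule that a state with m productions needs ceil(log2(m)) bits per event code, and it ignores the value bytes as well as the manager/subordinates grammars (which have one production per state and therefore cost 0 bits under the same rule). The names `LINK_GRAMMAR`, `event_code_bits` and `structure_bits` are hypothetical.

```python
import math

# Productions of the "link" grammar shown above: event -> next state.
LINK_GRAMMAR = {
    "link0": {'SE("manager")': "link1", 'SE("subordinates")': "link2", "EE": None},
    "link1": {'SE("subordinates")': "link2", "EE": None},
    "link2": {"EE": None},
}


def event_code_bits(state):
    """Bits for one event code, assuming ceil(log2(m)) bits for m productions."""
    return math.ceil(math.log2(len(LINK_GRAMMAR[state])))


def structure_bits(events):
    """Walk the link grammar and sum the event-code bits for the given events."""
    state, total = "link0", 0
    for event in events:
        if state is None:
            raise ValueError(f"event {event} after the element was closed")
        if event not in LINK_GRAMMAR[state]:
            raise ValueError(f"{event} is not allowed in state {state}")
        total += event_code_bits(state)
        state = LINK_GRAMMAR[state][event]
    return total


both = structure_bits(['SE("manager")', 'SE("subordinates")', "EE"])
only_subordinates = structure_bits(['SE("subordinates")', "EE"])
print(both, only_subordinates)
```

Under these assumptions the structure of a map with both entries costs 3 bits and a map with only "subordinates" costs 2 bits; the walk also rejects "subordinates" followed by "manager", which shows the ordering restriction this grammar imposes.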

This is very different from the generic approach.
That said, I have a hard time seeing how we can achieve what we are looking for.

In my experiments I failed to create an XML instance that could be used with a generic schema and which is also feasible for a more appropriate schema (which is not in conflict with XML schema rules such as "cos-nonambig" or "cos-element-consistent") :-(

I am glad if we can find a solution though!

Thanks,

-- Daniel

________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Tuesday, 2 February 2016 02:17
To: public-exi@w3.org
Subject: EXI for JSON structure

Hi,

I have looked at an example to exercise how EXI for JSON [1] works with schemas.

Below is an example snippet of a JSON consisting of one record "person".

      {
        "person" : {
          "id" : "Boss",
          "name" : {
            "family" : "Smith",
            "given" : "Bill"
          },
          "email" : "smith@foo.com",
          "YearsOfService" : 20,
          "weight" : 175.4,
          "birthday" : "1955-03-24",
          "link" : {
            "manager" : "Boss",
            "subordinates" : "worker"
          }
        }
      }

"link" property in the above is a map that consists of either or both of "manager"
and "subordinates".

Currently, EXI for JSON carries names as the value of "key" attribute.
This makes it difficult to generate EXI grammar, because in both "manager"
and "subordinates" cases, the name is contained in the value of "key"
attribute. EXI spec expects the distinction be explicitly made in the terminal symbols. (See section 8.5.4.2.2 [2] in EXI spec.)

I therefore would like to suggest the following structure in XML.

        <map link="">
          <string manager="">Boss</string>
          <string subordinates="">worker</string>
        </map>

A structure like the one above would permit the rule in EXI spec section 8.5.4.2.2 to work.

The corresponding schema would be as follows:

<xs:element name="map"><!-- link -->
  <xs:complexType>
    <xs:sequence>
      <xs:element name="string" minOccurs="0"><!-- subordinates -->
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="subordinates" use="required">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:enumeration value=""/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:attribute>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="string" minOccurs="0"><!-- manager -->
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="manager" use="required">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:enumeration value=""/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:attribute>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="link" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value=""/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>


Thank you,

[1] https://www.w3.org/TR/2016/WD-exi-for-json-20160128/
[2] https://www.w3.org/TR/2014/REC-exi-20140211/#eliminatingSymbols

Takuki Kamiya
Fujitsu Laboratories of America







--
Stephen D. Williams sdw@lig.net stephendwilliams@gmail.com LinkedIn: http://sdw.st/in

V:650-450-UNIX (8649) V:866.SDW.UNIX V:703.371.9362 F:703.995.0407
AIM:sdw Skype:StephenDWilliams Yahoo:sdwlignet Resume: http://sdw.st/gres

Personal: http://sdw.st facebook.com/sdwlig twitter.com/scienteer
Received on Wednesday, 27 April 2016 11:32:24 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 27 April 2016 11:32:25 UTC