RE: EXI for JSON structure

Hi,

I did some further experiments regarding the structure and encoding methods
for EXI4JSON.

The attached zip file contains three JSON files.
(The zip also contains some more files.)

personnel_three.json is the JSON file that contains three records.
(personnel_two.json and personnel_one.json have two records and one record,
 respectively.)

With personnel_three.json, the encoding result is:

Current:        294 bytes
V2:             313 bytes
V2+EFG:         315 bytes (i.e. using Element Fragment Grammar instead of built-in grammar)

For V2 and V2+EFG, the schema used is schema-for-json-V2.xsd, and
the intermediate XML is personnel_three-V2.xml.

With personnel_two.json:

Current:        230 bytes
V2:             250 bytes
V2+EFG:         242 bytes

With personnel_one.json:

Current:        158 bytes
V2:             179 bytes
V2+EFG:         164 bytes

I observe that V2+EFG suits simple JSON documents with one or two records.
It is very close to the current structure in terms of efficiency (6 bytes
larger for one record, 12 bytes larger for two), and yet enables custom
schema definitions and a simpler implementation (i.e. no built-in grammar).

On the other hand, for JSON that contains more than a couple of records,
V2 without EFG is the better choice (313 vs. 315 bytes for three records).
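
The comparison can be tabulated with a short script. This is only a sketch:
the byte counts are copied from the measurements above, while the function
and variable names are made up for illustration.

```python
# Encoded sizes in bytes, copied from the measurements in this message.
sizes = {
    "personnel_three.json": {"Current": 294, "V2": 313, "V2+EFG": 315},
    "personnel_two.json": {"Current": 230, "V2": 250, "V2+EFG": 242},
    "personnel_one.json": {"Current": 158, "V2": 179, "V2+EFG": 164},
}


def overhead(results):
    """Return (extra bytes, extra percent) of each variant vs. Current."""
    base = results["Current"]
    return {
        name: (size - base, round(100.0 * (size - base) / base, 1))
        for name, size in results.items()
        if name != "Current"
    }


for doc, results in sizes.items():
    print(doc, overhead(results))
```

The crossover between V2+EFG and V2 shows up between the two-record and
three-record documents.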

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Takuki Kamiya [mailto:tkamiya@us.fujitsu.com]
Sent: Monday, May 09, 2016 4:09 PM
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: EXI for JSON structure

Hi,

I tried to encode the attached JSON into three forms.

1. Current form resulted in 313 bytes in EXI.

2. The form proposed by Daniel resulted in 332 bytes.

and

3. The modified form of the above proposed by me resulted in 456 bytes.

If we can somehow let those unknown elements (e.g. <person> or <name>)
use Element Fragment Grammar [1] instead of the built-in element grammar,
we would be able to achieve better compaction in the case of #2.

I wonder if it makes sense to introduce an option for that in EXI. That option
would essentially dictate that grammars do not grow under any circumstances.

[1] http://www.w3.org/TR/2014/REC-exi-20140211/#informedElementFragGrammar


Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Takuki Kamiya [mailto:tkamiya@us.fujitsu.com]
Sent: Monday, May 02, 2016 2:19 PM
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: EXI for JSON structure

Hi Daniel and all,

One idea is to require the arbitrarily named elements to always have
an xsi:type attribute that type-casts the element content to one of the
known content models.

For instance,

<map xmlns="http://www.w3.org/2015/EXI/json">
    <firstname xsi:type="exi4json:stringContent">
        <string>Daniel</string>
    </firstname>
    <lastname xsi:type="exi4json:stringContent">
        <string>Peintner</string>
    </lastname>
    <age xsi:type="exi4json:numberContent">
        <number>36</number>
    </age>
    <car xsi:type="exi4json:stringContent">
        <string>Audi</string>
    </car>
</map>
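
As a rough illustration of how such an instance could be produced, here is a
minimal Python sketch. It assumes a flat map holding only strings and
numbers; the type names stringContent and numberContent and the exi4json
prefix come from the example above, while to_typed_map is a hypothetical
helper, not part of any EXI library.

```python
import xml.etree.ElementTree as ET

J = "http://www.w3.org/2015/EXI/json"  # namespace used in the example
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Bind the exi4json prefix so the QName inside xsi:type resolves on output.
ET.register_namespace("exi4json", J)
ET.register_namespace("xsi", XSI)


def to_typed_map(obj):
    """Build <map> where every key element carries an xsi:type cast."""
    root = ET.Element(f"{{{J}}}map")
    for key, value in obj.items():
        if isinstance(value, str):
            kind = "string"
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            kind = "number"
        else:
            raise TypeError(f"unsupported value in this sketch: {value!r}")
        child = ET.SubElement(root, f"{{{J}}}{key}")
        child.set(f"{{{XSI}}}type", f"exi4json:{kind}Content")
        ET.SubElement(child, f"{{{J}}}{kind}").text = str(value)
    return root


root = to_typed_map({"firstname": "Daniel", "age": 36})
print(ET.tostring(root, encoding="unicode"))
```

The serializer emits prefixed element names rather than a default namespace,
which is equivalent at the infoset level.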


We may also need to think about better encodings for
qname values in xsi:type. I think we explored some ideas for EXI2.

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America


-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Friday, April 22, 2016 6:37 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: EXI for JSON structure

All,

I have been experimenting a bit with how we could improve the EXI for JSON format to allow for dedicated schemas. We have already had some proposals. Please find attached another solution that I think could be worth looking at.

The general concept remains the same. The only change applies to the <map/> element.

An example map used to be like this

<map xmlns="http://www.w3.org/2015/EXI/json">
    <string key="firstname">Daniel</string>
    <string key="lastname">Peintner</string>
    <number key="age">36</number>
    <string key="car">Audi</string>
</map>


while a proposal could be to change it to this:


<map xmlns="http://www.w3.org/2015/EXI/json">
    <firstname>
        <string>Daniel</string>
    </firstname>
    <lastname>
        <string>Peintner</string>
    </lastname>
    <age>
        <number>36</number>
    </age>
    <car>
        <string>Audi</string>
    </car>
</map>

Any element in a map is a key (e.g., <firstname>) while the value is the content (e.g., <string>Daniel</string>).
Doing so allows us to actually create dedicated XML schemas, which are very beneficial in many use cases.
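
The difference between the two layouts can be sketched in a few lines of
Python. This is an illustration only: it handles flat maps of strings and
numbers, and current_form / proposed_form are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/2015/EXI/json"
ET.register_namespace("", NS)  # serialize with a default namespace


def _kind(value):
    """Name of the value element; only the types used in the example."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "number"
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def current_form(obj):
    """Key goes into a 'key' attribute of the typed value element."""
    root = ET.Element(f"{{{NS}}}map")
    for key, value in obj.items():
        child = ET.SubElement(root, f"{{{NS}}}{_kind(value)}", {"key": key})
        child.text = str(value)
    return root


def proposed_form(obj):
    """Key becomes an element name wrapping the typed value element."""
    root = ET.Element(f"{{{NS}}}map")
    for key, value in obj.items():
        wrapper = ET.SubElement(root, f"{{{NS}}}{key}")
        ET.SubElement(wrapper, f"{{{NS}}}{_kind(value)}").text = str(value)
    return root


data = {"firstname": "Daniel", "lastname": "Peintner", "age": 36, "car": "Audi"}
print(ET.tostring(current_form(data), encoding="unicode"))
print(ET.tostring(proposed_form(data), encoding="unicode"))
```

In the proposed form the key is visible as an element name, which is what a
dedicated schema can declare.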

Please find attached the generic schema (schema-for-json-V2.xsd) and a dedicated XML schema (e.g., example0.xsd, or example1.xsd) for the actual instances.

Pros:
* allows for dedicated XML Schema
  --> much more efficient while the XML instance remains the same

Cons:
* generic schema less compact due to <any>
* Schema less restrictive due to <any> element in generic EXI for JSON schema
  --> EXI-encoded documents may be invalid even in STRICT mode
      (e.g., firstname may not contain any valid inner element such as <string>)
* for unusual key-names the "internal" XML instance may not be valid anymore
  <strange key><number>36</number></strange key>
  --> Transformation to JSON is still fine!
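
To make the last point concrete, a converter could test each key before
using it as an element name. The check below is a deliberately simplified
sketch: it covers only the ASCII subset of the XML 1.0 Name production and
excludes the colon, and is_safe_element_name is a hypothetical helper.

```python
import re

# Simplified, ASCII-only approximation of an XML element name.
# The real XML 1.0 Name production also admits many non-ASCII characters.
_ASCII_XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def is_safe_element_name(key):
    """True if the JSON key can be used verbatim as an XML element name."""
    return bool(_ASCII_XML_NAME.match(key))


for key in ["firstname", "strange key", "1st", "last-name"]:
    print(repr(key), is_safe_element_name(key))
```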


Any thoughts or comments?

Note: I also propose to slightly change the <array> element so that an array may only consist of entries of the same kind (see schema changes for this proposal).

Thanks,

-- Daniel



-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Wednesday, February 10, 2016 10:42
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: EXI for JSON structure

Hi Taki, all,

I think we have consensus about the ultimate goal. We just need to find a "good" way to make "EXI for JSON" schema-aware.


Changing

<j:map key="link">
  <j:string key="manager">Boss</j:string>
  <j:string key="subordinates">worker</j:string>
</j:map>

to

<j:map link="">
  <j:string manager="">Boss</j:string>
  <j:string subordinates="">worker</j:string>
</j:map>

assumes that JSON keys are valid attribute names or that "EXI for JSON" documents never get transformed to XML. On the one hand this might seem acceptable; on the other hand I am not sure.
Any thoughts?

Thanks,

-- Daniel





________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Tuesday, February 9, 2016 03:34
To: Peintner, Daniel (ext); public-exi@w3.org
Subject: RE: EXI for JSON structure

Hi Daniel,

I agree that we ultimately should define JSON schema to EXI grammar mappings.

If JSON schema to EXI grammar mapping is defined, I think one important aspect to consider is how to make it feasible to reuse as much existing implementation infrastructure as possible.

In defining the mapping, it would be beneficial to existing implementations if the mapping can be implemented by simply converting JSON schema to similarly structured XML schema (or to an in-memory schema model). However, not all implementations need to follow this path.

Say we have the following JSON structure where both name and title are optional.

{
  "name": "Taro",
  "title": "poet"
}

The corresponding pseudo-XML schema (XML schema-like with some violation of XML schema rules) is:

<xs:element name="map">
  <xs:complexType>
    <xs:element name="string" minOccurs="0">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="key">
              <xs:simpleType>
                <xs:restriction base="xs:string">
                  <xs:enumeration value="name"/>
                </xs:restriction>
              </xs:simpleType>
            </xs:attribute>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
    <xs:element name="string" minOccurs="0">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="key">
              <xs:simpleType>
                <xs:restriction base="xs:string">
                  <xs:enumeration value="title"/>
                </xs:restriction>
              </xs:simpleType>
            </xs:attribute>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
  </xs:complexType>
</xs:element>

Even if a processor successfully reads this pseudo-schema, the two
AT("string") have different types, so the proto-grammar cannot be processed by the rules defined in EXI 1.0.
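
The conflict can be pictured as a name-to-type lookup that is no longer
unambiguous. The following toy check is only a sketch in the spirit of XML
Schema's "Element Declarations Consistent" constraint; the type labels
keyIsName and keyIsTitle are hypothetical stand-ins for the two anonymous
types in the pseudo-schema.

```python
from collections import defaultdict


def conflicting_names(particles):
    """Return names that are declared with more than one distinct type."""
    types_by_name = defaultdict(set)
    for name, type_id in particles:
        types_by_name[name].add(type_id)
    return sorted(n for n, types in types_by_name.items() if len(types) > 1)


# Two sibling declarations named "string" with different anonymous types.
pseudo_schema = [("string", "keyIsName"), ("string", "keyIsTitle")]
print(conflicting_names(pseudo_schema))
```

With distinct names per key (for example "name" and "title") the same check
reports no conflict.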

Thank you,

Takuki Kamiya
Fujitsu Laboratories of America



-----Original Message-----
From: Peintner, Daniel (ext) [mailto:daniel.peintner.ext@siemens.com]
Sent: Thursday, February 04, 2016 4:42 AM
To: Takuki Kamiya; public-exi@w3.org
Subject: AW: EXI for JSON structure

Hi Taki,

thank you for sharing your experiments with us.

EXI for JSON uses a very generic XML schema that allows representing any JSON document. That said, I do see your use case of applying a more detailed schema!

The JSON snippet you referred to was

         "link" : {
             "manager" : "Boss",
             "subordinates" : "worker"
          }

where the "link" property is a map that consists of either or both of "manager" and "subordinates".

The JSON snippet would be converted to XML as follows:

    <j:map key="link">
        <j:string key="manager">Boss</j:string>
        <j:string key="subordinates">worker</j:string>
    </j:map>

However, building an XML schema that deals with the properties you shared does not seem to work.
The approach you proposed provides some benefit but does not handle all issues:

* It is not schema-valid and causes "cos-nonambig" issues.
  (We do not know how XML schema processors deal with it)

* It assumes that keys are valid attribute names.
  e.g., a key "fo ba" with spaces might cause issues as an attribute name,
  at least when serialized to XML

* It still does not express properties of JSON such as a map being unordered,
  meaning that manager and subordinates can appear in any order
  (I think this is much more complicated if you deal with more possible entries)

* It tries to solve an issue in XML schema that might be solved in JSON schema.
  The use case is JSON, and at best people might have a JSON schema document
  which should be used to generate EXI grammars

I think in a schema-informed case what we really want to see is something like this (expressed in EXI grammars):

link0:
  SE("manager") link1
  SE("subordinates") link2
  EE

link1:
  SE("subordinates") link2
  EE

link2:
  EE

manager0:
  CH manager1

manager1:
  EE

subordinates0:
  CH subordinates1

subordinates1:
  EE
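
To see why such a grammar is attractive, it can be written down as a small
state table and walked. This is a sketch under assumptions: it models only
the link0..link2 states, ignores the content of the child elements, and
assumes bit-packed event codes of ceil(log2(n)) bits for a state with n
productions and no additional (non-strict) productions.

```python
import math

# Productions transcribed from the grammar above: state -> {event: next}.
# An EE production ends the element, so it has no next state.
GRAMMAR = {
    "link0": {'SE("manager")': "link1", 'SE("subordinates")': "link2", "EE": None},
    "link1": {'SE("subordinates")': "link2", "EE": None},
    "link2": {"EE": None},
}


def event_code_bits(state):
    """Bits needed to pick one of the n productions of a state."""
    return math.ceil(math.log2(len(GRAMMAR[state])))


def encode_cost(events, start="link0"):
    """Walk the grammar and sum the event-code bits; reject bad sequences."""
    state, bits = start, 0
    for event in events:
        if state is None:
            raise ValueError(f"{event} after the element has ended")
        productions = GRAMMAR[state]
        if event not in productions:
            raise ValueError(f"{event} is not allowed in {state}")
        bits += event_code_bits(state)
        state = productions[event]
    return bits


print(encode_cost(['SE("manager")', 'SE("subordinates")', "EE"]))
```

Under these assumptions the structure of a full "link" map costs three bits
(2 + 1 + 0), because each state offers only the events that are still
possible.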

This is very different from the generic approach.
That said, I have a hard time seeing how we can achieve what we are looking for.

In my experiments I failed to create an XML instance that could be used with a generic schema and which is also feasible for a more appropriate schema (which is not in conflict with XML schema rules such as "cos-nonambig" or "cos-element-consistent") :-(

I would be glad if we could find a solution though!

Thanks,

-- Daniel

________________________________
From: Takuki Kamiya [tkamiya@us.fujitsu.com]
Sent: Tuesday, February 2, 2016 02:17
To: public-exi@w3.org
Subject: EXI for JSON structure

Hi,

I have looked at an example to exercise how EXI for JSON [1] works with schemas.

Below is an example snippet of a JSON document consisting of one record, "person".

      {
        "person" : {
          "id" : "Boss",
          "name" : {
            "family" : "Smith",
            "given" : "Bill"
          },
          "email" : "smith@foo.com",
          "YearsOfService" : 20,
          "weight" : 175.4,
          "birthday" : "1955-03-24",
          "link" : {
            "manager" : "Boss",
            "subordinates" : "worker"
          }
        }
      }

The "link" property above is a map that consists of either or both of "manager"
and "subordinates".

Currently, EXI for JSON carries names as the value of the "key" attribute.
This makes it difficult to generate an EXI grammar, because in both the
"manager" and "subordinates" cases the name is contained in the value of the
"key" attribute. The EXI spec expects the distinction to be made explicitly
in the terminal symbols. (See section 8.5.4.2.2 [2] in the EXI spec.)

I therefore would like to suggest the following structure in XML.

        <map link="">
          <string manager="">Boss</string>
          <string subordinates="">worker</string>
        </map>

A structure like the one above would permit the rule in EXI spec section 8.5.4.2.2 to work.
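
With this layout the JSON key is still recoverable, since it is simply the
name of the element's only attribute. A minimal sketch of the reverse
transformation follows; it assumes every element carries exactly one
attribute and handles only map and string, and key_of / to_json_value are
hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def key_of(elem):
    """The JSON key is the name of the element's single attribute."""
    (name,) = elem.attrib.keys()  # raises ValueError unless exactly one
    return name


def to_json_value(elem):
    """Convert a <map> or <string> element back to a Python value."""
    if elem.tag == "map":
        return {key_of(child): to_json_value(child) for child in elem}
    if elem.tag == "string":
        return elem.text or ""
    raise ValueError(f"unsupported element in this sketch: {elem.tag}")


doc = ET.fromstring(
    '<map link="">'
    '<string manager="">Boss</string>'
    '<string subordinates="">worker</string>'
    "</map>"
)
print({key_of(doc): to_json_value(doc)})
```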

The corresponding schema would be as follows:

<xs:element name="map"><!-- link -->
  <xs:complexType>
    <xs:sequence>
      <xs:element name="string" minOccurs="0"><!-- subordinates -->
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="subordinates" use="required">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:enumeration value=""/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:attribute>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="string" minOccurs="0"><!-- manager -->
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="manager" use="required">
                <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:enumeration value=""/>
                  </xs:restriction>
                </xs:simpleType>
              </xs:attribute>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="link" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value=""/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>


Thank you,

[1] https://www.w3.org/TR/2016/WD-exi-for-json-20160128/
[2] https://www.w3.org/TR/2014/REC-exi-20140211/#eliminatingSymbols

Takuki Kamiya
Fujitsu Laboratories of America

Received on Monday, 16 May 2016 18:23:47 UTC