Re: Abbreviation form for HTTP JSON Header Field Values? from Kazuho Oku on 2016-01-27 (ietf-http-wg@w3.org from January to March 2016)

From: Kazuho Oku <kazuhooku@gmail.com>
Date: Wed, 27 Jan 2016 13:16:10 +0900
To: Julian Reschke <julian.reschke@gmx.de>
Cc: HTTP Working Group <ietf-http-wg@w3.org>, Stefan Eissing <stefan.eissing@greenbytes.de>
Message-ID: <CANatvzx5GNNZhTuAmMKJ69+90Ty3cO7zrV4uJZ0UtbJy2vyPKw@mail.gmail.com>
Thank you for your response.

2016-01-24 23:16 GMT+09:00 Julian Reschke <julian.reschke@gmx.de>:
> Hi Kazuho,
>
> I think this is a very interesting idea, as it will indeed address the
> verbosity problem.
>
> Right now, the JFV spec is agnostic about the JSON payload. This means that
> a header field definition based on JFV could indeed use the transforms you
> mentioned; it just would have to define them itself.

I agree.

However, my preference is to have it defined within the JFV draft,
on the premise that such a requirement exists in multiple areas.
Let me explain why.

If we are to define the transformations ourselves, they would either
be described as part of the section that specifies the attributes, or
be assigned a dedicated chapter describing how values should be
transformed.

Consider a spec that defines two types: X and Y.

If we have the abbreviation form defined outside the definition of the
attributes, we can state:

* X is a hash; required parameter "s" is a string
* Y is a hash; required parameter "x" is an array of X
* abbreviation form can be used

If we are to include the transformation into the definition of the
attributes, we would need to state something like:

* X is either:
 * a hash; required parameter "s" is a string
 * a string
* Y is either:
 * a hash; required parameter "x" is X or an array of X
 * if not a hash, it is X or an array of X
* if X is a hash, then Y should always be encoded as a hash

To me the latter is too complicated.  And for the latter, things will
get worse as the number of types defined in the spec. increases.
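
To illustrate the first style, here is a minimal Python sketch of a
decoder in which the abbreviation form is handled once, outside the
type definitions.  The `SCHEMA` notation and the `expand` function are
hypothetical, made up for this example; nothing like them appears in
the JFV draft.

```python
# Hypothetical schema notation (an assumption, not taken from the JFV draft):
# each hash type names its single required key and the type of that key's
# value.  A one-element list such as ["X"] means "array of X".
SCHEMA = {
    "X": {"required": "s", "value": "string"},
    "Y": {"required": "x", "value": ["X"]},
}


def expand(value, type_):
    """Return `value` in the full, non-abbreviated shape of `type_`."""
    if isinstance(type_, list):
        # Reverse of rule 2: a bare element stands for a one-element array.
        items = value if isinstance(value, list) else [value]
        return [expand(item, type_[0]) for item in items]
    if type_ in SCHEMA:
        spec = SCHEMA[type_]
        key = spec["required"]
        if not isinstance(value, dict):
            # Reverse of rule 1: a bare value stands for {required_key: value}.
            value = {key: value}
        if key not in value:
            raise ValueError(f"{type_} requires parameter {key!r}")
        full = dict(value)
        full[key] = expand(value[key], spec["value"])
        return full
    if type_ == "string":
        if not isinstance(value, str):
            raise ValueError(f"expected a string, got {value!r}")
        return value
    raise ValueError(f"unknown type {type_!r}")


print(expand("a", "Y"))
print(expand(["a", {"s": "b"}], "Y"))
```

In this sketch the type definitions stay as short as the three-bullet
version above, and the two reverse transformations live in one place.
A bare hash passed where Y is expected is always read as Y itself, so
`expand({"s": "a"}, "Y")` raises an error, which corresponds to the
last bullet of the longer definition.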

The reason why intermixing even a single transformation rule (rule 1)
complicates the attribute definition is that whether a value can be
collapsed depends on the type of the value that contains it.  In other
words, it is inappropriate to define the transformation rule as part
of the attribute definition.

My understanding is therefore that even if we were to define the
transformations within each spec. that uses JFV, we would need to
create a dedicated chapter independent from the attribute definition.

That means that we would be seeing similar definitions of abbreviation
forms in every spec. that uses JFV (with some serious use of
abbreviation).

If that is the case, I believe it would be better to have the
transformations defined within JFV so that they can be more easily and
widely adopted.

> That being said, if there's interest in this we could of course move these
> transformation algorithms into the base spec, so they can be easily invoked,
> and field definitions do not needlessly invent different ways to do the same
> thing. I'd probably make this step optional, though.
>
> Finally, could you provide an example where "rule 2" would help?

I see two reasons for having "rule 2".

The first reason is that the abbreviation form will look simpler (and
also be easier to understand).

Consider the following JSON:

    [ { "default-key": [ "default-value" ] } ]

With "rule 1" only, it will become:

    [ [ "default-value" ] ]

With both "rule 1" and "rule 2" applied, it will become:

    "default-value"

IMO the issue with only having rule 1 is that the result looks strange.
I am afraid people might wonder what the brackets are for.  Also,
encoders that only encode simple values (i.e. with no need to specify
optional hash parameters) would be required to take care of how many
brackets they should emit.  Having "rule 2" solves these two issues.
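
The three forms above can be reproduced with a short encoder-side
sketch in Python.  The `abbreviate` function and its
`sole_required_keys` argument are hypothetical: rule 1 depends on the
semantics of the header, so the set of keys that are "the only
required element" is simply passed in as an assumption here.

```python
import json


def abbreviate(value, sole_required_keys, use_rule2=True):
    """Apply rule 1 (and optionally rule 2) bottom-up to a JSON value.

    `sole_required_keys` is an assumed stand-in for the header semantics:
    the keys that are the only required element of their hash.
    """
    if isinstance(value, dict):
        short = {k: abbreviate(v, sole_required_keys, use_rule2)
                 for k, v in value.items()}
        if len(value) == 1:
            key, inner = next(iter(value.items()))
            # Rule 1: collapse a single-element hash unless the element's
            # (unabbreviated) type is itself a hash.
            if key in sole_required_keys and not isinstance(inner, dict):
                return short[key]
        return short
    if isinstance(value, list):
        short = [abbreviate(v, sole_required_keys, use_rule2) for v in value]
        # Rule 2: collapse a single-element array unless the element's
        # (unabbreviated) type is itself an array.
        if use_rule2 and len(value) == 1 and not isinstance(value[0], list):
            return short[0]
        return short
    return value


full = [{"default-key": ["default-value"]}]
keys = {"default-key"}
print(json.dumps(abbreviate(full, keys, use_rule2=False)))
print(json.dumps(abbreviate(full, keys)))
```

With rule 1 alone the encoder still has to emit two levels of
brackets; with both rules it emits the bare string.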

The second reason is that it corresponds with "rule 1".  Considering
the fact that hash and array are the only two elements in JSON that
can be used for structuring, I think it would be natural to define
transformations for both of them.

> Best regards, Julian
>
>
>
> On 2016-01-22 08:27, Kazuho Oku wrote:
>>
>> Dear Mr. Julian F. Reschke,
>>
>> Thank you for writing the HTTP JFV draft.
>>
>> I love the concept, and would love to see it being used in all the
>> future header definitions once the JFV draft gets standardized.
>>
>> And regarding the draft, is there any work to introduce abbreviation form?
>>
>> I assume the biggest argument against JFV is that it cannot encode
>> simple things (i.e. objects mostly conveying default values) as simple
>> as in case we use tailor-made ABNF to define the syntax, and think
>> that having an abbreviation form defined in JFV (either as a
>> requirement or as an optional feature) will be a good thing to do.
>>
>> Specifically, I would like to see the following transformations defined:
>>
>> * rule 1) a single-element hash MAY be transformed to the value of the
>> single element if all of the following conditions are met:
>>   * the semantics state that the element is the only required element
>>   * the type of the element is not a hash
>>
>> * rule 2) a single-element array MAY be represented with the single
>> element if the following condition is met:
>>   * the type of the element is not an array
>>
>> With the rules, I believe it is possible to reduce the redundancy imposed
>> by using JSON, while preserving the good aspects of JFV.
>>
>> In the rest of the document, I will describe what made me believe such
>> an abbreviation form should be defined and the impact on the decoder of
>> having the abbreviation form defined within the spec.  Examples using
>> popular HTTP headers are also provided.
>>
>>
>> My Use-Case
>> ---
>>
>> The reason I would like to see abbreviation forms in JFV comes from
>> the discussion with Stefan on how to define the `cache-digest` header.
>> Now, the disagreement between Stefan and me (please refer to
>> https://lists.w3.org/Archives/Public/ietf-http-wg/2016JanMar/0154.html)
>> is whether we should encode a required component (in our case, the
>> digest value) outside of the attribute key-value pairs (option A), or
>> define the component as a required element of the
>> attributes (option B).
>>
>> Examples below show the headers encoded using the two options.  The
>> three headers under option A correspond semantically, in order, to the
>> three under option B.
>>
>> ```
>> Option A:
>>    cache-digest: base64encodedgcs
>>    cache-digest: base64encodedgcs; path="/foo"
>>    cache-digest: base64encodedgcs; path="/foo", anothergcs; path="/foo"; type="if-modified-since"
>>
>> Option B:
>>    cache-digest: fresh="base64encodedgcs"
>>    cache-digest: fresh="base64encodedgcs"; path="/foo"
>>    cache-digest: fresh="base64encodedgcs"; if-modified-since="anothergcs"; path="/foo"
>> ```
>>
>> As is shown in the examples, in the case of the `cache-digest` header,
>> option A yields a more concise output in simple cases, while option B
>> yields smaller output in complex cases due to the fact that it is
>> possible to contain more than one GCS in a single set of attributes.
>>
>> In other words, this is a trade-off issue; and per my understanding
>> the current draft of JFV does not address the problem.
>> The draft always enforces the use of key-value pairs in this case.
>> Therefore, I anticipate that in the future we might see similar
>> arguments for not using JFV when we are to define a new header.
>>
>> Going back to the case of the cache-digest header, ideally I would like
>> to see the entry of the header have the following characteristics:
>>
>> * a cache-digest entry conveys one or more digests, together with
>> attributes that limit the scope of the contained digest (e.g. domain,
>> path)
>> * a digest is a base64-encoded bit-field of various algorithms (e.g.
>> GCS, Bloom filter), representing cache resources that fall into
>> certain category (e.g. fresh, stale-with-if-modified-since-header)
>>
>> And the characteristics lead to a header like the following when JFV
>> is used, which looks even more redundant for the simple cases.
>>
>> ```
>> cache-digest: {
>>                  "digest" : [
>>                    { "value" : "base64encodedgcs" }   // omitted defaults: category=fresh, encoding=gcs
>>                  ]
>>                }
>>
>> cache-digest: {
>>                  "digest" : [
>>                    { "value" : "base64encodedgcs" }   // omitted defaults: category=fresh, encoding=gcs
>>                  ],
>>                  "path"   : "/foo"
>>                }
>>
>> cache-digest: {
>>                  "digest" : [
>>                    { "value" : "base64encodedgcs" },   // omitted defaults: category=fresh, encoding=gcs
>>                    { "value" : "anothergcs", "category" : "if-modified-since" }  // omitted defaults: encoding=gcs
>>                  ],
>>                  "path"   : "/foo"
>>                }
>> ```
>>
>> But if the aforementioned transformations were permitted within the
>> JFV spec, the headers will become much simpler with the
>> transformations applied:
>>
>> ```
>> cache-digest: "base64encodedgcs"
>> cache-digest: {
>>                  "digest": "base64encodedgcs",
>>                  "path"  : "/foo"
>>                }
>> cache-digest: {
>>                  "digest": [
>>                    "base64encodedgcs",
>>                    { "value" : "anothergcs", "category" : "if-modified-since" }
>>                  ],
>>                  "path"  : "/foo"
>>                }
>> ```
>>
>> In this example, the transformations yield a header representation
>> comparable to tailor-made ABNF (option A) for the simplest cases (as
>> shown in the first of the three headers).
>>
>>
>> Impact on the Decoding-side
>> ---
>>
>> Now that it has been shown that (at least in our case) defining the
>> transformations yields more concise output for simple use-cases,
>> let's move on to consider how large the impact of implementing such
>> transformations will be on the decoding-side.
>>
>> Actually, the application-specific part of the decoder does not become
>> complex at all by adding support for the abbreviation form.
>>
>> This is because the reverse transformations can be implemented at the
>> points where type checks were performed.  All that the decoder needs
>> to do to support the abbreviation form is to convert the value to the
>> non-abbreviated type when it arrives in abbreviated form, instead of
>> throwing a decoding error.  As an example, the diff that adds support
>> for the abbreviation form to the decoder that handles the
>> previously-defined cache-digest header can be found at
>>
>> https://gist.github.com/kazuho/c84fa23b26c606e55533/revisions#diff-b071c075a9788be737d99e9159092db8.
>>
>>
>> Other Examples
>> ---
>>
>> Consider using JFV for encoding the `Content-Type` header.
>> Without abbreviation form, it would look like:
>>
>>    Content-Type: { "type": "text/html" }
>>    Content-Type: { "type": "text/html", "charset": "utf-8" }
>>
>> With the abbreviation form, it can be like:
>>
>>    Content-Type: "text/html"
>>    Content-Type: { "type": "text/html", "charset": "utf-8" }
>>
>> Consider using JFV for encoding the `Accept-Encoding` header.
>> Without abbreviation form, it would look like:
>>
>>    Accept-Encoding: { "encoding": "compress" }, { "encoding": "gzip" }
>>    Accept-Encoding: { "encoding": "gzip" }, { "encoding": "identity", "q": 0.5 }, { "encoding": "*", "q": 0 }
>>
>> With the abbreviation form, it can be like:
>>
>>    Accept-Encoding: "compress", "gzip"
>>    Accept-Encoding: "gzip", { "encoding": "identity", "q": 0.5 }, { "encoding": "*", "q": 0 }
>>
>> As can be seen from the examples, if we support abbreviation form in
>> JFV it is possible to encode simple headers as simple as they are now.
>>
>>
>> Conclusion
>> ---
>>
>> Please consider supporting some kind of abbreviation form in JFV; I
>> believe that it would make JFV more attractive to the users, since
>> with abbreviation it is possible to offer both simplicity (for simple
>> cases) and extensibility (of JSON) at the same time.
>>
>



-- 
Kazuho Oku
Received on Wednesday, 27 January 2016 04:16:42 UTC