Re: JSON headers - No: CBOR headers

--------
In message <928f8531-6573-caf6-50c1-1672cc020959@gmx.de>, Julian Reschke writes:

TL;DR:  I would prefer CBOR and base{64|85}(CBOR) to JSON.


>2) People abusing it to add comments to JSON (by choosing a member name 
>for comments, and repeating it).

That is one of the places where I went "Doh?" when I first saw JSON:
who the heck defines a data-interchange format with no support for
comments?!

The answer, of course, is that JSON is not a data-interchange format,
it is a subset of the JavaScript source code syntax, and they didn't
include comments in that subset.

JSON being "source compatible" has been correctly criticized as
a wide-open invitation to unsafe data-ingestion practices, with
plenty of real-world instances to validate the argument.

We can take it as an absolute certainty that if we choose JSON
serialization, too many people will just "eval" it into JavaScript,
opening them wide to malicious point-edits[1] and hostile actors.
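
To make the "eval" hazard concrete, here is a minimal sketch in
Python, whose eval() stands in for the JavaScript one.  The payload
and the "log" list are made up for the illustration:

```python
import json

# Hypothetical payload from an untrusted peer: it looks like data, but
# it is really an expression with a side effect.
payload = '[log.append("attacker code ran"), {"q": 0}][1]'

log = []

# Unsafe: evaluating the text as source code runs whatever is in it.
unsafe_result = eval(payload, {"log": log})

# Safe: a real parser accepts only the data grammar and rejects the rest.
try:
    json.loads(payload)
    parser_rejected = False
except json.JSONDecodeError:
    parser_rejected = True
```

The eval() path hands back an innocent-looking {"q": 0} after the
embedded call has already run; the parser path refuses the input.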

This is of course not our first priority for deciding, but it counts
clearly against JSON.

>> This discussion may be a bit off-topic for the HTTP WG, but I think it
>> is important to understand JSON when using it in HTTP.
>
>Absolutely; and the conclusion might well be that we won't use JSON on 
>the wire.

Carsten's CBOR presentation from IETF94 is very good background here,
and as a competent data-interchange format CBOR (RFC7049) will get
my vote over JSON any time.

The problem with CBOR is that there does not seem to be a "CAOR"
we can use with H1[2], but maybe that is a good thing.

Eyeballing it, the size of base85(CBOR(obj)) or even base64(CBOR(obj))
seems to be acceptably close to that of JSON.
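
The size comparison can be checked with a rough sketch like the one
below.  It assumes a hand-rolled encoder covering only a small CBOR
subset (unsigned integers, text strings, arrays, maps), a made-up
sample object, and Python's b85encode as the base85 variant; a real
implementation would use a full CBOR library.

```python
import base64
import json
import struct


def _head(major, n):
    """CBOR initial byte plus argument for major type `major` (RFC 7049)."""
    if n < 24:
        return bytes([(major << 5) | n])
    for extra, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                              (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([(major << 5) | extra]) + struct.pack(fmt, n)
    raise ValueError("integer too large for this sketch")


def cbor_encode(obj):
    """Encode a small subset of Python values as CBOR bytes."""
    if isinstance(obj, int) and not isinstance(obj, bool) and obj >= 0:
        return _head(0, obj)                      # major type 0: unsigned int
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw           # major type 3: text string
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(cbor_encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v)
                        for k, v in obj.items())
        return _head(5, len(obj)) + body          # major type 5: map
    raise TypeError("type not covered by this sketch: %r" % type(obj))


# Hypothetical structured header value, not taken from any specification.
obj = {"accept": ["text/html", "application/json"], "max-age": 3600}

cbor = cbor_encode(obj)
sizes = {
    "json": len(json.dumps(obj, separators=(",", ":")).encode("utf-8")),
    "cbor": len(cbor),
    "base64": len(base64.b64encode(cbor)),
    "base85": len(base64.b85encode(cbor)),
}
```

For this one object the sketch gives 58 bytes of compact JSON against
47 bytes of CBOR, 64 of base64(CBOR) and 59 of base85(CBOR).  Other
objects will shift the numbers, so treat this as one data point only.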

An implementation would use the same CBOR serdes code for both H1
and H2, with a trivial gloss of baseXX serdes on top for H1.
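
As a sketch of that layering: the CBOR bytes are treated as opaque
output of whatever serdes code is in use, H2 is assumed to carry them
unchanged, and H1 gets a base64 wrapper.  The function names are
invented for this example and the CBOR value is written out by hand.

```python
import base64

# CBOR bytes for the map {"q": 0}, written out by hand:
# 0xa1 = map with one pair, 0x61 0x71 = text string "q", 0x00 = unsigned 0.
cbor_value = bytes([0xA1, 0x61, 0x71, 0x00])


def h2_encode(cbor):
    """Assumed H2 path: the binary value goes on the wire as-is."""
    return cbor


def h2_decode(wire):
    return wire


def h1_encode(cbor):
    """H1 path: same CBOR bytes, wrapped in base64 to stay textual."""
    return base64.b64encode(cbor).decode("ascii")


def h1_decode(text):
    return base64.b64decode(text, validate=True)


def h1_to_h2(text):
    """A gateway only strips the wrapper; it never re-serializes the CBOR."""
    return h2_encode(h1_decode(text))


h1_wire = h1_encode(cbor_value)
h2_wire = h1_to_h2(h1_wire)
```

Because the gateway step only adds or removes the baseXX layer, the
CBOR payload itself is byte-identical on both sides.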

I like that, in particular I like that it makes H1<-->H2 interop
trivial, streamable, reliable and debuggable.

Security-wise it is also preferable, because it forces everybody
to use proper serdes code[3], and it permanently closes the door
to unsafe hacks such as 's/q=[0-9.]*/q=0/'.
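
A small sketch of why that substitution stops working.  The textual
header below is a made-up example, and the encoded value is a
hand-written CBOR map {"q": 1} under base64:

```python
import base64
import re

# The sed expression quoted above, as a Python regular expression.
POINT_EDIT = re.compile(r"q=[0-9.]*")

# Textual header: the one-line substitution silently rewrites the semantics.
text_header = "text/html;q=0.8, application/json;q=0.5"
edited_text = POINT_EDIT.sub("q=0", text_header)

# CBOR for {"q": 1}: 0xa1 map(1), 0x61 0x71 text "q", 0x01 unsigned 1.
encoded_header = base64.b64encode(bytes([0xA1, 0x61, 0x71, 0x01])).decode("ascii")

# The same substitution finds nothing to grab in the encoded form.
edited_encoded = POINT_EDIT.sub("q=0", encoded_header)
```

To change the value, a middlebox would have to decode, modify and
re-encode it, which is exactly the "proper serdes" path.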

CBOR gets my vote.

Poul-Henning

[1] My security friends would point to the general principle that one
    should avoid serializations which can be "point-edited" to change
    semantic content.  (Textbook example:  computer-printed cheques
    filling the amount field with non-space: "****1000$*".)

    Such "point-edited" HTTP headers are in widespread use in H1,
    for instance "nnoCection:" and similar hacks, but they are also
    used on the other side of the colon by smut-filters and "privacy
    controls", typically mangling Location:, (Set-)Cookie: and in some
    cases the URL.

    H2 and HPACK make that shortcut a lot longer, but I very much
    suspect that these devices may retain their "application logic"
    and just deserialize, mangle and reserialize the headers.

[2] CBOR does have a "Diagnostic Notation" which renders it to text, but
    it is not meant to be a data-interchange format, and as a debugging
    tool includes information that is surplus to (our) requirements.

[3] At least until CBOR becomes valid syntax in INTERCAL.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Received on Tuesday, 12 July 2016 08:22:31 UTC