- From: Poul-Henning Kamp <phk@phk.freebsd.dk>
- Date: Tue, 12 Jul 2016 08:22:03 +0000
- To: Julian Reschke <julian.reschke@gmx.de>
- cc: Carsten Bormann <cabo@tzi.org>, Willy Tarreau <w@1wt.eu>, Yanick Rochon <yanick.rochon@gmail.com>, Phil Hunt <phil.hunt@oracle.com>, HTTP Working Group <ietf-http-wg@w3.org>
--------
In message <928f8531-6573-caf6-50c1-1672cc020959@gmx.de>, Julian Reschke writes:

TL;DR: I would prefer CBOR and base{64|85}(CBOR) to JSON.

>2) People abusing it to add comments to JSON (by choosing a member name
>for comments, and repeating it).

That is one of the places where I went "Doh?" when I first saw JSON:
who the heck defines a data-interchange format with no support for
comments?!

The answer, of course, is that JSON is not a data-interchange format;
it is a subset of the JavaScript source code syntax, and they didn't
include comments in that subset.

JSON being "source compatible" has been correctly criticized for being
a wide-open invitation to unsafe data-ingestion practices, with plenty
of real-world instances to validate the argument.

We can take it as an absolute certainty that if we choose JSON
serialization, too many people will just "eval" it into JavaScript,
opening themselves wide to malicious point-edits[1] and hostile actors.

This is of course not our first priority for deciding, but it counts
clearly against JSON.

>> This discussion may be a bit off-topic for the HTTP WG, but I think it
>> is important to understand JSON when using it in HTTP.
>
>Absolutely; and the conclusion might well be that we won't use JSON on
>the wire.

Carsten's CBOR presentation from IETF94 is very good background here,
and as a competent data-interchange format CBOR (RFC7049) will get my
vote over JSON any time.

The problem with CBOR is that there does not seem to be a "CAOR" we
can use with H1[2], but maybe that is a good thing.

Eyeballing it, the size of base85(CBOR(obj)) or even base64(CBOR(obj))
seems to be acceptably close to JSON.

An implementation would use the same CBOR serdes code for both H1 and
H2, with a trivial gloss of baseXX serdes on top for H1.

I like that; in particular I like that it makes H1<-->H2 interop
trivial, streamable, reliable and debuggable.

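The size claim can be checked roughly with a sketch like the one below. It is not part of the original message: the encoder is a hand-rolled subset of RFC 7049 (integers, floats as 64-bit doubles, byte and text strings, arrays, maps, true/false/null), and `cbor_encode`, `sizes` and the sample object are hypothetical names and data chosen purely for illustration.

```python
import base64
import json
import struct


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (RFC 7049, section 2.1)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    # Values that do not fit in 64 bits raise struct.error here.
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def cbor_encode(obj) -> bytes:
    """Encode a small subset of Python values as CBOR (illustrative only)."""
    # Check the singletons first: bool is a subclass of int in Python.
    if obj is False:
        return b"\xf4"
    if obj is True:
        return b"\xf5"
    if obj is None:
        return b"\xf6"
    if isinstance(obj, int):
        # Major type 0 for non-negative, major type 1 carries -1 - n.
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, float):
        # Always a 64-bit double; a real encoder might pick a shorter form.
        return b"\xfb" + struct.pack(">d", obj)
    if isinstance(obj, bytes):
        return _head(2, len(obj)) + obj
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, (list, tuple)):
        return _head(4, len(obj)) + b"".join(cbor_encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def sizes(obj) -> dict:
    """Return the byte length of each candidate serialization of obj."""
    packed = cbor_encode(obj)
    return {
        "json": len(json.dumps(obj, separators=(",", ":")).encode("utf-8")),
        "cbor": len(packed),
        "base64(cbor)": len(base64.b64encode(packed)),
        "base85(cbor)": len(base64.b85encode(packed)),
    }


if __name__ == "__main__":
    # Hypothetical structured header value, not taken from any specification.
    sample = {"lang": ["en", "da"], "q": 0.8, "max-age": 3600, "private": True}
    for name, length in sizes(sample).items():
        print(f"{name:>13}: {length} bytes")
```

Note that `base64.b85encode` uses the git-style base85 alphabet, which includes characters such as `;` that have their own meaning in header syntax. Which alphabet would actually be usable in an H1 header field is left open here.
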
Security-wise it is also preferable, because it forces everybody to
use proper serdes code[3], and it permanently closes the door to
unsafe hacks such as 's/q=[0-9.]*/q=0/'.

CBOR gets my vote.

Poul-Henning

[1] My security friends would point to the general principle that one
should avoid serializations which can be "point-edited" to change
semantic content. (Textbook example: computer-printed cheques filling
the amount field with non-space: "****1000$*".)

Such "point-edited" HTTP headers are in widespread use in H1, for
instance "nnoCection:" and similar hacks, but they are also used on
the other side of the colon by smut-filters and "privacy controls",
typically mangling Location:, (Set-)Cookie: and in some cases the URL.

H2 and HPACK make that shortcut a lot longer, but I very much suspect
that these devices may retain their "application logic" and just
deserialize, mangle and reserialize the headers.

[2] CBOR does have a "Diagnostic Notation" which renders it to text,
but it is not meant to be a data-interchange format, and as a
debugging tool it includes information that is surplus to (our)
requirements.

[3] At least until CBOR becomes valid syntax in INTERCAL.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Received on Tuesday, 12 July 2016 08:22:31 UTC