Binary Encoding for Crypto-conditions

Currently, our spec for cryptoconditions uses a totally custom binary
format inspired by protocol buffers.

Before we went any further, I wanted to stop and consider whether there is
a better format that we can use. So with this email I want to share my
research and recommendation.

TL;DR We can get most of the awesomeness of ASN.1 without any of the suck
of DER, by using ASN.1 with OER, which incidentally is very similar to our
current custom format. (For instance, the encoding for 0-127 byte long
VARBYTES is identical.)


First off, what type of binary format are we looking for?

It should be:

- Canonical (one correct way to encode things)
- Reasonably simple to implement
- Unlikely to cause bugs
- Well defined with good documentation
- Compact
- Fast to encode/decode
- Standard

What formats are out there?

- ASN.1 BER/DER (X.690) - This is what most crypto uses today. However, to
quote Tony, it is "almost universally reviled among implementers." It's
relatively inefficient size-wise. Probably my biggest gripe with it is that
it doesn't support unsigned integers, which is just annoying and a common
source for bugs. [1]
- ASN.1 PER (X.691) - Super-densely packed encoding. Used for mobile
communications etc.
- ASN.1 OER (X.696) - One of the newest ASN.1 Encoding Rules. Optimized for
very fast encoding/decoding, simplicity and small size (not as small as
PER, but smaller than BER/DER.)
- ASN.1 ECN (X.692) - Basically: Use ASN.1, but define your own encoding
rules or change existing ones in any way you like.
- RLP - Used by Ethereum. Not a standard, but very minimal.
- XDR (RFC 4506) - Similar idea as ASN.1, but there aren't as good of tools
available. For instance, I couldn't find a generic web-based XDR
encoder/decoder playground.

Notable mentions:
- Macaroons is working on a custom format, but it seems very specific to
their use case. [3]

After looking at these options, I noticed that ASN.1 sticks out as being
very widely adopted in industry. It is used in many crypto standards
including PKCS and X.509, but it also powers network protocols like SNMP,
LDAP and H.323, cellphone networks (from GPRS all the way up to LTE) and
even aerospace telecommunication networks.

I noticed that there are lots of tools available for ASN.1 - I downloaded a
free Eclipse-based ASN.1 editor [4] which would validate my syntax
definitions in real time and color them nicely. Then I could take my syntax
definition and paste it into the ASN playground [5] which would instantly
encode examples in DER, PER, XER and OER so I could see what they look
like. It is awesome to have that sort of tooling available both for this
spec and for future extensions.

If you're interested to learn more about ASN.1, I can highly recommend
these slides: [8]

The ability to autodetect inconsistencies in the spec and to generate
examples without relying on the reference implementation you're trying to
test has been incredibly helpful for me. And using a standardized protocol
makes sense because then we can keep our spec purely about
crypto-conditions instead of having to define some new binary format as
well. Because of these reason, ASN.1 seems like the way to go, but which
encoding should we use?

PER is too complicated and we don't need that level of packing. We do want
it to be a canonical encoding, so that leaves DER, CANONICAL-OER or
something custom with an ECN description.

DER has the advantage that there are implementations for a lot of
languages. But it is relatively complicated and inefficient.

OER is very fast to encode/decode [6] and easy to implement. [7] There
aren't as many existing implementations, but because it's so simple and so
close to our previous format - which we were already writing parsers for -
we don't really lose anything. Also, there is very little I'd want to
change about OER, so I think we should just use it as is, without any
ECN-type customizations.

So based on my research, I would recommend ASN.1 with OER as the binary
format for crypto-conditions.

I'm happy to start working on a pull request to update the spec and code,
but please let me know if you have any other information regarding binary
encoding that I missed.

Big thanks to Tony Arcieri for comments on DER and pointing out the
Macaroons effort and Jehan Tremback for pointing out the RLP format.

[1]
http://stackoverflow.com/questions/12860226/oddity-when-encoding-large-integers-using-asn-1
[2] https://github.com/ethereum/wiki/wiki/RLP
[3] https://github.com/rescrv/libmacaroons/blob/master/doc/format.txt
[4] http://www.asnlab.org/asndt/overview
[5] http://asn1-playground.oss.com/
[6]
https://www.researchgate.net/publication/275249582_Performance_Analysis_of_Existing_16092_Encodings_v_ASN1
[7]
http://www.oss.com/asn1/resources/books-whitepapers-pubs/Overview%20of%20OER.pdf
[8]
http://www.ieee802.org/802_tutorials/2010-11/Alessandro%20Triglia%20-%20ASN.1%20Tutorial%20Dallas%20-%2020101108T1701.pptx

Received on Wednesday, 30 March 2016 02:37:05 UTC