Re: Bag of Data Anti-pattern (was: Re: DID Spec Closure Process: Harmonize our two Cryptographic Key Material Proposals?) from =Drummond Reed on 2018-01-04 (public-credentials@w3.org from January 2018)

From: =Drummond Reed <drummond.reed@evernym.com>
Date: Thu, 4 Jan 2018 00:37:58 -0800
To: Christopher Lemmer Webber <cwebber@dustycloud.org>
Cc: Manu Sporny <msporny@digitalbazaar.com>, Credentials Community Group <public-credentials@w3.org>
Message-ID: <CAAjunnabsL=Sw1OHTUrh5_Agr-ob5t1fK_5NbBFR62jm_Yy7nw@mail.gmail.com>

I'm late coming to this thread (just arrived back from vacation last
night and have been catching up all day), so just a few overall comments to grease the
skids for what I believe will be a very productive discussion of this topic
on tomorrow's DID Spec Closure call (10AM PT - all are welcome - see Susan
Bradford's message to the list with details).

First, I've always chafed a little bit at using the "bag of keys"
analogy to talk about the proposed "keys" array in Proposal #1 of the
Cryptographic
Key Material Proposals for the DID Specification
<https://docs.google.com/document/d/13fp7V3v1nBuhxTI55Al8KLG2kyxFthBz-Ush-ZL58KA/edit#>
doc.

Calling it a "bag of keys" and comparing it to a braindead "data:..." field is
a little like saying a flat-file database is stupid because it doesn't do
joins. An array of the right kind of metadata to describe a particular data
structure (like a cryptographic key) can be every bit as useful as a simple
flat-file database. It's only a dumb "bag of keys" if you ignore the
proposed design, which: a) makes every key description object uniquely
globally addressable, and b) proposes a single property of each key
description object ("type") to describe all the required metadata to select
the proper key for any particular usage.
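
As a rough illustration of point (b), here is a minimal sketch of selecting
a key by its "type" alone. The "id" values, the type names, and the
"publicKeyHex" field are hypothetical placeholders, not names taken from
the proposal document.

```python
# Hypothetical DID document fragment. The "id" values, type names, and
# "publicKeyHex" field are illustrative assumptions, not proposal text.
did_document = {
    "keys": [
        {"id": "did:example:123#key-1", "type": "ExampleSigningKey",
         "publicKeyHex": "aa11"},
        {"id": "did:example:123#key-2", "type": "ExampleEncryptionKey",
         "publicKeyHex": "bb22"},
    ]
}


def select_keys(document, key_type):
    """Return every key description object whose "type" matches exactly."""
    return [key for key in document.get("keys", [])
            if key.get("type") == key_type]


signing_keys = select_keys(did_document, "ExampleSigningKey")
print([key["id"] for key in signing_keys])
```

Under this assumption an application needs to know only one type name to
find the key it should use.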

The global addressability means that the DID document owner does not have
to treat the whole array as a single monolithic authorization structure,
but can create and uniquely address every single member of that array. So
the DID document owner can be as granular as needed with permissions over
key description objects.
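
To make the granularity point concrete, the sketch below attaches update
permissions to individual key identifiers rather than to the array as a
whole. The identifiers, the permission table, and the may_update helper are
all assumptions made for illustration; the proposal does not define them.

```python
# Hypothetical key description objects; IDs and type names are assumptions.
keys = [
    {"id": "did:example:123#key-1", "type": "ExampleSigningKey"},
    {"id": "did:example:123#key-2", "type": "ExampleEncryptionKey"},
]

# Index the array by each member's globally unique identifier.
keys_by_id = {key["id"]: key for key in keys}

# Permissions are recorded per key ID, not for the array as a whole.
update_permissions = {
    "did:example:123#key-1": {"did:example:123"},
    "did:example:123#key-2": {"did:example:123", "did:example:456"},
}


def may_update(actor, key_id):
    """True if `actor` may update the key identified by `key_id`."""
    return (key_id in keys_by_id
            and actor in update_permissions.get(key_id, set()))


print(may_update("did:example:456", "did:example:123#key-2"))
print(may_update("did:example:456", "did:example:123#key-1"))
```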

Secondly, I'm seeing a lot of support, including Joe's proposal, for
breaking out "purpose" from the rest of the type metadata describing a
member of this array. On tomorrow's call I want to ask two related
questions:

   1. Can any of the experienced crypto engineers on the CG give a common
   example where the same key is used for more than one purpose?
   2. Will it be more or less confusing to a developer to have purpose
   described in a separate property or to have it be part of the type name in
   a single type property?

If the answer to #1 is "yes" and the answer to #2 is "less confusing", then
I can support breaking out "purpose" into a separate property. If not, we
should seriously consider the benefits of giving developers a single
property to match exactly the type of key they should be looking for in any
particular application (and giving application developers a single URI
or JSON-LD name to use to specify the type of key to use with their
particular application).
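
The two shapes under discussion can be compared side by side. This is only
a sketch: the type names, the "purpose" values, and both matching helpers
are hypothetical, chosen to show what a developer would have to write in
each case.

```python
# Option A: purpose folded into a single type name (hypothetical name).
key_a = {
    "id": "did:example:123#key-1",
    "type": "ExampleEd25519AuthenticationKey",
}

# Option B: purpose broken out into its own property (hypothetical names).
key_b = {
    "id": "did:example:123#key-1",
    "type": "ExampleEd25519Key",
    "purpose": ["authentication", "signing"],
}


def matches_single_type(key, wanted_type):
    """Option A: one exact comparison on "type"."""
    return key.get("type") == wanted_type


def matches_type_and_purpose(key, wanted_type, wanted_purpose):
    """Option B: compare "type", then look for the purpose in the list."""
    return (key.get("type") == wanted_type
            and wanted_purpose in key.get("purpose", []))


print(matches_single_type(key_a, "ExampleEd25519AuthenticationKey"))
print(matches_type_and_purpose(key_b, "ExampleEd25519Key", "signing"))
```

Option B lets one key description serve more than one purpose, which only
matters if the answer to question 1 is "yes"; Option A keeps the match to a
single comparison.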

Thirdly, I support breaking out encodings as proposed by Dave so that the
same key could have multiple encodings.
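
One way to picture a key carrying multiple encodings is sketched below. The
"encodings" property name and its layout are assumptions for illustration
and may not match the shape Dave proposed; the key bytes are a placeholder,
not a real key.

```python
import base64

raw_public_key = bytes(range(32))  # placeholder bytes, not a real key

# Hypothetical key description with the same key material in two encodings.
key = {
    "id": "did:example:123#key-1",
    "type": "ExampleSigningKey",
    "encodings": {
        "hex": raw_public_key.hex(),
        "base64": base64.b64encode(raw_public_key).decode("ascii"),
    },
}

# Decoders this (hypothetical) client happens to support.
DECODERS = {"hex": bytes.fromhex, "base64": base64.b64decode}


def decode_key(key, preferred):
    """Decode using the first encoding in `preferred` that the key offers."""
    for name in preferred:
        if name in key["encodings"] and name in DECODERS:
            return DECODERS[name](key["encodings"][name])
    raise ValueError("no supported encoding found")


print(decode_key(key, ["base64", "hex"]) == raw_public_key)
```

A client that understands only one of the encodings can still use the key,
which is the practical benefit of breaking encodings out.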

Lastly—and I'm going to push really hard on this (having just finished
Steven Levy's 18-year-old book Crypto
<https://en.wikipedia.org/wiki/Crypto_(book)>, which reads like it could
have been written yesterday)—IMHO the problem we are trying to solve with
this particular topic is KEY MANAGEMENT. As in the key management
<https://en.wikipedia.org/wiki/Key_management> needed to support public key
cryptography <https://en.wikipedia.org/wiki/Public-key_cryptography>. To be
really crisp about this, let me quote the introduction to the Wikipedia
article:

*Key management* is the name of management of cryptographic keys
<https://en.wikipedia.org/wiki/Key_(cryptography)> in a cryptosystem
<https://en.wikipedia.org/wiki/Cryptosystem>. This includes dealing with
the generation, exchange, storage, use, crypto-shredding
<https://en.wikipedia.org/wiki/Crypto-shredding> (destruction) and
replacement of keys. It includes cryptographic protocol
<https://en.wikipedia.org/wiki/Cryptographic_protocol> design, key servers
<https://en.wikipedia.org/wiki/Key_server_(cryptographic)>, user
procedures, and other relevant protocols.[1]
<https://en.wikipedia.org/wiki/Key_management#cite_note-Turner-What-is-key-management-1>

Key management concerns keys at the user level, either between users or
systems. This is in contrast to key scheduling
<https://en.wikipedia.org/wiki/Key_scheduling>, which typically refers to
the internal handling of keys within the operation of a cipher.

Successful key management is critical to the security of a cryptosystem. It
is the more challenging side of cryptography
<https://en.wikipedia.org/wiki/Cryptography> in a sense that it involves
aspects of social engineering such as system policy, user training,
organizational and departmental interactions, and coordination between all
of these elements, in contrast to pure mathematical practices that can be
automated.


This DOES NOT mean I am opposed to DID documents including other types of
"proofs" that can be used for authentication, such as the oft-mentioned
biometric proof (which, to my understanding, still works as a type of
cryptographic key). But it DOES mean that IF those types of data structures
ARE NOT KEYS, then we should not be trying to manage them in a data
structure designed to communicate public keys.

They should go in a different branch of the DID document graph.
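
A minimal sketch of what "a different branch" could look like follows. The
"otherProofs" property name and the type names are invented for this
example and appear in no proposal; the point is only that key-management
code reads one branch and never has to filter out non-keys.

```python
# Hypothetical DID document with keys and non-key proofs in separate
# branches. "otherProofs" and the type names are illustrative assumptions.
did_document = {
    "keys": [
        {"id": "did:example:123#key-1", "type": "ExampleSigningKey"},
    ],
    "otherProofs": [
        {"id": "did:example:123#proof-1", "type": "ExampleNonKeyProof"},
    ],
}


def public_keys(document):
    """Key management reads only the "keys" branch."""
    return document.get("keys", [])


def other_proofs(document):
    """Non-key proof mechanisms live in their own branch."""
    return document.get("otherProofs", [])


print([entry["id"] for entry in public_keys(did_document)])
print([entry["id"] for entry in other_proofs(did_document)])
```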

That's all my thoughts for tonight—see you on the call tomorrow.

=Drummond




On Wed, Jan 3, 2018 at 12:02 PM, Christopher Lemmer Webber <
cwebber@dustycloud.org> wrote:

> Manu Sporny writes:
>
> > On 12/27/2017 02:16 AM, =Drummond Reed wrote:
> >> It would allow developers or applications who prefer "naive JSON" to
> >> use the DID document for basic key management with a simple array of
> >> keys described by type.
> >
> > Before proposing something, it's important to identify a
> > standards-making anti-pattern that we keep coming back to in this
> > discussion. Let's call it the "Bag of Data" anti-pattern for now, and it
> > goes something like this:
>
> [... snip ...]
>
> > Let's now apply this principle to what's being suggested in "Proposal
> > #1: Simple Flat Array of Key Description Objects":
> >
> > {
> >   "keys": [ ... ]
> > }
> >
> > This data structure raises the following questions:
> >
> > Q1. Are those keys the DID owns or any key that the DID document
> >     references?
> > Q2. Are all software applications allowed to add/remove any key in that
> >     array, or just a subset?
> > Q3. Can I put a biometric authentication mechanism in that array?
> > Q4. Can I specify a key reference into that array, or can I put
> >     complete key descriptions elsewhere in the document?
> >
> > As a developer, I could easily jump to the conclusion that:
> >
> > A1. All keys must go in this array, including descriptions of other
> >     people's keys.
> > A2. Any software application is allowed to manage any key in that array.
> > A3. Biometrics should not go in that array.
> > A4. I must put all keys in that array and nowhere else, even if they're
> >     buried deep in another subtree of the data structure.
>
> It may be useful to look at how having a more general key bucket can
> lead to some real security vulnerabilities.  First off, I really
> recommend reading this article:
>
>   https://sandstorm.io/news/2015-05-01-is-that-ascii-or-protobuf
>
> In that article a scenario is given where you use your key to respond to
> an arbitrary-looking random string in a challenge-response type system
> to prove that you are who you say you are.  Sounds good... except it
> turns out that that random-looking string was actually a data structure
> that was asking you to authorize a payment, and your client signs it.
>
>   "Oops, it appears you just signed a bank check after all – and you
>   were trying so hard not to!"
>
> We can easily imagine such attacks in our own domain.  Imagine someone
> is using their DID + key materials for the following purposes:
>
>  - In a peer to peer protocol where nodes are signing that yes, they
>    really did see another node produce this object at a particular time
>  - An authorization of payment, or anything else serious (surgery,
>    sharing of private information, selling your house, etc)
>
> Without distinguishing between the function of keys, it would be
> possible to trick a node into performing the former with a key in their
> general keys bucket but to use that object for the latter purpose.
> A verifier being presented the object in the context of the latter
> purpose (but which was signed by the node under the former purpose) may
> say "oh yes, all looks good here... go forward with that payment /
> surgery / sale of house!"
>
> That seems like a serious issue to consider?
>
>  - Christopher Lemmer Webber
>
>
Received on Thursday, 4 January 2018 08:38:24 UTC