Re: Bag of Data Anti-pattern (was: Re: DID Spec Closure Process: Harmonize our two Cryptographic Key Material Proposals?) from Mike Lodder on 2018-01-04 (public-credentials@w3.org from January 2018)

From: Mike Lodder <mike.lodder@evernym.com>
Date: Thu, 4 Jan 2018 10:14:15 -0700
To: "=Drummond Reed" <drummond.reed@evernym.com>
Cc: Christopher Lemmer Webber <cwebber@dustycloud.org>, Manu Sporny <msporny@digitalbazaar.com>, Credentials Community Group <public-credentials@w3.org>
Message-ID: <CABW+ph_kKQk9O8-Ncg6-V4-x0H1B0UxxUUfETdnCDNNHU9oRqA@mail.gmail.com>
On Thu, Jan 4, 2018 at 1:37 AM, =Drummond Reed <drummond.reed@evernym.com>
wrote:

> I'm late coming to this thread (just arrived back from vacation last
> night; been catching up all day), so just a few overall comments to grease the
> skids for what I believe will be a very productive discussion of this topic
> on tomorrow's DID Spec Closure call (10AM PT - all are welcome - see Susan
> Bradford's message to the list with details).
>
> First, I've always chafed a little bit at using the "bag of keys"
> analogy to talk about the proposed "keys" array in Proposal #1 of the Cryptographic
> Key Material Proposals for the DID Specification
> <https://docs.google.com/document/d/13fp7V3v1nBuhxTI55Al8KLG2kyxFthBz-Ush-ZL58KA/edit#>
> doc.
>
> Calling it a "bag of keys" and comparing it to a braindead "data:..." field
> is a little like saying a flat-file database is stupid because it doesn't
> do joins. An array of the right kind of metadata to describe a particular
> data structure (like a cryptographic key) can be every bit as useful as a
> simple flat-file database. It's only a dumb "bag of keys" if you ignore the
> proposed design, which: a) makes every key description object uniquely
> globally addressable, and b) proposes a single property of each key
> description object ("type") to describe all the required metadata to select
> the proper key for any particular usage.
>
> The global addressability means that the DID document owner does not have
> to treat the whole array as a single monolithic authorization structure,
> but can create and uniquely address every single member of that array. So
> the DID document owner can be as granular as needed with permissions over
> key description objects.
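
To illustrate the addressability point above, here is a minimal Python
sketch of a flat "keys" array in which every key description object
has its own global id and a single "type" property. The DID, the
fragment ids, the type names, and both helper functions are
hypothetical and are not taken from the proposal document.

```python
from typing import Optional

# Hypothetical DID document; every identifier and type name is illustrative.
DID_DOC = {
    "id": "did:example:123",
    "keys": [
        {"id": "did:example:123#key-1", "type": "ExampleSigningKey"},
        {"id": "did:example:123#key-2", "type": "ExampleAgreementKey"},
        {"id": "did:example:123#key-3", "type": "ExampleSigningKey"},
    ],
}


def find_key_by_id(doc: dict, key_id: str) -> Optional[dict]:
    """Return the single key description object with this global id, if any."""
    for key in doc.get("keys", []):
        if key.get("id") == key_id:
            return key
    return None


def find_keys_by_type(doc: dict, key_type: str) -> list:
    """Return every key description object whose "type" matches exactly."""
    return [key for key in doc.get("keys", []) if key.get("type") == key_type]
```

Because each member has its own id, permissions can refer to one
member (for example "did:example:123#key-2") rather than to the whole
array.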
>
> Secondly, I'm seeing a lot of support, including Joe's proposal, for
> breaking out "purpose" from the rest of the type metadata describing a
> member of this array. On tomorrow's call I want to ask two related
> questions:
>
>    1. Can any of the experienced crypto engineers on the CG give a common
>    example where the same key is used for more than one purpose?
>
This depends on the algorithm, but since most algorithms use elliptic
curves these days, there are only two purposes I can think of: signatures
and key agreement. For key agreement, I could see this used for static,
certificate-like public keys that bootstrap a connection. Otherwise it's
recommended not to use static keys, in order to have forward secrecy,
which means that all other uses are for signatures. Signatures can be used
for either authentication (proving something about you) or authorization
(proving you are allowed to perform a certain action, like reading a file
or updating a database).

RSA is the only algorithm that the general populace would use for
encryption. But if you're going to put a key in the DID Doc for
encryption, it's most likely going to be used for the same purpose as key
agreement, bootstrapping a connection, since only the holder of the
corresponding private key can decrypt the message. Perhaps someone would
use it for one-off messages, but again that's not recommended. So I
believe it's ultimately going to be used for authentication or
authorization. I've always thought that keys in the DID document were
going to be used for authorization: proof that the DID document can be
updated by a particular key.

A private key should not be used by more than one person, as this opens
the door to impersonation. Instead, another party should use its own
private key. If that party is going to be acting on my behalf, then I
should sign a permission slip, or provide some other indication signed by
my private key that they are allowed to perform actions on my behalf. This
is similar to how certificates currently work: the root CA signs an
intermediate CA, which can then sign end-user keys. I know the root CA is
not directly interacting with me, but I know the intermediate CA is
allowed to perform actions on its behalf.
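
A minimal Python sketch of the "permission slip" idea, assuming each
slip names an issuer key, a delegate key, and the actions granted. Only
the chain-walking logic is shown. Signature checking is passed in as a
callable, and toy_verify is a stand-in that performs no cryptography.
All names here (PermissionSlip, chain_allows, the key ids) are
hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PermissionSlip:
    issuer: str         # key id of the party granting permission
    delegate: str       # key id of the party receiving permission
    actions: frozenset  # actions the delegate may perform
    signature: str      # issuer's signature over the slip (opaque here)


def chain_allows(root: str, actor: str, action: str,
                 chain: Sequence[PermissionSlip],
                 verify: Callable[[PermissionSlip], bool]) -> bool:
    """True if the slips link root to actor and each one grants the action."""
    expected_issuer = root
    for slip in chain:
        if slip.issuer != expected_issuer:
            return False  # slip was not issued by the previous link
        if action not in slip.actions:
            return False  # this link never granted the action
        if not verify(slip):
            return False  # signature check failed
        expected_issuer = slip.delegate
    return expected_issuer == actor


def toy_verify(slip: PermissionSlip) -> bool:
    # Stand-in only: a real verifier would check a public-key signature.
    return slip.signature == "signed-by:" + slip.issuer


ROOT_TO_AGENT = PermissionSlip(
    "root-key", "agent-key", frozenset({"update"}), "signed-by:root-key")
AGENT_TO_DEVICE = PermissionSlip(
    "agent-key", "device-key", frozenset({"update"}), "signed-by:agent-key")
```

With these sample slips, device-key is accepted for "update" only when
both slips are presented in order, which mirrors the root CA to
intermediate CA to end-user key pattern.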

>
>    2. Will it be more or less confusing to a developer to have purpose
>    described in a separate property or to have it be part of the type name in
>    a single type property?
>
If keys are used for anything other than updating the DID document, I
think this should be DID-method-specific, because it requires more detail,
and how much detail is required can be a deep rabbit hole.
>
> If the answer to #1 is "yes" and the answer to #2 is "less confusing",
> then I can support breaking out "purpose" into a separate property. If not,
> we should seriously consider the benefits of giving developers a single
> property to match exactly the type of key they should be looking for in any
> particular application (and giving application developers a single URI
> or JSON-LD name to use to specify the type of key to use with their
> particular application).
>
> Thirdly, I support breaking out encodings as proposed by Dave so that the
> same key could have multiple encodings.
>
> Lastly—and I'm going to push really hard on this (having just finished
> Steven Levy's 18-year-old book Crypto
> <https://en.wikipedia.org/wiki/Crypto_(book)>, which reads like it could
> have been written yesterday)—IMHO the problem we are trying to solve with
> this particular topic is KEY MANAGEMENT. As in the key management
> <https://en.wikipedia.org/wiki/Key_management> needed to support public
> key cryptography <https://en.wikipedia.org/wiki/Public-key_cryptography>.
> To be really crisp about this, let me quote the introduction to the
> Wikipedia article:
>
> *Key management* is the name of management of cryptographic keys
> <https://en.wikipedia.org/wiki/Key_(cryptography)> in a cryptosystem
> <https://en.wikipedia.org/wiki/Cryptosystem>. This includes dealing with
> the generation, exchange, storage, use, crypto-shredding
> <https://en.wikipedia.org/wiki/Crypto-shredding> (destruction) and
> replacement of keys. It includes cryptographic protocol
> <https://en.wikipedia.org/wiki/Cryptographic_protocol> design, key servers
> <https://en.wikipedia.org/wiki/Key_server_(cryptographic)>, user
> procedures, and other relevant protocols.[1]
> <https://en.wikipedia.org/wiki/Key_management#cite_note-Turner-What-is-key-management-1>
>
> Key management concerns keys at the user level, either between users or
> systems. This is in contrast to key scheduling
> <https://en.wikipedia.org/wiki/Key_scheduling>, which typically refers to
> the internal handling of keys within the operation of a cipher.
>
> Successful key management is critical to the security of a cryptosystem.
> It is the more challenging side of cryptography
> <https://en.wikipedia.org/wiki/Cryptography> in a sense that it involves
> aspects of social engineering such as system policy, user training,
> organizational and departmental interactions, and coordination between all
> of these elements, in contrast to pure mathematical practices that can be
> automated.
>
>
> This DOES NOT mean I am opposed to DID documents including other types of
> "proofs" that can be used for authentication, such as the oft-mentioned
> biometric proof (which, to my understanding, still works as a type of
> cryptographic key). But it DOES mean that IF those types of data structures
> ARE NOT KEYS, then we should not be trying to manage them in a data
> structure designed to communicate public keys.
>
When I hear "biometric proof on a public ledger" I cringe, because what
exactly does this mean? If it means my biometric data is open to the
public, then anyone can copy it and use it to impersonate me. If it does
not, which I believe is the case here, then it is working like a key and
is used for authentication or authorization, and it is no different from a
regular key EXCEPT that rotating it can be more complex. Just as you said
before, I believe DID documents are about key management, so I go back to
why I wouldn't want biometric data on a public ledger. I could be
convinced otherwise, but this is what I believe for now.
>
> They should go in a different branch of the DID document graph.
>
> That's all my thoughts for tonight—see you on the call tomorrow.
>
> =Drummond
>
>
>
>
> On Wed, Jan 3, 2018 at 12:02 PM, Christopher Lemmer Webber <
> cwebber@dustycloud.org> wrote:
>
>> Manu Sporny writes:
>>
>> > On 12/27/2017 02:16 AM, =Drummond Reed wrote:
>> >> It would allow developers or applications who prefer "naive JSON" to
>> >> use the DID document for basic key management with a simple array of
>> >> keys described by type.
>> >
>> > Before proposing something, it's important to identify a
>> > standards-making anti-pattern that we keep coming back to in this
>> > discussion. Let's call it the "Bag of Data" anti-pattern for now, and it
>> > goes something like this:
>>
>> [... snip ...]
>>
>> > Let's now apply this principle to what's being suggested in "Proposal
>> > #1: Simple Flat Array of Key Description Objects":
>> >
>> > {
>> >   "keys": [ ... ]
>> > }
>> >
>> > This data structure raises the following questions:
>> >
>> > Q1. Are those keys the DID owns or any key that the DID document
>> >     references?
>> > Q2. Are all software applications allowed to add/remove any key in that
>> >     array, or just a subset?
>> > Q3. Can I put a biometric authentication mechanism in that array?
>> > Q4. Can I specify a key reference into that array, or can I put
>> >     complete key descriptions elsewhere in the document?
>> >
>> > As a developer, I could easily jump to the conclusion that:
>> >
>> > A1. All keys must go in this array, including descriptions of other
>> >     people's keys.
>> > A2. Any software application is allowed to manage any key in that array.
>> > A3. Biometrics should not go in that array.
>> > A4. I must put all keys in that array and nowhere else, even if they're
>> >     buried deep in another subtree of the data structure.
>>
>> It may be useful to look at how having a more general key bucket can
>> lead to some real security vulnerabilities.  First off, I really
>> recommend reading this article:
>>
>>   https://sandstorm.io/news/2015-05-01-is-that-ascii-or-protobuf
>>
>> In that article a scenario is given where you use your key to respond to
>> an arbitrary-looking random string in a challenge-response type system
>> to prove that you are who you say you are.  Sounds good... except it
>> turns out that that random looking string was actually a datastructure
>> that was asking you to authorize a payment, and your client signs it.
>>
>>   "Oops, it appears you just signed a bank check after all – and you
>>   were trying so hard not to!"
>>
>> We can easily imagine such attacks in our own domain.  Imagine someone
>> is using their DID + key materials for the following purposes:
>>
>>  - In a peer to peer protocol where nodes are signing that yes, they
>>    really did see another node produce this object at a particular time
>>  - An authorization of payment, or anything else serious (surgery,
>>    sharing of private information, selling your house, etc)
>>
>> Without distinguishing between the function of keys, it would be
>> possible to trick a node into performing the former with a key in their
>> general keys bucket but to use that object for the latter purpose.
>> A verifier being presented the object in the context of the latter
>> purpose (but which was signed by the node under the former purpose) may
>> say "oh yes, all looks good here... go forward with that payment /
>> surgery / sale of house!"
>>
>> That seems like a serious issue to consider?
>>
>>  - Christopher Lemmer Webber
>>
>>
>

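The cross-purpose attack Christopher describes can be sketched as a
check on the verifier's side. This is a minimal Python illustration,
assuming each key id is mapped to the purposes it was declared for. The
key ids, purpose names, and function names are hypothetical, and the
signature result is passed in as a boolean rather than computed.

```python
# Hypothetical mapping from key id to the purposes declared for that key.
KEY_PURPOSES = {
    "did:example:123#gossip-key": {"witness"},
    "did:example:123#payment-key": {"authorize-payment"},
}


def key_allowed_for(key_id: str, purpose: str,
                    key_purposes: dict = KEY_PURPOSES) -> bool:
    """True only if the key was explicitly declared for this purpose."""
    return purpose in key_purposes.get(key_id, set())


def accept_signed_object(key_id: str, context_purpose: str,
                         signature_valid: bool) -> bool:
    """Accept only if the signature verifies AND the key's declared
    purpose matches the context in which the object is presented."""
    return signature_valid and key_allowed_for(key_id, context_purpose)
```

An object signed with the gossip key still carries a valid signature,
but accept_signed_object rejects it when it is presented as a payment
authorization. With an undifferentiated key bucket, that second check
has nothing to test against.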

-- 
Mike Lodder
Senior Crypto Engineer
Received on Thursday, 4 January 2018 17:14:46 UTC