Re: [meetings] Agenda Request - (CPA?) Composable privacy-preserving architecture (#65)

@rmirisola Thanks for proposing this architecture. This is interesting and certainly looks like a scalable and long-term approach for handling complex advertising scenarios. I’ve been working with @taubeneck on a related problem for IPA, and wanted to share some ideas that might be helpful.

With respect to the first assumption, we anticipate that it will be critical for user-agent vendors to be able to specify the set of purposes (i.e., use cases) for a specific piece of encrypted data. 

We’ve been exploring an architecture that makes all data (whether coming from a user device or from a private computation network) purpose-constrained. In particular, the data must always remain encrypted and accompanied by annotations that describe which use cases are allowed to decrypt and use it. At the origin of this data, i.e., on the user device, the browser or mobile OS encrypts the user-generated data using AEAD encryption, with the annotations kept public as associated data. (This step could also happen on the server side.) Later, when this data is fed into a private computation network for a given use case, the network first verifies whether the annotations allow using the data for that use case. If so, the network decrypts and computes; otherwise, it drops the rows. After computation, the network encrypts the resulting data (which can be either the final result or an intermediate result) and further annotates it with the use cases that may reuse it. For a clearer description, consider the following interface:

```
def private_computation(raw_input_rows, current_usecase, key_current_usecase,
                        next_usecase, key_next_usecase):
    # keep only the rows whose annotations allow the current use case
    input_rows = []
    for row in raw_input_rows:
        if current_usecase in row.annotation.allowed_usecases:
            input_rows.append(row)

    decrypted_rows = decryption(input_rows, key_current_usecase)

    # privately compute the use case on the decrypted rows;
    # in addition, add DP noise at the end if this is the final result
    output_rows = compute_usecase(decrypted_rows, current_usecase)

    # annotate the output with the use cases that may reuse it,
    # then encrypt it for the next use case
    annotated_output_rows = annotate(output_rows, next_usecase)
    encrypted_output_rows = encryption(annotated_output_rows, key_next_usecase)
    # we could also generate a new key and encrypt with that;
    # in that case, the new key has to be communicated to the next network

    return encrypted_output_rows
```
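
To make the device-side step concrete, here is a minimal runnable sketch, assuming AES-GCM from the Python `cryptography` package; `encrypt_on_device` and the row layout are hypothetical names for illustration, not part of any spec:

```
# Sketch of device-side encryption, assuming the Python "cryptography"
# package; encrypt_on_device and the row layout are illustrative only.
import json
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_on_device(plaintext: bytes, allowed_usecases: list[str], key: bytes) -> dict:
    """Encrypt user data with AEAD, binding the public annotation as
    associated data so it cannot be altered without detection."""
    annotation = json.dumps({"allowed_usecases": allowed_usecases}).encode()
    nonce = os.urandom(12)  # fresh 96-bit nonce per encryption
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, annotation)
    # the annotation travels in the clear but is authenticated
    return {"nonce": nonce, "ciphertext": ciphertext, "annotation": annotation}
```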

We consider two scenarios for such an architecture: (1) we trust the private computation nodes to adhere to the annotations, or (2) we pick a specific key for each use case and cryptographically tie the data so that it is decryptable only by the allowed downstream use cases (as shown in the pseudocode above). The former is simpler and suits MPC-style private computation, where trust is distributed across t-out-of-n MPC nodes. The latter is useful for TEE-style computation, where we can attest a specific TEE instance and then delegate the keys to it; in that case, however, we need to assume a coordinator (or set of coordinators) that stores the keys and releases them only to verified TEE instances.
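
For scenario (2), one way to tie keys to use cases is to derive a distinct key per use case from a coordinator-held master secret. Below is a hedged sketch assuming HKDF from the Python `cryptography` package; `key_for_usecase` and the master-secret arrangement are my assumptions, not part of the proposal. The coordinator would release a derived key only to a TEE instance whose attestation it has verified.

```
# Sketch: per-use-case key derivation via HKDF (Python "cryptography"
# package). The master secret would be held by the coordinator(s) and a
# derived key released only to an attested TEE instance.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def key_for_usecase(master_key: bytes, usecase: str) -> bytes:
    """Derive a 256-bit key bound to a single use case, so data encrypted
    for one use case cannot be decrypted under another use case's key."""
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,                 # a fixed, public salt would also work
        info=usecase.encode(),     # domain-separates keys per use case
    ).derive(master_key)
```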

With this approach, the data, whether input or output, stays protected and consistently purpose-constrained across the networks irrespective of which use case they serve, thereby preserving the end-to-end privacy goals.
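
To tie the sketches together, here is a small runnable toy of the network-side gate: the annotation is checked before any decryption, and AEAD verification would reject a tampered annotation. It reuses the hypothetical helpers above and is illustrative only.

```
# Toy network-side gate, reusing the hypothetical encrypt_on_device and
# key_for_usecase sketches above; use-case names are illustrative.
import json
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def decrypt_if_allowed(row: dict, usecase: str, key: bytes) -> bytes | None:
    """Decrypt a row only if its (authenticated) annotation lists the
    requesting use case; otherwise drop it."""
    annotation = json.loads(row["annotation"])
    if usecase not in annotation["allowed_usecases"]:
        return None  # annotation forbids this use case: drop the row
    # a tampered annotation would raise InvalidTag here (AAD mismatch)
    return AESGCM(key).decrypt(row["nonce"], row["ciphertext"], row["annotation"])

master_key = b"\x00" * 32  # placeholder master secret, for illustration
attribution_key = key_for_usecase(master_key, "attribution")
row = encrypt_on_device(b"event data", ["attribution"], attribution_key)

assert decrypt_if_allowed(row, "attribution", attribution_key) == b"event data"
assert decrypt_if_allowed(row, "aggregation", attribution_key) is None
```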


-- 
GitHub Notification of comment by sashidhar-jakkamsetti
Please view or discuss this issue at https://github.com/patcg/meetings/issues/65#issuecomment-1190647297 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Wednesday, 20 July 2022 19:04:09 UTC