Re: WebPerfWG call - November 9th 8am PT/ 11am ET/ 5pm CET

Thanks all for a great discussion yesterday!!
The meeting minutes are now published
<https://w3c.github.io/web-performance/meetings/2023/2023-11-09/index.html>
(alongside
both <https://www.youtube.com/watch?v=KpdhaGkfUII> recordings
<https://www.youtube.com/watch?v=4-ucgTW_lGc>)

Copying it here for convenience:
WebPerfWG call - November 9th 2023
Participants

   - Nic Jansma, Yoav Weiss, Alex Christensen, Sia Karamalegos, Charlie
   Harrison, Patrick Meenan, Sean Feng, Rafael Lebre, Dan Shappir, Tariq
   Rafique, Andy Lurhs, Barry Pollard, Mike Jackson, Hao Liu, Philip Tellis,
   Benjamin De Kosnik, Andrew Galloni, Amiya Gupta, Jacob Groß, Carine, Michal
   Mocny, Annie Sullivan

Admin

   - *Next meeting*: December 7th (same time)


   - (Not Nov 23, US thanksgiving)


   - Discussion with Fetch folks around fetchLater(), will send an invite
   for interested

MinutesDifferential privacy
<https://www.google.com/url?q=https://docs.google.com/presentation/d/1Q6XVQ_Cjy5aaaBJ4TEquQDFfHPk7nv0b68Lqivrx19Q/edit?usp%3Dsharing&sa=D&source=editors&ust=1699628224882589&usg=AOvVaw2vmbQZiy0jdnxP-kSjqb1Z>
-
Charlie Harrison

Recording
<https://www.google.com/url?q=https://youtu.be/KpdhaGkfUII&sa=D&source=editors&ust=1699628224883213&usg=AOvVaw3SfJ5WBhBkwwW6E7r0cqWG>

   - *Charlie*: On Google Chrome team on Privacy Sandbox APIs
   - ... Use a lot of differential privacy
   - ... Familiar with the concept
   - ... Extensive history, I can share references but we don't have a ton
   of time
   - ... Operates on a dataset, a group of people submitting their data
   - ... Raw data goes through a black box, outputs more data
   - ... Guarantee that DP gives you is after the magic, you won't be able
   to distinguish two outputs if they came from inputs that different from one
   individual
   -
   - ... Whether you participated in the system or not is not going to
   change the end result in a formal mathematic way
   - ... Assuming the parameters you choose are strong enough
   - ... Looked over this proposal for bimodal navigation
   - ... Local DP is something you're considering?
   - ... Here we assume dataset is one user, example, row
   -
   - ... Want guarantee that whatever the input, the data looks the same
   - ... If you can post-process this data in aggregate, you can get decent
   aggregate results, even if you look at one row from magic box and there's
   almost signal
   - ... High-level intuition
   - ... We want the input to not change the output
   - ... Formal definition
   -
   - ... We have some randomized algorithm M that satisfies epsilon
   differential privacy if for all pairs neighboring datasets and for all
   outputs x, we show the probability of producing output is e^epison
   - ... Mechanism induces a probability distribution
   - ... We look at every point in this distribution and we can show
   they're close
   - ... epsilon is measure of privacy loss
   - ... We'll go over a semantic interpretation of what this means in a
   few slides
   - ... When we think of a new feature on NavTiming, we can say that we
   want to make sure navigation is protected.  Or whether we want to protect
   all of a user's navigations over a timewindow.
   - ... Both parameters to definition are important to think through when
   designing a DP mechanism
   - ... We want to ensure we have this relation holdforth
   - ... A lot of math, but it's not too bad
   - ... How do we interpret epsilon?
   -
   - ... epsilon is notorious for being hard to interpret
   - ... Look at this in a bayesian sense
   - ... Adversary has a prior on the user's data, and if they have the
   output mechanism, they look at the output to compare with how much it
   changes
   - ... We can look at various values of epsilon to see how they update a
   prior
   - ... For graphic, X axis is suspicion, prior
   - ... Y is how that data looks after updating the data
   - ... Most private mechanisms won't update too far
   - ... When epsilon is low (purple) you're not straying too far
   - ... Higher epsilons you're straying too far
   - ... (examples on slide)
   - ... Might be able to stare at chart and consider adversary you're
   trying to protect against
   - ... Consider what priors they have for sensitive data
   - ... And what comfort level data is sensitive to
   - ... Less epsilon is more private
   - ... As it goes towards zero, probably of any output is equal no matter
   the input (signal being completely removed from system)
   - ... Most private is not revealing anything about the input
   -
   - ... DP is composable
   - ... If you have two separate inputs and they've invoked over same
   dataset, epsilons add up
   - ... If you're trying to protect nav performance and you have a noisy
   bit, and want to introduce another bit, you can sum epsilons that the
   privacy loss is of the combined mechanisms
   - ... Post processing: Staring at output of DP mechanism and and trying
   random inputs is not going to cause privacy loss
   - ... Group privacy: If you want to protect more than one user, a group,
   you can use DP as well
   - ... Adding or removing a whole group of K people, you can protect and
   make sure their input is not overly affecting the output
   -
   - ... Useful for bimodal navigation proposal
   - ... Compute flip probability based on epsilon parameter, you pick a
   random bit
   - ... In above example, giving truthful answer 25% of the time and false
   75% of the time
   -
   - ... Laplace invokes laplace distribution
   - ... Not necessarily local DP mechanism
   - ... Takes in as sequence of booleans, sum fhem up, centered around
   true value, scaled to epsilon
   - ... Gives probably dist centered around true value
   - ... Most of the time you'll get around 30
   -
   - ... How does this relate to systemEntropy proposal?
   - ... Considerations when thinking of adding DP to something
   - ... (1) Consider what is the neighboring relation, scope of what
   you're trying to protect
   - ... Protect entropy of a particular navigation?  Or lots of users'
   navigation
   - ... Describe different kinds of protections
   - ... (2) DP will add noise to the data
   - ... Have to make sure noise is OK, don't want to ship something so
   noisy it's not going to be used
   - ... Want to do research and vetting
   - ... (3) Think about what mechanisms are available
   - ... Two generate techniques, local DP vs central DP
   -
   - ... Think about the systemEntry as inducing a bimodal distribution
   - ... Generated some random gaussian sample distribution
   - ... If you naively apply response to histograms, you can get something
   scrambled and biased
   - ... If we randomly flip every input, and put it in low/high histogram,
   it's going to be mixed together
   - ... Mass from low-entry bucket moved to high-entropy bucket
   - ... Now looks like high-entropy is more performant than the real
   answer would be
   - ... Ways to de-bias data
   - ... What this would look like after using unbiased bucket estimator
   - ... Third graph similar to the first one
   - ... Overall not terrible

Bimodal navigation and fuzzy dimensions
<https://www.google.com/url?q=https://docs.google.com/presentation/u/0/d/1RcLOWD9lIMJIf7cvAUcsgxwSWiZoowxeNkctj1kzHKA/edit&sa=D&source=editors&ust=1699628224893390&usg=AOvVaw0WhrB46rNh6I3qTOD5RY0x>
-
Mike Jackson

Recording
<https://www.google.com/url?q=https://youtu.be/4-ucgTW_lGc&sa=D&source=editors&ust=1699628224894012&usg=AOvVaw21wAK9BdBpFjUVB7OnyxM1>

   - *Mike*: Brief overview for folks
   - ... Goals are to allow developers to eliminate outlier metrics when
   user-agent is in a non-optimal performance state
   - ... Data indicates that delays can happen any time in navigation
   pipeline
   - ... Avoid being a real-time method to determine if a user-agent is in
   a non-optimal performance state
   - ... Current code checked into Chromium
   - ... Origin Trial in Edge
   - ... High level feedback:
   - ... No one likes the name
   - ... Concern is as more data gets added, value degrades over time
   - ... Defining UA startup is difficult
   - ... PerformanceTimeline shouldn't be used to detect info yuo can't get
   in other ways
   - ... Need to expose in privacy preserving way, avoid side-channel
   attacks or user-fingerprinting
   - ... Proposed changes:
   - ... Rename to navigationStartedDuringUserAgentLaunch (boolean)
   - ... Should we expose this off of navigator or performance object?
   - ... The preserve privacy use DP?
   - ... We have good data with systemEntropy
   - ... Proposal is to adopt DP changes first, run an OT
   - ... Look at changing name after it's finalized
   - ... Proposal to adapt randomized response algorithm
   - ... Verified DP algorithm's output, truth 75% of the time
   - ... Proofs play out in the data I'm seeing

Discussion

   - *Charlie*: Feedback on plan, given we can simulate effect of
   randomized response, I would encourage running a more sophisticated
   analysis of real data before running another OT
   - ... If we have good data for it
   - ... I feel like we could get a sense of the effect of this mechanism,
   using existing data
   - ... The test was good, you could take what I had on my last slide and
   see how we could partition histograms, etc
   - ... That might be a good place to start, vs. high overhead of OT
   - *Yoav*: Suggesting to take a pile of real-life NT data with
   systemEntropy bit and run simulate a randomization of that bit, and see if
   the data still makes sense roughly
   - *Charlier*: Not an expert, if the data exists, did we run an OT
   without noise?
   - *Mike*: Yes we have partners that have done it
   - *Charlie*: Easy assuming we can get access to the data
   - ... Run a simulation of the algorithm, test different values of
   epsilon with overhead of an OT launch
   - *Michal*: Last slide answered preemptively how you could denoise the
   data after the fact
   - ... Curious, do you need to know the specific algorithm and epsilon
   chosen to reverse-engineer?
   - *Charlie*: Yes, you do
   - ... One-liner, operating on a per-bucket basis
   - ... Every bucket, look at count per-bucket, total across both high-
   and low-systemEntropy, look at P, it's just this formula
   - ... Deployment problem I would say
   - ... If we add a noisy systemEntropy bit and we expect epsilon not
   hard-coded in spec, I suggest we expose epsilon value or flip probability
   - *Michal*: That was my second question
   - ... For other APIs, are they sometimes spec'd, exposed?
   - *Charlie*: For Attribution Reporting API, we allow you for binary
   randomized response, flipping a bit, we specify a K, in that case you can
   choose that flip rate
   - ... We expose what P is in our API
   - ... I don't know of too many other APIs that have introduced DP, if
   all vendors are aligned on an epsilon, it could be hardcoded
   - ... There could be another vendor that wants to do something
   differently
   - ... Err towards including it
   - ... Align for low values of epsilon if that’s tolerable, any more
   privacy conscious browser would be fine
   - ... Utility/privacy tradeoff
   - ... How to do this algorithm for multiple values of P and combine
   that, I might have to talk to a data scientist
   - *Michal*: wondered about future changes. Exposing the epsilon makes
   sense to me.
   - *Nic*: Thanks for presenting this! Generic web developer perspective,
   I wonder how much they’d be able to use this without some significant
   training. Seems like a lot.
   - … Seems like it’d be challenging to help people understand this
   - … A second thought from a RUM provider’s point of view - customers
   expect data to be represented accurately. Because this data is
   intentionally not accurate, I wonder how it’d be able to be used
   - … Or if we’d need to present this data in some different way
   - … maybe correct in aggregate, but we provide waterfalls of individual
   views, so we’d need to caveat that
   - … Will lean on data scientists to figure out how we can use this data
   - *Yoav*: From a web developer's perspective, I think the randomized
   response algorithm doesn't need all the math
   - ... Response is randomized, this is the randomization percent, can use
   that as a dimension to split your data on, some cases where you're throwing
   away good data and some where you're incorporating bad data
   - ... And it's MAGIC
   - ... I don't think developers would need to dive into correctness proofs
   - ... On RUM provide side, I don't think this is a useful for us to deal
   with measurements
   - ... But useful way to deal with dimensions
   - ... For measurements for some sort of aggregation
   - ... Accumulate a lot of measurements, provide a private summary of that
   - ... For dimensions, apply to data we're already collecting, this will
   enable us to drop the navigations we don't currently care about
   - ... Deep-dive on a waterfall, that deep-dive will have to be caveated
   - ... Piece would need some communication
   - ... If we compare to current state where we don't have access to this
   data at all, this would be an improvement
   - *Andy*: I feel like LCP isn't as complicated but has caveats and
   characteristics that we figured out a couple times, it's not a perfect 1:1
   what happened, it's a best-effort
   - *Amiya*: Echoing what Yoav mentioned, booleans or a discrete set of
   options, or distributions.  For perf metrics, depending on the metrics
   distributions may be different, not everything is normally distributed
   - ... I wonder how mathematically, with a lot of data points it's fine
   - ... But with splitting other dimensions does it introduce data
   - *Yoav*: Maybe exposing epsilon will give better guidance "this data
   makes sense only if K samples" from various users, don't look at it
   otherwise
   - *Charlie*: I think with data from API you could compute error bars
   - ... With histogram you can look at variance per bucket
   - ... Could present dashboard with some things omitted with error bars
   too high
   - ... One approach
   - ... I heard two things from this convo.  From Yoav, not doing fancy
   math?  Let's verify that
   - ... What's the best way to consume this data and have an agreement,
   very important thing
   - *Michal*: On the topic of dimensions and agg reporting, we don't know
   what a given developer's dimensions are.  They might measure A/B with
   perfect precision, could give a perfect distribution
   - ... I love this mechanism because we're saying if there's a difference
   on this dimension between two sets of users, we're going to shrink your
   ability to measure that, but you can still adjust it
   - ... Aggregate reporting API is powerful, but we don't know what
   dimensions are.  Do we need a first-class concept for dimensions, and
   on-purpose we're controlling this dimension
   - ... It's going to be a can of worms, this is middle-ground
   - *Yoav*: User-defined dimensions is important for aggregated reporting
   - *Dan*: Related to concept of RUM provider showing data for a specific
   session
   - ... Obviously that's the thing you lose, I hate opt-ins, but we might
   think of some sort of consent for opting out of adding of the noise
   - ... Might be a possible solution
   - *Yoav*: Can work if debugging a specific session, ask them to give
   more information
   - ... This is one where a specific permission could be better
   - ... I suspect for RUM provider showing a Waterfall, that's not
   necessarily where a user complained
   - *Michal*: To Charlie's point, if we support variable epsilon, it could
   be the UA's decision, you could factor that in
   - ... Envision an explicit call to ask for an undifferentiated value the
   user has to accept
   - ... To the point of entropy, the privacy principles were really well
   laid-out.
   - ... It's either necessary data or auxiliary data, gives us a middle
   ground to treat aux as something else
   - ... Are signals useful on their own
   - *Yoav*: Agree, and in this case the Compute Pressure effort is similar
   and different
   - ... I don't know we should tie this to that spec
   - ... Not sure if this data will be exposed down the road
   - ... Variable epsilon would be useful in that case
   - ... I don't think this changes whether this data is ancillary or not
   - ... This is one of those mitigations


On Thu, Nov 9, 2023 at 5:09 AM Yoav Weiss <yoavweiss@google.com> wrote:

> Thanks for the suggestion Michal! +Charlie Harrison
> <csharrison@google.com> kindly agreed to give such an overview!
>
> On Wed, Nov 8, 2023 at 8:41 PM Michal Mocny <mmocny@google.com> wrote:
>
>> Is this the first such proposal in this group?
>>
>> Could we perhaps start by carving out some time specifically to discuss
>> just differential privacy
>> <https://en.wikipedia.org/wiki/Differential_privacy> on its own, and
>> then switch to this specific use case?  Perhaps that was already your plan!
>>
>> On Wed, Nov 8, 2023 at 12:25 PM Yoav Weiss <yoavweiss@google.com> wrote:
>>
>>> People tell me that "bimodal navigation data and exposing that as a
>>> differentially-private fuzzy dimension" is not super clear..
>>>
>>> The subject of the discussion would be Microsoft's proposal to add an
>>> extra dimension to Navigation Timing, to indicate that a navigation was
>>> started when the system was under load. That dimension is proposed to be
>>> privacy-preserving through differential privacy
>>> <https://en.wikipedia.org/wiki/Differential_privacy>. See y'all
>>> tomorrow!!
>>>
>>>
>>> On Wed, Nov 8, 2023 at 3:22 PM Yoav Weiss <yoavweiss@google.com> wrote:
>>>
>>>> Hey folks,
>>>>
>>>> Tomorrow on the agenda
>>>> <https://docs.google.com/document/d/10dz_7QM5XCNsGeI63R864lF9gFqlqQD37B4q8Q46LMM/edit?pli=1#heading=h.fa3nzaq1bwpc>
>>>> we have a discussion on bimodal navigation data and exposing that as a
>>>> differentially-private fuzzy dimension.
>>>>
>>>> See <https://meet.google.com/agz-fbji-spp?authuser=0> y'all there!
>>>>
>>>> Cheers :)
>>>> Yoav
>>>>
>>>

Received on Friday, 10 November 2023 15:20:53 UTC