RE: Deidentification (ISSUE-188)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I agree. Using a verb assumes that you already have data about people and that you apply a de-identifying process to it. It is the process that is hard to define without leaving loopholes.

What is in scope is tracking data, and DNT should just mean: do not collect it (unless you claim a permitted use). If you have collected it in error, just delete it.

Maybe that is all we need to say.

Mike


> -----Original Message-----
> From: Rob van Eijk [mailto:rob@blaeu.com]
> Sent: 14 August 2014 19:55
> To: David Singer
> Cc: Justin Brookman; public-tracking@w3.org; Mike O'Neill
> Subject: Re: Deidentification (ISSUE-188)
> 
> The core of my issue, which may be a semantic issue, is that the current
> text is fixated on the word identification. To me it is not clear enough
> from the current definition that anything other than the 'one way street'
> is considered re-identification. The definition must be more specific on
> this point.
> 
> Does cookie-syncing (which is commonly used in real-time bidding) fall
> under the meaning of re-identification?
> 
> Rob
> 
> David Singer schreef op 2014-08-14 18:37:
> > Rob, I am sorry, I don’t follow you at all.
> >
> > We say in a number of places that data passes out of our scope, and
> > hence we say nothing at all about it, once it has been deidentified.
> > We need to define what we mean by that, and we need to define that
> > ‘exit’ from our scope.
> >
> > On Aug 14, 2014, at 2:08 , Rob van Eijk <rob@blaeu.com> wrote:
> >
> >>
> >> The text you propose connects the state of a permanently de-identified
> >> dataset to the possibility of identifying a user/user-agent or device.
> >> I think limiting the approach to identification is far too narrow.
> >> What is not covered is for example:
> >> - the sharing (for e.g. data enrichment and data correlation).
> >
> > if it doesn’t identify anyone, and won’t/can’t, we have nothing to say
> > about sharing it
> >
> >> - the application of de-identified data to the individual user/user
> >> agent/device (for e.g. re-targeting).
> >
> > That’s re-identification, and my text says (a) it ought not be
> > possible and (b) it ought not be permitted
> >
> >> - the retention of data, meaning the length of time that would be
> >> allowed for bringing data into a de-identified state.
> >
> > That’s a separate question: the ‘raw data’ question (and one of the
> > exits for raw data is that the data is deidentified)
> >
> >> - any (unintended/unforeseen) data uses that may have an impact on
> >> (the personal space of) a user/user agent/device. For example
> >> re-targeting based on de-identified data, or re-targeting based on
> >> correlation with de-identified data.
> >
> > I don’t understand how one can target anyone if the data is
> > deidentified, and if it’s reidentified, then it wasn’t deidentified to
> > this definition (the definition insists it is a one-way street).
> >
> >>
> >> My proposal is to exclude text for de-identified data in order to aim
> >> for a cleaner specification.
> >
> > Again, I don’t understand.  The point of defining it is to say “how to
> > get out of the scope of this spec.”  For example, the raw data clause
> > I proposed says there are only 3 exits:
> > * you have permission from the user to retain the data
> > * you retain the data under a permitted use, in accordance with the
> > terms of that permitted use
> > * you deidentify the data so it passes out of our scope
> >
> >
> >>
> >> Rob
> >>
> >> David Singer schreef op 2014-08-14 01:58:
> >>> On Aug 8, 2014, at 6:54 , Mike O'Neill <michael.oneill@baycloud.com>
> >>> wrote:
> >> (...)
> >>> Trying another way of phrasing it:
> >>> Data is permanently de-identified (and hence out of the scope of this
> >>> specification) when a sufficient combination of technical measures and
> >>> restrictions ensures that the data does not, and cannot and will not
> >>> be used to, identify a particular user, user-agent, or device.
> >>> Note: Usage and/or distribution restrictions are strongly recommended
> >>> for any dataset that has records that relate to a single user or a
> >>> small number of users; experience has shown that such records can, in
> >>> fact, sometimes be used to identify the user(s) despite the technical
> >>> measures that were taken to prevent that happening.
> >>> David Singer
> >>> Manager, Software Standards, Apple Inc.
> >
> > David Singer
> > Manager, Software Standards, Apple Inc.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using gpg4o v3.3.26.5094 - http://www.gpg4o.com/
Charset: utf-8

iQEcBAEBAgAGBQJT7QwIAAoJEHMxUy4uXm2J7vkIAOUDdIGXlCpvJw9U/KYAbjCN
I/T2dcIsN3Bd095aNyj+eTiC32sQ96Tc5+q//f9zLx+/CERbIy5/lOhfEQpC6z4z
gQuJC/Ol691owAGEQFAQEN7sZ4u5nhFFuJzhPnZILBi9tzBj4wLByxskGgf3yMyT
rlYi50rZpTghA4QOKvszDxAgP/hyRnk2cjWcCCjaiMWVKQh3j7aKUtit4JgU/JKb
ME50WRt43StzEtcaFfsPGHzwVjG/3z5wqEMWSTnwuyq68OfN8U3g0hmaDhJUzwoU
P5+tPJOImfOSr0H5eCIXQkKLP6sz8HSrt+HPcNrAO/uKCmIGKlD4AAqSe5Ji0gI=
=oQfL
-----END PGP SIGNATURE-----

Received on Thursday, 14 August 2014 19:24:24 UTC