- From: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
- Date: Fri, 13 May 2016 09:42:38 -0300
- To: Annette Greiner <amgreiner@lbl.gov>
- Cc: Phil Archer <phila@w3.org>, Public DWBP WG <public-dwbp-wg@w3.org>
- Message-ID: <CANx1Pzwez0J4Y7Kte59n3CHVtJC15yKNT=YLqxgBzKqBFbD2dg@mail.gmail.com>
Hi Phil and Annette,

Thanks a lot for your contributions!

cheers,
Berna

2016-05-07 17:05 GMT-03:00 Annette Greiner <amgreiner@lbl.gov>:

> Hi Phil,
> Thanks for letting me weigh in. I understand the connection you’re making here, and I think it’s a good thing to mention in the enrichment section. What I think is crucial but is not yet reflected in here is the issue of privacy breach arising from putting together disparate data that presents less risk separately. The second paragraph here is a good but more general discussion of security and privacy issues that strikes me as not belonging in this particular section. I would suggest instead addressing the more general issues in the introduction to our document. Most of the third paragraph would also be better in the document introduction, but the last sentence is relevant here. As I see it, the real issue with data enrichment is combining datasets that each hold so little information about any individual that they cannot be identified but that together offer enough information that they can be. I would suggest that here we just say,
>
> > Data enrichment refers to a set of processes that can be used to enhance, refine or otherwise improve raw or previously processed data. This idea and other similar concepts contribute to making data a valuable asset for almost any modern business or enterprise. It is a diverse topic in itself, details of which are beyond the scope of the current document. However, it is worth noting that some of these techniques should be approached with caution, as ethical concerns may arise. In scientific research, care must be taken to avoid enrichment that distorts results or statistical outcomes. For data about individuals, privacy issues may arise when combining datasets. That is, enriching one dataset with another, when neither contains sufficient information about any individual to identify them, may yield a combined dataset that compromises privacy. Furthermore, these techniques can be carried out at scale, which in turn highlights the need for caution.
>
> Then, in the document introduction, I would suggest adding the following, after the paragraph that begins “In this context…”.
>
> > Not all data should be shared openly, however. Security, commercial sensitivity and, above all, individuals' privacy need to be taken into account. It is for data publishers, not a technical standards working group, to determine policy on which data should be shared and under what circumstances. Data sharing policies are likely to assess the exposure risk and determine the appropriate security measures to be taken to protect sensitive data, such as secure authentication and authorization.
> >
> > Depending on circumstances, sensitive information about individuals might include full name, home address, email address, national identification number, IP address, vehicle registration plate number, driver's license number, face, fingerprints, or handwriting, credit card numbers, digital identity, date of birth, birthplace, genetic information, telephone number, login name, screen name, nickname, health records etc. Although it is likely to be safe to share some of that information openly, and even more within a controlled environment, publishers should bear in mind that combining data from multiple sources may allow inadvertent identification of individuals.
>
> (I took out mention of https, as it will soon be everywhere, which would make our doc out of date.)
>
> Also, I noticed a grammatical error in the implementation section of BP 31. (Subject-verb agreement is off.) It should read "Techniques for data enrichment are complex and go well beyond the scope of this document, which can only highlight the possibilities."
> -Annette
>
>
> On May 6, 2016, at 7:50 AM, Phil Archer <phila@w3.org> wrote:
> >
> > Berna,
> >
> > As promised, I've copied the text from the sensitive data section and merged some of it with the data enrichment intro to end up with this as a suggestion.
> >
> > @Annette - we resolved to do this and move the BP about data unavailability to the data access section. Do you agree with this?
> >
> > ===Begins==
> >
> > Data enrichment refers to a set of processes that can be used to enhance, refine or otherwise improve raw or previously processed data. This idea and other similar concepts contribute to making data a valuable asset for almost any modern business or enterprise. It is a diverse topic in itself, details of which are beyond the scope of the current document. However, it is worth noting that techniques exist to carry out such enrichment at scale which in turn highlights the need for caution.
> >
> > Not all data should be shared openly. Security, commercial sensitivity and, above all, individuals' privacy need to be taken into account. It is for data publishers, not a technical standards working group, to determine policy on which data should be shared and under what circumstances. Data sharing policies are likely to assess the exposure risk and determine the appropriate security measures to be taken to protect sensitive data, such as secure authentication and use of HTTPS.
> >
> > Depending on circumstance, sensitive information about individuals might include: full name, home address, email address, national identification number, IP address, vehicle registration plate number, driver's license number, face, fingerprints, or handwriting, credit card numbers, digital identity, date of birth, birthplace, genetic information, telephone number, login name, screen name, nickname, health records etc. Although it is likely to be safe to share some of that information openly, and even more within a controlled environment, publishers should bear in mind that data enrichment techniques may allow some elements to be discovered and linked from elsewhere.
> >
> > Notwithstanding that caution, data enrichment offers exciting possibilities for both data publishers and consumers.
> >
> > == ends==
> >
> > --
> >
> > Phil Archer
> > W3C Data Activity Lead
> > http://www.w3.org/2013/data/
> >
> > http://philarcher.org
> > +44 (0)7887 767755
> > @philarcher1

--
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------
Received on Friday, 13 May 2016 12:43:30 UTC