- From: Harshvardhan J. Pandit <me@harshp.com>
- Date: Fri, 17 Jun 2022 09:59:21 +0100
- To: David Lewis <delewis@tcd.ie>, public-dpvcg@w3.org
Hi Dave. Thanks for the comments. My replies are inline.

On 17/06/2022 08:59, David Lewis wrote:

> A quick comment on the technology subject definition:
>
> "TechnologySubject and hasSubject is the subject of the technology i.e. whom the technology is used on or subjected to. This may be directly (e.g. person within a CCTV camera's vision) or indirectly (e.g. person whose details were used as training data)"
>
> Breaking this down, I think we need a more precise definition of being 'subjected to' technology.
>
> If you parse this as the option "TechnologySubject and hasSubject is the subject of the technology i.e. whom the technology [is used on or] subjected to.", that's essentially a circular definition, so doesn't tell you much.

I don't think the definition is necessarily circular, but I agree it can always be further refined/clarified. What is considered "subject" or "subjected to" can vary based on legal or other interpretations, so there is no good existing complete definition AFAIK. The concept of "subject" here mirrors that of "data subject" - i.e. who is the individual the data is about; similarly, who is the individual on whom (or on whose data) the technology is applied. I think there will be a few iterations to smooth out the definitions.

> If you parse it as: "TechnologySubject and hasSubject is the subject of the technology i.e. whom the technology is used on [or subjected to.]", that's a bit better but might leave the reader still wondering what classifies as 'used on'.

In the introduction, there are some hints: "This may be directly (e.g. person within a CCTV camera's vision) or indirectly (e.g. person whose details were used as training data)."

> Further, neither of these to my mind necessarily implies the case where your data is used for training. In this case it is more that the technology is built with your data (which is perhaps sufficiently captured by the GDPR definition of data subject) but doesn't necessarily imply that the technology is used 'on' you, or even that you are ever 'subjected to' the technology.

Not all training data automatically denotes someone to become a (data or technology) subject IMO. If all that was collected was my height with no further individual identifiers, I am not automatically a data subject in some use of that data because there needs to be some identifiability. So the notion of subject can be quite complex - it can be technology used on you, with you, about you, etc.

> So I might suggest
>
> i) define 'subjected to' instead as 'affected or potentially affected by the technology' (note this wording is in part inspired by the general definition of 'stakeholder' in ISO)

This is NOT the "subject" but rather an "affected individual" or "stakeholder (ISO)" - different concepts. So it would not be correct to use that definition because it goes beyond the use of technology and into the realm of figuring out impacts or effects - which is not the intention when modelling actors in technologies. Semantically, the notion of "affected or impacted" is separate from that of "subject" IMHO, because someone can be affected without ever being a subject - such as due to secondary effects or unrealised impacts, a strong example being the use of shared genetics. Also, not all those who are affected need to be subjects (depending on the definition of subject). Just as under GDPR, the affected individuals can be other than data subjects.
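To make that distinction concrete, here is a rough Turtle sketch - the prefixes/IRIs are placeholders, the dpv-tech: names are just the ones discussed in this thread (not finalised terms), and the main-DPV terms are from memory rather than the published spec:

  @prefix dpv:      <https://w3id.org/dpv#> .
  @prefix dpv-tech: <https://w3id.org/dpv/dpv-tech#> .  # placeholder prefix for the tech extension
  @prefix ex:       <https://example.com/ns#> .

  # Direct technology subject: the person within the CCTV camera's vision
  ex:CCTVSystem a dpv:Technology ;
      dpv-tech:hasSubject ex:PersonInCameraView .

  # Someone merely affected (e.g. through secondary effects) is not a subject;
  # they would instead be described using the main DPV risk/impact concepts
  ex:CCTVDeployment a dpv:PersonalDataHandling ;
      dpv:hasImpact ex:ChillingEffectOnNeighbours .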
The final nail would be defining "affected" - and I think this could mean the organisation developing or providing the technology, and the people that work there, could also potentially end up being "subjects" - which would be wrong too. So you see how this takes the notion of "subject" closer to "stakeholder" - and I argue that they are separate. All the impacts, risks, etc. notions are in main DPV and should be used from there. This is just for describing the technologies used. I wouldn't like to duplicate the impacts/risks etc. again in an extension. So for indicating who is affected, I think the data subject and risk/impact sections in DPV should be used.

> ii) consider treating actors whose data is used for training separately somehow, e.g. by just relying on the existing 'data subject' definition.

Yes, the introduction text does state this possibility as: "This may be directly (e.g. person within a CCTV camera's vision) or indirectly (e.g. person whose details were used as training data). What is considered a subject may be contextually dependant on the nature and scope of the technology as well as its application. In the future, we may separate this concept for further distinction between direct and indirect subjects (or use alternate terms) - if such categorisation is deemed beneficial in the description of individuals subjected to technologies."

> I acknowledge that the implication of these suggestions is that if a person's data is fully anonymised and used for ML training and the resulting technology does not affect or potentially affect that person, then they would fall out of the definition of 'technology subject'. But I think that's probably OK.

Yes, this is what is intended. Note that the concept "technology subject" is complementary to "data subject" - so that person would be the data subject in the training phase. At some point in the future, providing these concepts (i.e. training data, training phase) would be something to consider. But for now my intention is to prioritise DPV v1 in terms of data protection / privacy.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/
Received on Friday, 17 June 2022 08:59:40 UTC