Re: Adding categories of data subjects

Hi All.
As discussed in the last meeting, below are my thoughts on how we can go 
about modelling the various concepts.

---
# Data Subjects being Active/Passive

- dpv:DataSubjectActive: (subclass of dpv:DataSubject) Categorisation of 
data subjects that have an active involvement or participation in the 
processing of data. For example, filling in a form or using a webcam to 
conduct virtual meetings.
- dpv:DataSubjectPassive: (subclass of dpv:DataSubject) Categorisation 
of data subjects that do not have an active involvement or participation 
in the processing of data. For example, monitoring of customers within a 
shop.

Example:

ex:001 dpv:hasDataSubject ex:BankLobbyVisitors .
ex:BankLobbyVisitors a dpv:DataSubjectPassive ;
   dct:description "Visitors within the lobby"@en .

ex:002 a dpv:PersonalDataHandling ;
   dct:description "Video Calls for Meetings"@en ;
   dpv:hasPurpose dpv:ServiceProvision ;
   dpv:hasDataSubject ex:ServiceSubject, ex:BackgroundSubject .
ex:ServiceSubject a dpv:DataSubjectActive ;
     dct:description "User performing the video call"@en .
ex:BackgroundSubject a dpv:DataSubjectPassive ;
     dct:description "Subjects in the background"@en .
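
A minimal sketch of how the two proposed categories could be declared. It assumes the common DPV pattern of typing a concept as both an rdfs:Class and a skos:Concept; the labels and the exact modelling are assumptions for illustration, not agreed definitions.

```turtle
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Proposed concept (not yet in DPV): data subjects actively involved
dpv:DataSubjectActive a rdfs:Class, skos:Concept ;
    rdfs:subClassOf dpv:DataSubject ;
    skos:broader dpv:DataSubject ;
    skos:prefLabel "Active Data Subject"@en .

# Proposed concept (not yet in DPV): data subjects without active involvement
dpv:DataSubjectPassive a rdfs:Class, skos:Concept ;
    rdfs:subClassOf dpv:DataSubject ;
    skos:broader dpv:DataSubject ;
    skos:prefLabel "Passive Data Subject"@en .
```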

---

# Entities with Active/Passive involvement

Active/Passive involvement for data controllers and recipients is tricky 
to define. Active participation occurs when an entity implements the 
processing or decides how the processing should be implemented. We 
can already state who is carrying out processing using 
isImplementedByEntity, and there is a proposal to indicate who is 
deciding what/why/how processing is carried out using 
isDeterminedByEntity (see later in this email). However, determination 
does not necessarily correlate with being 'active', e.g. a processor may 
determine how storage is implemented without having an 'active' role in 
deciding what/how data is stored, as in cloud storage.

It is an open question whether there is a need or any value in having 
the ability to express which entities are 'actively' or 'passively' 
involved for any reason within the processing - where specific roles 
(e.g. controller) may not be known or may not be accurate. The concepts 
and example below show where this could be useful.

hasActiveInvolvement: (subproperty of hasInvolvement) indicates that the 
entity has an active involvement and participation in the indicated 
activity.
hasPassiveInvolvement: (subproperty of hasInvolvement) indicates that the 
entity has a passive rather than an active involvement and 
participation in the indicated activity.
hasNoInvolvement: (subproperty of hasInvolvement) indicates that the 
entity has no involvement or participation in the indicated activity.

ex:001 a dpv:PersonalDataHandling ;
   # initial findings - Acme sends data to Beta
   dpv:hasProcessing dpv:Transfer ;
   dpv:hasActiveInvolvement ex:Acme ;
   dpv:hasPassiveInvolvement ex:Beta ;
   dpv:hasNoInvolvement ex:Gamma ;
   dpv:isImplementedByEntity ex:Acme, ex:Beta ;
   # discovery: Acme has an agreement with Gamma deciding this
   dpv:isDeterminedByEntity ex:Acme, ex:Gamma ;
   # conclusions - Acme and Gamma are joint-controllers, Beta is processor
   dpv:hasDataController ex:Acme, ex:Gamma ;
   dpv:hasDataProcessor ex:Beta .
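
The three proposed properties could be declared along these lines. This is only a sketch: hasInvolvement is the parent named in the definitions above, and none of these properties exist in DPV yet, so the declarations are assumptions.

```turtle
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Assumed parent property for all involvement statements
dpv:hasInvolvement a rdf:Property .

# Proposed sub-properties (hypothetical until accepted)
dpv:hasActiveInvolvement a rdf:Property ;
    rdfs:subPropertyOf dpv:hasInvolvement .

dpv:hasPassiveInvolvement a rdf:Property ;
    rdfs:subPropertyOf dpv:hasInvolvement .

dpv:hasNoInvolvement a rdf:Property ;
    rdfs:subPropertyOf dpv:hasInvolvement .
```

One point to check with this hierarchy: under RDFS entailment, any hasNoInvolvement statement also implies a hasInvolvement statement, so the parent would have to be read as "has an involvement status" rather than "is involved".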

---

# Informed / Uninformed Status

- dpv:EntityInformedStatus: The status of the indicated entity being 
informed of the context.
- dpv:EntityInformed: The state of the entity being informed of the context.
- dpv:EntityUninformed: The state of the entity not being informed of the 
context.

- dpv:DataSubjectInformed: The state of the data subject being informed 
of the context.
- dpv:DataSubjectUninformed: The state of the data subject not being 
informed of the context.
- dpv:ControllerInformed: The state of the controller being informed of 
the context.
- dpv:ControllerUninformed: The state of the controller not being 
informed of the context.
- dpv:RecipientInformed: The state of the recipient being informed of 
the context.
- dpv:RecipientUninformed: The state of the recipient not being informed 
of the context.
- dpv:AuthorityInformed: The state of the authority being informed of 
the context.
- dpv:AuthorityUninformed: The state of the authority not being informed 
of the context.

Example - three perspectives on the same activity:

# data subjects have been informed
ex:001 a dpv:PersonalDataHandling ;
   dpv:hasStatus dpv:DataSubjectInformed .

# data subjects have been informed using this notice
ex:002 a dpv:PersonalDataHandling ;
   dpv:hasStatus [
     a dpv:DataSubjectInformed ;
     dpv:hasNotice ex:Notice ;
   ] .

# this notice is meant for data subjects (doesn't say informed)
ex:003 a dpv:PersonalDataHandling ;
   dpv:hasNotice [
     a dpv:Notice ;
     dpv:hasRecipient dpv:DataSubject ;
   ] .
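
A possible shape for the status hierarchy behind these concepts, shown for the data subject variants only. This is a sketch under assumptions: it uses rdfs:subClassOf throughout and places the new concepts under dpv:Status, which may differ from how the group finally models them.

```turtle
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Generic informed/uninformed status (proposed, not yet in DPV)
dpv:EntityInformedStatus a rdfs:Class ;
    rdfs:subClassOf dpv:Status .

dpv:EntityInformed a rdfs:Class ;
    rdfs:subClassOf dpv:EntityInformedStatus .

dpv:EntityUninformed a rdfs:Class ;
    rdfs:subClassOf dpv:EntityInformedStatus .

# Entity-specific variants specialise the generic states
dpv:DataSubjectInformed a rdfs:Class ;
    rdfs:subClassOf dpv:EntityInformed .

dpv:DataSubjectUninformed a rdfs:Class ;
    rdfs:subClassOf dpv:EntityUninformed .
```

The controller, recipient, and authority variants would follow the same pattern.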

---

# Intended/Unintended Status

Intended and unintended are likely to be useful when expressing facts on 
an ex-post basis, e.g. findings about what data has been processed, whose 
data it was, and who the recipients were. They are also useful during 
assessments (e.g. DPIA) to highlight what is meant to happen and what 
is not.

- dpv:IntentionStatus: the status of whether the specified context 
is/was intended or unintended
- dpv:StatusIntended: the state where the specified context is/was 
intended to occur
- dpv:StatusUnintended: the state where the specified context is/was not 
intended to occur

Examples:

# We intended to collect customer data
ex:001 a dpv:PersonalDataHandling ;
   dpv:hasProcessing dpv:Collect ;
   dpv:hasDataSubject dpv:Customer ;
   dpv:hasStatus dpv:StatusIntended .
# We did not intend to collect children's data
ex:002 a dpv:PersonalDataHandling ;
   dpv:hasProcessing dpv:Collect ;
   dpv:hasDataSubject dpv:Child ;
   dpv:hasStatus dpv:StatusUnintended .

---

# Expected/Unexpected Status

A counterpart to Intent, the state of expectation is useful to describe 
when an entity does not have control over the processing to carry out 
its intent and must instead check whether the facts match their 
'expectations'. This could be a data controller's expectation from a 
processor (regarding processing) or from a recipient third party 
(regarding fulfilment of rights); or by a data subject regarding the 
processing of their data.

- dpv:ExpectationStatus: the status of whether the specified context 
is/was expected or within expectations.
- dpv:StatusExpected: the state where the specified context is/was 
expected to occur or is/was within expectations
- dpv:StatusUnexpected: the state where the specified context is/was not 
expected to occur or is/was not within expectations

Examples (data subject's POV for a service):

# collect emails
ex:001 a dpv:PersonalDataHandling ;
   dpv:hasProcessing dpv:Collect ;
   dpv:hasPersonalData pd:Email ;
   dpv:hasStatus dpv:StatusExpected .

# conduct profiling
ex:002 a dpv:PersonalDataHandling ;
   dpv:hasProcessing dpv:Profiling ;
   dpv:hasStatus dpv:StatusUnexpected .
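
The controller-to-processor case mentioned above could look like the following sketch. The identifiers ex:Review, ex:Acme and ex:Beta and the descriptions are hypothetical, and the proposal does not (yet) say whose expectation a status reflects, so here that is only carried by the description text.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix ex:  <https://example.com/> .

# Hypothetical: Acme (controller) reviews what Beta (processor) does
ex:Review a dpv:PersonalDataHandling ;
    dct:description "Acme's review of processing by Beta"@en ;
    dpv:hasDataController ex:Acme ;
    dpv:hasDataProcessor ex:Beta ;
    dpv:hasPersonalDataHandling [
        # within Acme's expectations
        dct:description "Storage as instructed by Acme"@en ;
        dpv:hasProcessing dpv:Store ;
        dpv:hasStatus dpv:StatusExpected
    ] , [
        # outside Acme's expectations
        dct:description "Sharing found during the review"@en ;
        dpv:hasProcessing dpv:Share ;
        dpv:hasStatus dpv:StatusUnexpected
    ] .
```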

---

# Determination

An indication of which entity "determined" the context. This could be 
used for anything, e.g. data collection, purpose, recipient, technical 
measure, specific encryption protocol. It is useful for covering the 
"determination of means and purposes" aspect of investigations under GDPR. 
In most cases, the specified controller should be the one who determined 
the activities - but ex-post analysis may reveal that other entities 
were involved in the determination as well (which could make them 
joint-controllers under GDPR).

- dpv:isDeterminedByEntity: an indication of who (entity) determined the 
specified context. Determination reflects the entities involved in 
deciding or influencing how/why/where/when/etc. the specified context 
should be carried out.

# Acme uses Beta's emailing services
# Beta decides what data is needed (email)
# Beta decides how emails are collected and stored
# Acme decides how emails are used to provide Service
ex:001 a dpv:PersonalDataHandling ;
   dpv:hasDataController ex:Acme ;
   dpv:hasDataProcessor ex:Beta ;
   dpv:hasPersonalDataHandling [
     dpv:hasPurpose dpv:ServiceProvision ;
     dpv:isDeterminedByEntity ex:Acme ;
   ] ;
   dpv:hasPersonalDataHandling [
     dpv:hasProcessing dpv:Collect, dpv:Store ;
     dpv:hasPersonalData pd:Email ;
     dpv:isDeterminedByEntity ex:Beta ;
   ] .

---

Regards,
Harsh

On 12/10/2023 23:14, Harshvardhan J. Pandit wrote:
> Hi.
> Upon thinking some more about this, I like Beatriz's suggestion of going 
> with statuses, but to also have entity specific concepts. See below 
> notes and examples. In the process, I discovered two more concepts: 
> Determination and Expected/Unexpected. Please scrutinise these with due 
> diligence.
> 
> # Active/Passive
> - represents 'active involvement' of an entity
> - cannot work on its own as a status without a subject i.e. who/what is 
> active?
> - e.g. Active - Data Subject? Controller? Recipient?
> - unlikely to change, active subject won't become passive
> - use status per entity i.e. DataSubjectActive, ControllerActive
> - can also be categories, but status for consistency with other concepts
> 
> # Informed/Uninformed
> - represents whether the specified entity was informed about the 
> associated processing or context
> - same as active, requires a subject
> - e.g. Informed - Data Subject? Controller? Recipient?
> - likely to change, uninformed subject can become informed
> - 'informed' is contextual - subject may be uninformed elsewhere
> - use status per entity i.e. DataSubjectInformed, ControllerInformed
> 
> # Intended/Unintended
> - Intent can be approached from either side: e.g. Customers as data 
> subjects were intended by the Controller - here the target concept is 
> Intended Data Subjects; or it was the Controller's intent to have 
> Customers as data subjects - here the target concept is the Controller's 
> intent.
> - we should use intent for the first i.e. applicability, and for the 
> second we should use 'determination' as a concept (see later)
> - intended (thereby) represents whether the specified context was 
> intended by the responsible entity
> - intent is not likely to change
> - use generic status i.e. StatusIntended, StatusUnintended - the context 
> is sufficient to state the subject i.e. data subject is intended
> - Question: do we need to distinguish the perspective here e.g. 
> controller's perspective of being intended vs data subject's?
> 
> # NEW: Determination
> - determination represents who decides or determines the processing or 
> context specified e.g. purpose is determined by controller or data subject
> - new relation `isDeterminedBy` to indicate the concept was determined 
> by the indicated entity
> - can be an accompaniment to Intended to denote whose determination (or 
> purpose and means of processing) causes the intention to materialise
> - is crucial to understand accountability and involvement e.g. EDPB 
> guidelines on Controllers and Processors
> - is also a nice addition to distinguish difference in determination by 
> providers and consumers within processing activities
> 
> # NEW: Expected/Unexpected
> - see 
> https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/lawful-basis/legitimate-interests/what-is-the-legitimate-interests-basis/
> - intent is when you decide, expect is when you don't decide (Mark will 
> probably like seeing expected as a concept)
> - e.g. a controller has both intended and expected processing from a 
> processor;
> - e.g. a processor only has intended processing on instruction from 
> controller;
> - e.g. a person (reasonably) expects what data is being processed is 
> different from the person intending some data to be processed
> - Question: whether something can ever be Intended and Unexpected, or 
> Unintended but Expected? Seems unlikely for the same entity, however it 
> can happen that one entity intends something that is unexpected to 
> another entity.
> - Caveat: being informed does not (by itself) make something expected - 
> that requires being informed to be followed by comprehension to create 
> the expectation
> 
> # --- Implementation Examples --- #
> 
> ```turtle
> @prefix dpv: <https://w3id.org/dpv#> .
> @prefix ex: <https://example.com/> .
> 
> # Method 1: Explicit concepts for every variation - NOT SUITABLE
> # e.g. Intended: Data Subject, Processing, etc.
> # Pros: explicit, directly usable
> # Cons: Lots of "Intended" concepts
> ex:PDH1 dpv:hasDataSubject [
>          a dpv:Customers ;
>          a dpv:IntendedDataSubject ;
>      ] ;
>      dpv:hasProcessing [
>          a dpv:Collect ;
>          a dpv:IntendedProcessing ;
>      ] .
> 
> # Method 2: Generic status for categories - SUITABLE
> # e.g. StatusIntended, StatusUnintended
> # Pros: less concepts, can be used in any context
> # Cons: requires complex 'nesting' inside concepts
> ex:PDH2 dpv:hasDataSubject [
>          a dpv:Customers ;
>          dpv:hasStatus dpv:StatusIntended ;
>      ] ;
>      dpv:hasProcessing [
>          a dpv:Collect ;
>          dpv:hasStatus dpv:StatusIntended ;
>      ] .
> 
> # Method 3: Same as Method 2, but to use at PDH level - IDEAL
> # Pros: can neatly indicate which 'activities' are intended
> # Cons: requires discipline - everything in that PDH is intended
> ex:PDH3 dpv:hasPersonalDataHandling [
>          dpv:hasDataSubject dpv:Customers ;
>          dpv:hasProcessing dpv:Collect ;
>          dpv:hasStatus dpv:StatusIntended ;
>      ] ;
>      dpv:hasPersonalDataHandling [
>          dpv:hasDataSubject dpv:Pedestrians ;
>          dpv:hasProcessing dpv:Collect ;
>          dpv:hasStatus dpv:StatusUnintended ;
>      ].
> ```
> 
> Regards,
> Harsh
> 
> On 09/10/2023 12:20, Harshvardhan J. Pandit wrote:
>> Addendum: these categories also apply to other entities e.g. 
>> Controllers -- whether the processing was intended or not, whether 
>> the Controller had an active involvement in the processing, and 
>> whether the Controller was informed about the processing.
>>
>> Whether this information should be in scope (IMHO - strongly yes to 
>> represent facts) and whether we should model this with the same or 
>> different concepts is to be discussed. I am leaning towards separate 
>> concepts for Processing and Data Subjects.
>>
>> - Harsh
>>
>> On 09/10/2023 12:12, Harshvardhan J. Pandit wrote:
>>> Hi. To answer in order:
>>>
>>> Art's question of whether these would be 6 categories - yes.
>>> - Intended / Unintended
>>> - Active / Passive
>>> - Informed / Uninformed
>>>
>>> Beatriz's question on modelling these as statuses.
>>> - That's a good question. tldr; status does seem a better 'semantic 
>>> model', but is also used as a category in common use.
>>> - We use 'Status' in DPV to provide context to another concept with 
>>> the expectation that that context will change. In this case, only the 
>>> Informed/Uninformed categorisation seems likely to change. The 
>>> Active/Passive and Intended/Unintended are categorisation of data 
>>> subjects that do not seem likely to change, but can still be statuses.
>>> - If you want to model this information on a data subject 
>>> group/individual level, then status can be useful e.g. a specific 
>>> individual - was informed or not? Same can be achieved with a 
>>> category e.g. data subject is of 'type' informed.
>>> - One benefit of statuses over categories is to indicate within 
>>> processing policies whether data subjects have been informed as a way 
>>> to keep track of it e.g. hasDataSubjectStatus <Informed>. This is in 
>>> addition to using hasNotice <Notice> to indicate the information.
>>> - Active/Passive can similarly be statuses to depict "involvement"
>>> - Intended/Unintended should be categories
>>>
>>> Mark's question on whether it is possible to represent status of 
>>> notice as being current - Conformant/NonConformant concepts exist 
>>> which can be used here with whatever criteria for conformance you 
>>> want to indicate it with.
>>>
>>> Regards,
>>> Harsh
>>>
>>> On 02/10/2023 20:41, Mark Lizar wrote:
>>>> +1, this works well for notice signalling.
>>>>
>>>> And to extend what Beatriz mentions as far as status, active and 
>>>> informed. To this point, has the state of the status been 
>>>> considered in modelling?
>>>>
>>>> E.g. Is the state of notice current, or not current, to indicate if 
>>>> privacy is as expected or not.
>>>>
>>>> Best,
>>>>
>>>> Mark
>>>>
>>>>
>>>>> On Oct 2, 2023, at 9:46 AM, beatriz.gesteves 
>>>>> <beatriz.gesteves@upm.es> wrote:
>>>>>
>>>>> Dear Delaram,
>>>>>
>>>>> I support the addition of these concepts.
>>>>>
>>>>> A question: since these concepts would be useful to use with other 
>>>>> types of entities/data subjects (e.g., data subject of type 
>>>>> dpv:Citizen is uninformed), already modelled in DPV, have you 
>>>>> considered modelling it as a status (similarly to other statuses 
>>>>> that we have in DPV e.g. activity statuses)? Or would the idea be 
>>>>> to use as many data subject types as needed based on the use case?
>>>>>
>>>>> Best,
>>>>>
>>>>> Beatriz
>>>>>
>>>>>
>>>>> On 02-10-2023 13:32, Arthit Suriyawongkul wrote:
>>>>>
>>>>>>
>>>>>>> On 2 Oct 2023, at 09:08, Delaram Golpayegani 
>>>>>>> <delaram.golpayegani@adaptcentre.ie> wrote:
>>>>>>>
>>>>>>> *Active Data Subject:* The data subjects who are aware of and 
>>>>>>> have given consent to collection and processing of their data, 
>>>>>>> e.g. an examinee sitting on an online exam proctored by an 
>>>>>>> AI-based system.
>>>>>>>
>>>>>>> *Passive Data Subject*: The data subjects who are not aware of 
>>>>>>> collection and processing of their data, e.g. a passenger, 
>>>>>>> passing the border control check, whose data is being processed 
>>>>>>> for migration monitoring.
>>>>>> Support the addition. Going to be very useful.
>>>>>>
>>>>>> "Not aware" may not fully cover the passiveness here. A passenger 
>>>>>> who has some knowledge about the border control (previous 
>>>>>> knowledge or reading a sign at the port) is aware of the collection.
>>>>>> From the examples of the online exam proctor and border control, one 
>>>>>> possible Active / Passive cutting point is probably whether the 
>>>>>> data subject is directly involved in the collection process during 
>>>>>> the data collection. In the first example, the data subject can see 
>>>>>> the camera and knows that the camera is part of the exam process. 
>>>>>> They may also enter some personal data themselves. Compare this to 
>>>>>> the second example, where the data could be processed well before 
>>>>>> the passenger enters the port (in the case of arranged travel where 
>>>>>> such data is required by regulation, like air flights).
>>>>>> So I think the examples here will be more for Informed Data 
>>>>>> Subject and Uninformed Data Subject, as Harsh discussed the sense 
>>>>>> of #1 earlier.
>>>>>> Which would give us six categories here?
>>>>>> - Intended / Unintended
>>>>>> - Active / Passive
>>>>>> - Informed / Uninformed
>>>>>> Cheers,
>>>>>> Art
>>>>
>>>
>>
> 

-- 
---
Harshvardhan J. Pandit, Ph.D
Assistant Professor
ADAPT Centre, Dublin City University
https://harshp.com/

Received on Tuesday, 24 October 2023 10:49:05 UTC