- From: Harshvardhan J. Pandit <me@harshp.com>
- Date: Fri, 26 May 2023 22:46:20 +0100
- To: "public-dpvcg@w3.org" <public-dpvcg@w3.org>
Hi.

Please see below the proposal for data breach concepts based on discussions so far. The proposal is to add a data breach extension with IRI https://w3id.org/dpv/databreach (indicated using db: below). The prefix ex: represents example concepts that the adopter or implementor will create.

Note: Security reporting requires strict and clear information, i.e. it must be explicit whether something is or isn't applicable. Under the open world assumption of the semantic web, information that is not specified can mean it is not applicable, not available, or something else - this ambiguity is problematic for cases such as data breach reporting. To address it, we should add special instances in DPV for explicitly indicating the state of information. These are grouped under the status `InformationAvailability` and can be used anywhere:

1) dpv:UnknownApplicability - a special instance indicating information is unknown, i.e. it is not known if the information exists or is applicable, and therefore statements about its availability cannot be made (yet)
2) dpv:NotApplicable - a special instance indicating context is not applicable
3) dpv:NotAvailable - a special instance indicating the information is applicable but is not yet available

-- Data Breach extension --

Data Breach information has six parts.

Part 1) Information about the Data Breach itself, i.e. when did it occur, who was the cause, etc. This is represented with the concept `DataBreach`.
Part 2) Investigation of the Data Breach, i.e. a report specifying when it was detected, what has been affected, and who it has been reported to. This is represented using the concept `DataBreachReport` with specific sub-types.
Part 3) Notifications, i.e. communication about the breach between entities, e.g. Controller to DPA. This is represented using the concept `DataBreachNotification`.
Part 4) Impact Assessment that assesses the risks and impacts based on available information. This is represented using the concept `DataBreachImpactAssessment`.
Part 5) Breach Mitigation assessment, which identifies the causes of the breach and specifies what changes were made to prevent further breaches or mitigate consequences of the existing breach. This is represented using the concept `DataBreachMitigationAssessment`.
Part 6) Details about the organisation, including its establishments and lead authorities for handling the data breach.

-- DataBreach --

Data Breach is an event, which means it has temporal properties. These are indicated using dct:temporal, which can specify the start (if known) and end (if known). If both are not available, then use dpv:NotAvailable.

The 'source' of a breach is represented using the concept `dpv:RiskSource`, which is also a specific event whose existence led to the breach taking place. For example, an employee left their desk unattended without locking it down. In addition, a `risk:Vulnerability` can be specified to indicate that some weakness was exploited. This is not required; for example, an accident can occur without any vulnerability being involved.

The 'cause', i.e. the actor of the breach, whether intentional or unintentional, and regardless of malicious intent or accident, is represented by `dpv:ThreatActor`. In the above case, the threat actor is whoever accessed the system after the employee left - this could be another employee, a cleaner, or someone else. The existence of an actor is not necessary for a breach to take place; for example, a disk drive containing sensitive data being thrown away is a data breach regardless of whether someone manages to collect it or not.

The 'status' of a data breach refers to whether the breach has been concluded or is still ongoing.
This is represented by `DataBreachStatus` with possible values `DataBreachStatusUnknown`, `DataBreachOngoing`, `DataBreachHalted`, `DataBreachConcluded`, `DataBreachTerminated`, and `DataBreachMitigated`. Here, Halted refers to a breach being stopped but with uncertainty regarding it being resumed (which makes it ongoing again). Concluded means it has finished (on its own), Terminated means the actions of the entity have caused it to be stopped, and Mitigated means it has also been prevented from happening again.

The 'type' of a data breach refers to its categorisation. The DPAs suggest three categories - confidentiality breach where authorisation has failed, integrity breach where data has been modified, and availability breach where data access has been compromised. These explanations are simplified, and their origin is in information security (CIA model). Specifying the type of breach is important as it determines which 'definition' of a breach should be applied and informs the follow-up investigations and reporting. The typical notion of a data breach occurring only when someone else has access to data is just one of these definitions. The type of breach is indicated using `rdf:type` with values `db:ConfidentialityBreach`, `db:AvailabilityBreach`, and `db:IntegrityBreach`.

A data breach can have several identifiers - for example those within an organisation, across organisations, or in correspondence with authorities. To indicate these, the concept `db:DataBreachIdentifier` is provided, to be used with `dpv:hasIdentifier`. The separate concept allows further categorisation of identifiers and, more importantly, distinguishing who has provided which identifier, e.g. using `dct:creator` and `dct:publisher`.

To indicate speculative information about data breaches, e.g.
for planning and risk assessment purposes, DataBreach is defined as a subclass of dpv:Risk, which means we can specify `dpv:hasRisk db:DataBreach` and `dpv:isMitigatedByMeasure` to indicate potential breaches and how they have been addressed.

Examples:

ex:Incident1A a db:DataBreach ;
    rdf:type db:ConfidentialityBreach ; # type of breach
    dct:temporal db:Unknown ; # start and end are unknown
    dpv:hasRiskSource db:Unknown ; # what caused the breach
    dpv:hasThreatActor db:Unknown ; # who caused the breach
    dpv:hasStatus db:DataBreachOngoing . # status of breach

ex:Incident1B a db:DataBreach ;
    rdf:type db:IntegrityBreach ; # type of breach
    dct:temporal "2023-05-24/2023-05-26" ; # start and end
    dpv:hasRiskSource ex:LackOfSecurityTraining ; # what caused the breach
    dpv:hasVulnerability ex:WeakAuthentication ; # what failed or went wrong
    dpv:hasThreatActor ex:MaliciousHacker ; # who caused the breach
    dpv:hasStatus db:DataBreachConcluded . # status of breach

ex:WeakAuthentication a risk:Vulnerability ;
    dpv:isImplementedUsingTechnology ex:Software ; # vulnerability of what
    dpv:isImplementedByEntity ex:Processor . # entity where breach took place

-- Data Breach Detection Report --

The concept `db:DataBreachDetectionReport` specifies the reporting of a data breach being detected, along with any pertinent details about the detection itself. This is necessary to separate information about the data breach itself from information about when it was detected, for example to denote when an entity became 'aware' of the breach as distinct from the temporal properties of the breach itself. As there can be multiple entities involved in a breach, e.g. processor and controller, they will each have their own detection report.

The usual DCMI properties can be utilised here, e.g. `dct:subject` to indicate which data breach is the subject of this report, `dct:created` to indicate when the report was created - and hence when the breach was first 'detected', and `dct:creator` to indicate who created the report.
To report subsequent updates, `dct:modified` is available to indicate further changes.

To indicate the source of information, for example in connection with who reported the breach, the property `dpv:hasDataSource` should be used, e.g. with values Employees, a specific Data Subject, a link to a news item, etc.

To specify any communications providing information about the breach (detection), the property `dpv:hasNotice` should be used. This can be incoming information (entity is recipient) or outgoing (entity is sender). To specify the information contents of a notice as a form of communication, `schema:Message` can be used.

To report on the status of detection (as a form of investigation), the existing `dpv:ActivityStatus` concepts can be used.

Examples:

ex:IncidentReport2A a db:DataBreachDetectionReport ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasDataSource ex:Employee ; # breach was reported by an employee
    dpv:hasDataSource <https://nytimes.com> ; # breach was reported in a news item
    dpv:hasDataSource ex:Processor ; # breach was reported by a Processor
    dpv:hasActivityStatus dpv:ActivityCompleted . # status of the detection reporting

ex:IncidentReport2B a db:DataBreachDetectionReport ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasDataSource ex:Processor ; # breach was reported by a Processor
    dpv:hasNotice ex:ProcessorReportsBreach ; # breach info sent by Processor
    dpv:hasNotice ex:ReportedBreachToAuthority ; # breach reported to DPA
    dpv:hasActivityStatus dpv:ActivityCompleted .
    # status of the detection reporting

ex:ProcessorReportsBreach a db:DataBreachNotice, schema:Message ;
    dct:subject ex:Incident1A ;
    schema:dateReceived "2023-05-24" ; # when the message was received
    schema:sender ex:Processor ; # who sent it
    schema:recipient ex:Controller ; # who received it
    schema:messageAttachment <report.pdf> . # what were the contents

ex:ReportedBreachToAuthority a db:DataBreachNotice, schema:Message ;
    dct:subject ex:Incident1A ;
    schema:dateSent "2023-05-24" ;
    schema:sender ex:Controller ;
    schema:recipient ex:DPC ;
    schema:messageAttachment <report.pdf> ;
    dpv:hasJustification [ # justification if reporting is >72 hours
        a db:DataBreachDelayedReportingJustification ;
        dct:description "here's what actually happened ..."@en ;
    ] .

-- Data Breach Investigation Report --

Following from detection, a preliminary investigation report needs to be drafted for cases when the breach has to be reported, e.g. within 72 hours. This is represented by `db:DataBreachPreliminaryReport`, which is a subclass of `db:DataBreachReport`. As with `db:DataBreachDetectionReport`, the properties `dct:subject`, `dct:created`, and `dct:creator` are applicable here. A preliminary investigation report is expected to contain more details than were available at the time of detection.

Other subclasses of `db:DataBreachReport` are associated with the stage of investigation, represented by `db:DataBreachOngoingReport` and `db:DataBreachConcludingReport`. These can be tied together into a common group through an instance of `db:DataBreachReport`, for example by using `dct:hasPart`.

What personal data has been affected is indicated using `dpv:hasPersonalData` and specific categories of data, including `dpv:SpecialCategoryPersonalData`. To indicate the scale of data, `dpv:hasDataVolume` should be used with a qualifier, e.g. `dpv:HugeDataVolume`, and with a quantifier, also using `dpv:hasDataVolume`, to indicate the actual number of data records affected.
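As a minimal sketch of the two patterns just described, the following groups staged reports with `dct:hasPart` and records data volume with both a qualifier and a quantifier. The prefix IRIs (including the trailing `#` on the proposed db: namespace), the `xsd:integer` datatype, and all ex: names other than ex:Incident1A are assumptions made for illustration, not part of the proposal.

```turtle
# Prefix IRIs are assumptions; db: is only a proposed extension at this point.
@prefix dpv: <https://w3id.org/dpv#> .
@prefix db:  <https://w3id.org/dpv/databreach#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <https://example.com/> .

# Hypothetical grouping report that ties the staged reports together
ex:IncidentReportGroup a db:DataBreachReport ;
    dct:subject ex:Incident1A ;
    dct:hasPart ex:PreliminaryReport, ex:OngoingReport, ex:ConcludingReport .

ex:PreliminaryReport a db:DataBreachPreliminaryReport ;
    dct:subject ex:Incident1A ;
    dpv:hasPersonalData dpv:SpecialCategoryPersonalData ;
    dpv:hasDataVolume dpv:HugeDataVolume ;        # qualifier
    dpv:hasDataVolume "300000"^^xsd:integer .     # quantifier: records affected

ex:OngoingReport a db:DataBreachOngoingReport ;
    dct:subject ex:Incident1A .

ex:ConcludingReport a db:DataBreachConcludingReport ;
    dct:subject ex:Incident1A .
```

Typing the record count as an integer is one option; a plain string literal would also fit the description, at the cost of being harder to query numerically.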
To indicate data subjects affected, `dpv:hasDataSubject` is similarly to be used, with `dpv:hasDataSubjectScale` serving as both the qualifier and the quantifier.

In addition to these, DPAs require reporting whether there has been any cross-border context to the data breach, including data subjects or processing activities in multiple member states. To indicate that data subjects are from specific jurisdictions, the property `dpv:hasJurisdiction` should be used. To indicate processing activities, `dpv:hasProcessing`, `dpv:hasProcessingScale`, and `dpv:hasJurisdiction` can similarly be used.

To indicate what technologies were involved, we have `dpv:isImplementedUsingTechnology`, and to specify who implemented them - `dpv:isImplementedByEntity`.

The personal data, data subjects, processing, and other details can be grouped using `dpv:PersonalDataHandling` to indicate separation, such as for jurisdictions affected, or technologies affected. The granularity of this information is unbounded, as the 'graph' of what was affected can be as large or small as required. Going by the reporting forms, in most cases it is sufficient to indicate abstract or summary information at this stage.

To indicate risks we have `dpv:hasRisk`, for consequences we have `dpv:hasConsequence`, and for impacts we have `dpv:hasImpact` - and their affected-entity variants. These can be specified directly, or through an impact assessment represented by `db:DataBreachImpactAssessment` and `dpv:hasImpactAssessment`. Note that the concept of impact assessment here refers to impact on or for data subjects - the internal assessment of impact (e.g. loss of business) is not part of this impact assessment and could be represented separately through a risk assessment.

Data breach reporting requires explicitly indicating whether there will be an impact on fundamental rights - most commonly as a Yes / No option.
To indicate this, we need the concept `risk:ImpactOnFundamentalRights` in the risk vocabulary, which is expressed using `dpv:hasImpact`. To express whether this will take place or not, `risk:hasLikelihood` should be used. I do not think there can ever be a likelihood of 0 - so the lowest value is extremely unlikely, which is 0.1 (or something akin to it). If this information is not available, then `dpv:NotAvailable` should be used.

Data breach reporting requires information on what technical and organisational measures were in place before the breach, deficiencies identified, and any changes or additional measures taken to address the breach. To specify the affected TOMs, we have `dpv:hasTechnicalMeasure` and `dpv:hasOrganisationalMeasure`. To specify limitations or failure, we have `risk:hasVulnerability`, which together with `dpv:isMitigatedByMeasure` and `db:DataBreachMitigationMeasure` indicates that the data breach has been addressed.

Examples:

ex:IncidentReport3A a db:DataBreachPreliminaryReport ;
    rdfs:comment "services/processes affected by the breach" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasDataSource ex:Processor ; # breach was reported by a Processor
    dpv:hasNotice ex:ReportedBreachToAuthority ; # breach reported to DPA
    dpv:hasConsequenceOn ex:ServiceA, ex:PDH9 ; # services affected
    dpv:hasConsequenceOn db:NotAvailable ; # further consequences not known yet
    dpv:hasActivityStatus dpv:ActivityCompleted .
    # status of report

ex:IncidentReport3B a db:DataBreachPreliminaryReport ;
    rdfs:comment "data, subjects, processing, technologies affected" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasPersonalData dpv:SpecialCategoryPersonalData ; # special category
    dpv:hasDataVolume dpv:HugeDataVolume ;
    dpv:hasDataSubject dpv:VulnerableDataSubject ; # vulnerable subjects
    dpv:hasDataSubjectScale dpv:LargeScaleOfDataSubjects ;
    dpv:hasProcessing dpv:Store ;
    dpv:hasProcessingScale dpv:LargeScaleProcessing ;
    dpv:hasJurisdiction legal:IE, legal:FR, legal:DE ;
    dpv:isImplementedUsingTechnology ex:Software ;
    dpv:hasActivityStatus dpv:ActivityCompleted . # status of report

ex:IncidentReport3C a db:DataBreachPreliminaryReport ;
    rdfs:comment "indicating jurisdiction of data subjects and processing" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasPersonalData [
        rdfs:subClassOf dpv:SpecialCategoryPersonalData ;
        dpv:hasDataVolume dpv:HugeDataVolume ;
        dpv:hasDataVolume "300000" ; # records affected
    ] ;
    dpv:hasDataSubject [
        rdfs:subClassOf dpv:VulnerableDataSubject ;
        dpv:hasJurisdiction legal:IE, legal:FR, legal:DE ;
        dpv:hasDataSubjectScale dpv:LargeScaleOfDataSubjects ;
        dpv:hasDataSubjectScale "300000" ; # people affected
    ] ;
    dpv:hasProcessing [
        rdf:type dpv:Store ;
        dpv:hasJurisdiction legal:IE, legal:FR, legal:DE ;
        dpv:hasProcessingScale dpv:LargeScaleProcessing ;
    ] ;
    dpv:hasActivityStatus dpv:ActivityCompleted .
    # status of report

ex:IncidentReport3D a db:DataBreachPreliminaryReport ;
    rdfs:comment "grouping using personal data handling" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasPersonalDataHandling [
        a dpv:PersonalDataHandling ;
        dpv:hasService ex:ServiceA ;
        dpv:hasPersonalData dpv:SpecialCategoryPersonalData ;
        dpv:hasDataVolume dpv:HugeDataVolume ;
        dpv:hasDataVolume "300000" ;
        dpv:hasDataSubject dpv:VulnerableDataSubject ;
        dpv:hasJurisdiction legal:IE, legal:FR, legal:DE ;
        dpv:hasDataSubjectScale dpv:LargeScaleOfDataSubjects ;
        dpv:hasDataSubjectScale "300000" ;
        dpv:hasProcessing dpv:Store ;
        dpv:hasProcessingScale dpv:LargeScaleProcessing ;
        dpv:isImplementedByEntity ex:Processor ;
    ] ;
    dpv:hasPersonalDataHandling [
        a dpv:PersonalDataHandling ;
        dpv:hasService ex:ServiceB ;
        dpv:hasPersonalData dpv:Email ;
        dpv:hasDataVolume dpv:SmallDataVolume ;
        dpv:hasDataSubject dpv:User ;
        dpv:hasDataSubjectScale dpv:SmallDataSubjectScale ;
        dpv:isImplementedByEntity ex:Controller ;
    ] ;
    dpv:hasActivityStatus dpv:ActivityCompleted . # status of report

ex:IncidentReport3E a db:DataBreachPreliminaryReport ;
    rdfs:comment "indicating risks and impacts" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasRisk [
        rdf:type risk:RiskToFundamentalRights ;
        dpv:hasLikelihood risk:ExtremelyLowLikelihood ;
    ] ;
    dpv:hasImpactAssessment ex:SomeImpactAssessment ;
    dpv:hasActivityStatus dpv:ActivityCompleted .
    # status of report

ex:IncidentReport3F a db:DataBreachPreliminaryReport ;
    rdfs:comment "indicating risks and impacts separately for groups" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasPersonalDataHandling [
        a dpv:PersonalDataHandling ;
        dpv:hasService ex:ServiceA ;
        dpv:hasPersonalData dpv:SpecialCategoryPersonalData ;
        dpv:hasDataSubject dpv:VulnerableDataSubject ;
        dpv:hasRisk [
            rdf:type risk:RiskToFundamentalRights ;
            dpv:hasLikelihood risk:HighLikelihood ;
        ] ;
        dpv:hasImpactAssessment ex:SomeImpactAssessment ;
    ] ;
    dpv:hasPersonalDataHandling [
        a dpv:PersonalDataHandling ;
        dpv:hasService ex:ServiceB ;
        dpv:hasPersonalData dpv:Email ;
        dpv:hasDataVolume dpv:SmallDataVolume ;
        dpv:hasRisk [
            rdf:type risk:RiskToFundamentalRights ;
            dpv:hasLikelihood risk:ExtremelyLowLikelihood ;
        ] ;
        dpv:hasImpactAssessment ex:SomeImpactAssessment ;
    ] ;
    dpv:hasActivityStatus dpv:ActivityCompleted . # status of report

ex:IncidentReport3G a db:DataBreachPreliminaryReport ;
    rdfs:comment "indicating measures taken in response to the breach" ;
    dct:subject ex:Incident1A ; # which data breach this report refers to
    dct:created "2023-05-26T14:38:00" ; # when this report was created
    dct:creator ex:CompanyAlpha ; # who created the report
    dpv:hasTechnicalMeasure [
        a dpv:PasswordAuthentication ;
        risk:hasVulnerability ex:WeakAuthentication ;
    ] ;
    dpv:hasTechnicalMeasure [
        a dpv:CryptographicAuthentication, db:DataBreachMitigationMeasure ;
        risk:mitigatesVulnerability ex:WeakAuthentication ;
    ] ;
    dpv:hasOrganisationalMeasure [
        a dpv:Training, db:DataBreachMitigationMeasure ;
        risk:mitigatesRiskSource ex:LackOfSecurityTraining ;
    ] ;
    dpv:hasActivityStatus dpv:ActivityCompleted . # status of report

ex:DataBreach a dpv:Risk ;
    risk:hasRiskSource ex:LackOfSecurityTraining .

ex:LackOfSecurityTraining a risk:RiskSource ;
    risk:hasVulnerability ex:WeakAuthentication .

ex:WeakAuthentication a risk:Vulnerability ;
    risk:isVulnerabilityOf ex:Software .

-- Data Breach Notifications --

The earlier example showed messages being passed to inform about data breaches, for example from processor to controller, and from controller to the DPA. The message contents are expected to be accompanied by a report of the appropriate form, e.g. for detection, preliminary or final investigation - containing the impact assessment. Other than these, there is also a notification to the data subjects that must be represented. This is also done using the same concepts.

Communications to authorities are represented using `db:AuthorityDataBreachNotice`. Those from processors to controllers (or other processors) are represented using `db:ProcessorDataBreachNotice`. Those coming from controllers (to an authority or other controllers) are represented using `db:ControllerDataBreachNotice`.

Notifications to data subjects differ in that there may be actions for them to take, for example to safeguard themselves against adverse impacts. To represent these, the concept `db:DataSubjectBreachNotice` is to be used. To indicate actions to be taken by the data subjects, the simplest representation would be to indicate `dpv:hasRisk` or `dpv:hasImpact`, along with `dpv:Likelihood` and `dpv:Severity`, and then use `dpv:isMitigatedByMeasure` to indicate what can be done.

To indicate the medium of the notification, `dct:medium` should be used. Other DCMI concepts such as `dct:format`, `dct:instructionalMethod`, and `dct:language` can also be used.

To indicate whether notifications are being planned, are ongoing, or have been completed - `dpv:ActivityStatus` can be used.
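As a minimal sketch of a completed notice addressed to data subjects, the following combines `db:DataSubjectBreachNotice` with a risk, a suggested mitigation, and the DCMI medium and language properties. The prefix IRIs, the literal values for `dct:medium` and `dct:language`, and the name ex:DataSubjectNoticeSketch are assumptions for illustration only; the remaining terms are taken from this proposal and may still change.

```turtle
# Prefix IRIs are assumptions; db: is only a proposed extension at this point.
@prefix dpv:    <https://w3id.org/dpv#> .
@prefix risk:   <https://w3id.org/dpv/risk#> .
@prefix db:     <https://w3id.org/dpv/databreach#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix schema: <https://schema.org/> .
@prefix ex:     <https://example.com/> .

# Hypothetical notice sent only to the data subjects affected by the breach
ex:DataSubjectNoticeSketch a db:DataSubjectBreachNotice, schema:Message ;
    dct:subject ex:Incident1A ;                    # which breach the notice is about
    schema:sender ex:Controller ;
    schema:recipient db:BreachedDataSubjects ;
    dct:medium "email" ;                           # assumed literal value
    dct:language "en" ;                            # assumed literal value
    dpv:hasRisk [
        a risk:Fraud ;
        risk:hasLikelihood risk:HighLikelihood ;
        dpv:isMitigatedByMeasure "check email authenticity" ;
    ] ;
    dpv:hasActivityStatus dpv:ActivityCompleted .  # notification has been sent
```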

Examples:

ex:ProcessorReportsBreach a db:ProcessorDataBreachNotice, schema:Message ;
    rdfs:comment "Processor to Controller" ;
    dct:subject ex:Incident1A ;
    schema:dateReceived "2023-05-24" ; # when the message was received
    schema:sender ex:Processor ; # who sent it
    schema:recipient ex:Controller ; # who received it
    schema:messageAttachment <report.pdf> . # what were the contents

ex:ControllerReportsBreach a db:AuthorityDataBreachNotice, schema:Message ;
    rdfs:comment "Controller to Authority" ;
    dct:subject ex:Incident1A ;
    schema:dateSent "2023-05-24" ; # when the message was sent
    schema:sender ex:Controller ; # who sent it
    schema:recipient ex:DPA-IE ; # who received it
    schema:messageAttachment <report.pdf> . # what were the contents

ex:DataSubjectNotified a db:DataSubjectBreachNotice, schema:Message ;
    rdfs:comment "Controller to Data Subjects" ;
    dct:subject ex:Incident1A ; # can provide all or some info about breach
    schema:dateSent "2023-05-24" ; # when the message was sent
    dct:temporal "<start>/<end>" ; # period in which the messages were sent
    schema:sender ex:Controller ; # who sent it
    schema:recipient dpv:User ; # all users received this
    schema:recipient db:BreachedDataSubjects ; # only breach affected subjects
    dct:medium "website", "public announcement" ; # notifications provided
    dpv:hasPersonalData pd:Email ; # what data has been breached
    dpv:hasImpact [
        a risk:Fraud ;
        risk:hasLikelihood risk:HighLikelihood ;
        dpv:isMitigatedByMeasure "check email authenticity" ;
        dct:instructionalMethod "link to educate about emails" ;
    ] ;
    dpv:hasActivityStatus dpv:ActivityProposed ;
    schema:messageAttachment <report.pdf> . # what were the contents

-- Organisation Details --

Reporting a breach requires information about the organisation. The DPC form asks for name (`dpv:hasName`), address (`dpv:hasAddress`), whether there are EU/EEA establishments, sector (Public, Private, Charity, Voluntary), sub-sector (NACE taxonomy), and the internal ID for the breach.
Of these, the internal breach ID should be specified using `db:DataBreachIdentifier` and `dct:publisher` as indicated earlier. Of the rest, sector (e.g. public) is represented as a type of organisation (`dpv:PublicOrganisation`). The sub-sector is represented using `dpv:hasSector` and the NACE code. For EU/EEA establishments, we add the concept `dpv:Establishment` and use this along with `dpv:isMainEstablishmentFor` to indicate the jurisdiction, or use `dpv:isEstablishmentOf` to indicate the parent company. To indicate the Processor, use `dpv:hasDataProcessor`.

Examples:

ex:Controller a dpv:DataController ;
    a dpv:PrivateOrganisation ;
    dpv:hasSector "<NACE>" ;
    dpv:hasDataProcessor ex:Processor ;
    dpv:isEstablishmentOf ex:ParentController ;
    dpv:isMainEstablishmentFor legal:FR . # France

ex:ParentController a dpv:DataController ;
    a dpv:PrivateOrganisation ;
    dpv:hasEstablishment ex:Controller ;
    dpv:isMainEstablishmentFor legal:EU .

-- Next Steps --

Go through the data breach concepts again - see if there is anything unclear or missing. If so - fix that. If not, proceed to create some examples based on well-known or well-studied cases to ensure that the above structure and proposal works.

The concepts about Risk and Vulnerability seem somewhat unclear - that is because they have not been added yet. Based on whether they work in the above context - we will either add or refine them.

Regards,
Harsh

On 27/03/2023 12:55, Harshvardhan J. Pandit wrote:
> Hi. The below set of concepts are based on analysis by myself and Georg,
> and what we have been discussing in the group so far.

--
---
Harshvardhan J. Pandit, Ph.D
Assistant Professor
ADAPT Centre, Dublin City University
https://harshp.com/
Received on Friday, 26 May 2023 21:46:32 UTC