- From: Piero Bonatti <pieroandrea.bonatti@unina.it>
- Date: Mon, 27 Apr 2020 17:38:33 +0200
- To: public-dpvcg@w3.org
Dear all,

during the last call I was asked to resume the discussion on the best way of encoding consent and data requests. Below you can find a list of 4 possible approaches, with some pros and cons, discussed with the goal of automated compliance checking in mind.

Please comment on the alternatives, share your preferences, and point out possible drawbacks (including non-technical aspects, which I deliberately leave out of the list). And, of course, feel free to suggest your own approach. For your convenience, I have also included the examples from the previous messages.

Best regards
Piero

PS: my personal preference so far is for approach 3 below, which in my opinion is the most uniform and clean of all four.

--------------------------------

APPROACH 1

This is the approach circulated in previous messages. The specific example consists of a consent and a data request, encoded in RDFS as follows:

    ex:consentPatient1 a dpv:Consent ;
        dpv:hasDataSubject ex:patient1 ;
        dpv:hasPurpose [a dpv:AcademicResearch] ;
        dpv:hasProcessing [a dpv:Collect] ;
        dcterms:title "Consent for Health data analysis in a clinical study ..." ;
        dpv:hasDataController ex:hospital1 ;
        dpv:hasRecipient ex:physiotherapist1 ;
        dpv:hasPersonalDataCategory [a dpv:PhysicalHealth] .

    ex:dataRequest a dpv:PersonalDataHandling ;
        dpv:hasDataSubject ex:patient1 ;
        dpv:hasPurpose [a dpv:AcademicResearch] ;
        dpv:hasProcessing [a dpv:Collect] ;
        dpv:hasLegalBasis [a dpv:Consent] ;
        dpv:hasDataController ex:hospital1 ;
        dpv:hasRecipient ex:physician3 ;
        dpv:hasPersonalDataCategory [a dpv:PhysicalHealth] ;
        dcterms:title "Personal Data Collection for clinical study ..." .

The main drawback of this approach is that ex:consentPatient1 says (in English) that ex:patient1 consents to some processing, for some purpose, over some data category, all of which are unspecified because they are expressed with blank nodes.
Consequently, consent and data request are logically unrelated, because the blank nodes in the consent and those in the data request may denote different individuals. Thus compliance checking cannot be reduced to any form of logical reasoning between the two graphs. In order to check compliance, one needs an ad-hoc notion of matching (whose correctness and completeness must be justified from scratch). It is not clear whether the ad-hoc matching algorithm can be implemented on top of the standard reasoning tools.

The above problem can be solved by making consent a *class* of objects; then compliance can be reduced to checking whether the data request is contained in the consent, which can be reduced to standard reasoning tasks, see below.

APPROACHES 2, 3

In these two approaches, consent is an OWL2 class. Among the standard alternative syntaxes of OWL2, Manchester syntax is probably the simplest. In Manchester syntax a consent class would look like this:

    (hasDataSubject some {ex:patient1})
    and (hasPurpose some AcademicResearch)
    and (hasPersonalDataCategory some PhysicalHealth)
    and (hasProcessing some Collect)
    and (hasRecipient some {ex:physician3})
    ...

The above expression covers the class of *all* processing activities of type Collect (no matter how data is concretely collected), on some physical health data (it may involve blood pressure, heartbeat frequency, etc.), for the purpose of some kind of academic research (be it medical, biological, ...), whose results are shared with ex:physician3. This is exactly what a direct translation into English would say.

Manchester syntax is general enough to cover all OWL constructs; for compliance checking a more streamlined JSON-like syntax may be enough, e.g.:

    {
      hasDataSubject: {ex:patient1}
      hasPurpose: AcademicResearch
      hasPersonalDataCategory: PhysicalHealth
      hasRecipient: ...
    }

Such a syntax only needs a well-specified mapping into OWL2 that gives it a formal semantics and a logical meaning.
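To make the idea of such a mapping concrete, here is a minimal sketch in Python. It is only one possible mapping, not a proposal agreed by the group: the function name to_manchester and the convention that a Python set stands for a nominal (an enumeration of individuals) are assumptions made for illustration. Each property/filler pair is read as an existential ("some") restriction, and the pairs are joined by "and".

```python
# Hypothetical helper, not part of DPV or of any OWL library.
# Assumption: every key/value pair means "<property> some <filler>",
# and a Python set of names denotes a nominal such as {ex:patient1}.

def to_manchester(policy: dict) -> str:
    """Render a JSON-like policy as a Manchester syntax class expression."""
    conjuncts = []
    for prop, filler in policy.items():
        if isinstance(filler, (set, frozenset)):
            # Enumerated individuals: {a, b, ...}; sorted for stable output
            filler_text = "{" + ", ".join(sorted(filler)) + "}"
        else:
            # A plain string is taken to be a class name
            filler_text = filler
        conjuncts.append(f"({prop} some {filler_text})")
    return " and ".join(conjuncts)


consent = {
    "hasDataSubject": {"ex:patient1"},
    "hasPurpose": "AcademicResearch",
    "hasPersonalDataCategory": "PhysicalHealth",
    "hasProcessing": "Collect",
    "hasRecipient": {"ex:physician3"},
}

print(to_manchester(consent))
```

The printed expression has the same shape as the Manchester syntax example above, so an OWL2 reasoner could in principle consume it once prefixes and an ontology header are supplied.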
Now approaches 2 and 3 differ in the representation of data requests.

In APPROACH 2, data requests are still expressed as RDFS nodes (as in APPROACH 1). Then compliance checking can be reduced to instance checking (i.e. checking whether the data request is an instance of the consent class).

In APPROACH 3, data requests are expressed as classes, with the same syntax as consent. In this case, compliance checking can be reduced to subsumption (i.e. checking whether the data request class is contained in the consent class).

APPROACH 4

A class may also be expressed as a SPARQL query (the answer is the class). Data requests are as in approaches 1 and 2. The above consent could be expressed as a SPARQL query selecting all objects with hasDataSubject=ex:patient1, hasPurpose in AcademicResearch, etc. ex:dataRequest is compliant iff it belongs to the query answer.

My personal feeling is that expressing consent via a SPARQL query introduces lots of irrelevant stuff and is too operational.
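As an illustration of the subsumption test in approach 3, the following Python sketch checks containment for the restricted fragment used in the examples only: conjunctions of "some" restrictions whose fillers are either named classes or enumerations of individuals. It is not a replacement for an OWL2 reasoner. The class hierarchy and the names ex:MedicalResearch and ex:BloodPressure are hypothetical and serve only to make the example runnable; the functions is_subclass, filler_subsumed and request_complies are likewise invented here.

```python
# Toy class hierarchy (child -> parent). Both entries are hypothetical
# and are not taken from DPV.
SUBCLASS_OF = {
    "ex:MedicalResearch": "AcademicResearch",
    "ex:BloodPressure": "PhysicalHealth",
}


def is_subclass(sub, sup):
    """True if sub equals sup or reaches it by following SUBCLASS_OF."""
    while sub is not None:
        if sub == sup:
            return True
        sub = SUBCLASS_OF.get(sub)
    return False


def filler_subsumed(req, cons):
    """Is the request filler contained in the consent filler?"""
    if isinstance(cons, (set, frozenset)):
        # Nominals: the request may only name individuals the consent names.
        # Empty enumerations are rejected, as a conservative choice.
        return isinstance(req, (set, frozenset)) and bool(req) and req <= cons
    if isinstance(req, (set, frozenset)):
        # Individuals against a class would need class membership facts,
        # which this sketch does not model.
        return False
    return is_subclass(req, cons)


def request_complies(request, consent):
    """Every consent restriction must be matched by a narrower request one."""
    return all(
        prop in request and filler_subsumed(request[prop], consent[prop])
        for prop in consent
    )


consent = {
    "hasDataSubject": {"ex:patient1"},
    "hasPurpose": "AcademicResearch",
    "hasPersonalDataCategory": "PhysicalHealth",
    "hasProcessing": "Collect",
    "hasRecipient": {"ex:physician3"},
}

# A narrower request: same subject and recipient, more specific purpose/data.
request = dict(consent, hasPurpose="ex:MedicalResearch",
               hasPersonalDataCategory="ex:BloodPressure")

# A request with a recipient the consent does not cover.
other_request = dict(consent, hasRecipient={"ex:physiotherapist1"})

print(request_complies(request, consent))
print(request_complies(other_request, consent))
```

Under these assumptions the first request is contained in the consent and the second is not, because its recipient differs. Extra restrictions in the request (for instance a legal basis) only make the request class narrower, so they do not affect the result.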
Received on Monday, 27 April 2020 15:42:47 UTC