- From: Piero Bonatti <pieroandrea.bonatti@unina.it>
- Date: Tue, 18 Jan 2022 17:50:57 +0100
- To: public-dpvcg@w3.org
Hello Rinke,

A few answers from my side:

On 18/01/22 16:12, Hoekstra, Rinke (ELS-AMS) wrote:
> Secondly, I do not understand the choice to model all of the categories
> as classes. What are the intended instances of these classes?

A few examples may clarify:

- The instances of class "Location" are specific points on the Earth's surface, e.g. expressed with coordinates.
- The instances of "MAC address" are the concrete MAC addresses of specific devices.
- The instances of "email address" are the specific email addresses of the data subjects.

And so on.

The intuition is the following:

When a privacy policy says that the data being processed is "location", it says that the actual data crunched by the application may potentially involve any coordinates where the data subject may happen to be.

When a privacy policy says that the data being processed is "MAC address", it says that the actual data crunched by the application may potentially involve the MAC address of any device that the data subject may happen to use.

This is indeed the kind of statement that a privacy policy is expected to contain. It would certainly not say "I'm going to use only the location you are at now" or "just the MAC address of the old notebook you use today and are about to replace soon".

>
> I can see a discussion related to this topic took place at the Nov 2020
> meeting [2], but the outcome seemed to be more around removing
> domain/range restrictions so that the solution around the issue above,
> as proposed by Victor (:wave:) in e.g. [3] gets hidden under the carpet
> (Victor suggested that the range of e.g. dpv:hasProcessing is a blank
> node that is an instance of dpv:Collect). Yes, that’s ugly [4], and I
> agree with Rob’s suggestion here to use SKOS or instances and enumerated
> classes. I think Harsh also supports this in his emails [5].
>
> The arguments against this appear to be around inferencing, but I don’t
> see what inferencing task is served by modeling these categories as classes.

The inferencing task is *compliance checking* (of a privacy policy with respect to the consent of a data subject, or with the GDPR).

With the class-based approach, each policy is simply the class of operations that it authorizes. Policy P complies with policy Q if and only if the class P is contained in the class Q (i.e. every operation authorized by P is also authorized by Q). You can use standard reasoners for compliance checking, and get correctness (no false positives) and completeness (no false negatives) for free, because the semantics of policies is exactly OWL2's direct semantics of classes.

Note that "policy" here means any of:

- the privacy policy of the controller (or its record of processing);
- the consent of the user;
- (a formalization of) the objective fragment of the GDPR.

So, with the above method, you can check compliance of the privacy policy with both consent and the GDPR, and get strong "mathematical" guarantees on the reliability of the method.

By contrast, with the instance-based approaches (including those using blank nodes), any two different graphs are logically unrelated to each other. There is no correspondence between compliance checking and logical inferences over the RDF graphs, even if the graphs are logical theories in disguise. You have to define and justify an ad-hoc algorithm for compliance checking, and argue (how?) that it does the right thing and returns no wrong answers.

Example: Suppose a consent statement authorizes some processing of the data subject's "account identifier"s, while the privacy policy says that the data being processed are "financial account number"s. You would expect the privacy policy to comply with that consent, because "financial account number" is a special case of "account identifier", so the consent "covers" the privacy policy.
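A toy illustration of this containment check, written in Python. It is only a sketch: the small taxonomy, the attribute names `data` and `processing`, and the functions `is_subclass` and `complies` are assumptions made for the example, and the hand-written walk up the hierarchy merely stands in for what an OWL2 reasoner would do on real class expressions.

```python
# Toy taxonomy: each class maps to its direct superclass (None for a root).
# The class names are illustrative and are not taken from the DPV vocabulary.
SUPERCLASS = {
    "PersonalData": None,
    "AccountIdentifier": "PersonalData",
    "FinancialAccountNumber": "AccountIdentifier",
    "Processing": None,
    "Collect": "Processing",
}


def is_subclass(sub, sup):
    """Return True if `sub` equals `sup` or is a descendant of it."""
    while sub is not None:
        if sub == sup:
            return True
        sub = SUPERCLASS[sub]
    return False


def complies(policy, consent):
    """Return True if `policy` is contained in `consent`.

    Each argument maps an attribute to the class of allowed values. The
    policy is contained in the consent when every attribute constrained by
    the consent is constrained by the policy to an equal or narrower class.
    """
    return all(
        attr in policy and is_subclass(policy[attr], cls)
        for attr, cls in consent.items()
    )


consent = {"data": "AccountIdentifier", "processing": "Collect"}
policy = {"data": "FinancialAccountNumber", "processing": "Collect"}

print(complies(policy, consent))  # True: the consent covers the policy
print(complies(consent, policy))  # False: the consent is broader than the policy
```

The check is asymmetric, as containment should be: the narrower policy fits inside the broader consent, but not the other way round.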
This is what you actually get if "account identifier" and "financial account number" are classes. However, if they are instances and the policies are two RDF graphs (i.e. instances themselves), then the two RDF graphs/policies have no logical relationship with each other, and you can't use RDF semantics to tell whether the privacy policy is compliant. You have to re-invent, justify, and validate a compliance checking method from scratch, without any linkage to RDF's semantics (and without any support from it).

>
> For instance, if I look at the Primer [1] (don’t know how up-to-date
> this is), there is an example about AcmeMarketing:
>
> ex:AcmeMarketing a dpv:PersonalDataHandling ;
>     dpv:hasPersonalDataCategory dpv:EmailAddress ;
>     dpv:hasProcessing dpv:Collect, dpv:Use ;
>     dpv:hasPurpose dpv:Marketing ;
>     dpv:hasDataController ex:Acme .

The above is a natural example of a policy that needs both classes and instances. Unfortunately, RDF can't clearly say which is a class and which is an instance.

With the class-based approach, instance-valued properties can be expressed with singleton classes when needed, i.e. one can say that the data controller belongs to the class that contains only Acme (in OWL2 this is expressed with ObjectOneOf(Acme)). This is equivalent to saying that the data controller is precisely Acme. In this way you get full expressiveness, i.e. the advantages of both classes and instances.

At the same time, by using larger classes, you can model joint data controllers, or you can give the same consent to a class of related controllers (these are just two examples of the possible use of general classes as data controller specifications).

If you are interested in a simple and natural JSON syntax that supports both classes and instances, please see:

Piero A. Bonatti, Luigi Sauro, Jonathan Langens: Representing Consent and Policies for Compliance.
EuroS&P Workshops 2021: 283-291

This JSON dialect is just a handy "external" representation for OWL2 classes, and gives you all the expressiveness you need, in a developer-friendly way.

Regards,
Piero
Received on Tuesday, 18 January 2022 16:51:47 UTC