Re: Feedback on DPV Primer's draft from Piero Bonatti on 2022-01-31 (public-dpvcg@w3.org from January 2022)

From: Piero Bonatti <pieroandrea.bonatti@unina.it>
Date: Mon, 31 Jan 2022 18:07:49 +0100
To: "Harshvardhan J. Pandit" <me@harshp.com>
Cc: "public-dpvcg@w3.org" <public-dpvcg@w3.org>, "luigi.sauro@unina.it" <luigi.sauro@unina.it>
Message-ID: <12bdad94-9f80-ba54-c829-255e0d42e02e@unina.it>

Hi Harsh,

Your reply sounds like a help request :-)

As I mentioned to you some time ago, we (Luigi Sauro and I) had already 
planned to provide an OWL specification of DPV. We can easily add the 
profile's specification (5 lines or so - yes, it is this simple!) and 
the guidelines on how to extend DPV with new concepts. All in a week or 
so, as you need.

For this purpose we need two inputs from you:

- when is the actual deadline for the contribution?
- where should we put our contribution? (online or in a document, in 
which format,...)

Best

Piero

On 31/01/22 12:36, Harshvardhan J. Pandit wrote:
> Hi Piero. Thanks for your (also) prompt reply! My comments are inline.
> btw; this email was not sent to the mailing list, but I've added it. I 
> hope you don't mind.
> 
> tldr; We need to sort of agree on a compromise here, otherwise we both 
> could keep discussing this for a long time. I, as the chair, have to 
> make a decision this week or next. I, as the only person who's writing 
> documentation, also need to make the decision to write examples and 
> use-cases. This also needs to be done this week or next. Neither of these 
> is by my choice - no one else has volunteered. So I won't delay these 
> any further since my aim is to get DPV to v1 this year.
> 
> I, personally, don't have the time to do the kind of OWL profile 
> specification and validation/profile-checker that you are indicating. I 
> know TRAPEZE is doing this. So unless you or TRAPEZE colleagues can do 
> this right away, I don't see how it can be done in time.
> 
> On 31/01/2022 10:03, Piero Bonatti wrote:
> 
>>
>> - The use of RDFS/JSON-LD is motivated with the annotation use case. 
>> In this context, I said that - as an alternative to SKOS -  one might 
>> have used a meta-version of DPV properties in order to link a resource 
>> to a DPV class in RDFS/JSON-LD.
>>    For example, using the duplication approach, hasPersonalData would 
>> have a corresponding meta-property hasPersonalDataClass that ranges 
>> over a (meta)class "PersonalDataClasses", that in turn contains (as 
>> instances) PersonalData and its subclasses. You can define such 
>> metaclasses with RDFS and JSON-LD.  Now, for example, given a resource 
>> R, you can annotate it with - say:
>>                "R hasPersonalDataClass Location"
>> to assert that R contains location data.
>>
>> The ranges of such meta-level properties are meta-classes (like 
>> PersonalDataClasses) whose instances are DPV concepts.  This resolves 
>> contradictions and ambiguities by keeping object-level and meta-level 
>> cleanly separate. I don't see that this is the same as using SKOS, as 
>> in SKOS the two layers (object-level and meta-level) are not cleanly 
>> separated, nor precisely related.
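>>
>> As a minimal sketch of this duplication approach in Turtle (the ex: 
>> names are hypothetical placeholders, not existing DPV terms, and the 
>> dpv: namespace IRI is assumed):
>>
>> ```turtle
>> @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix dpv:  <https://w3id.org/dpv#> .
>> @prefix ex:   <https://example.org/> .
>>
>> # Hypothetical metaclass: its instances are PersonalData and its subclasses.
>> ex:PersonalDataClasses a rdfs:Class .
>> dpv:PersonalData a ex:PersonalDataClasses .
>> ex:Location rdfs:subClassOf dpv:PersonalData ;
>>     a ex:PersonalDataClasses .
>>
>> # Hypothetical meta-level counterpart of hasPersonalData.
>> ex:hasPersonalDataClass a rdf:Property ;
>>     rdfs:range ex:PersonalDataClasses .
>>
>> # Annotation: resource R contains location data.
>> ex:R ex:hasPersonalDataClass ex:Location .
>> ```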
> 
> I really think it is 'ugly' to have separate properties like that. I 
> haven't seen this done anywhere else either. This also requires 
> explaining what is a 'class' and an 'instance', and its nuances, and 
> what happens if these are mixed - to the adopter.
> 
> Further, there is a really good chance that what one graph may think of 
> as an instance, another one may want it to be a class. For example, 
> Country as instance in one graph (controller location), and class in 
> another (data storage locations for servers). We can create 
> 'guidelines', but I think we'll be asking a lot from users if they need 
> to re-evaluate their entire data model every time a concept is 
> added/changes.
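> 
> As a rough sketch of that clash in Turtle (all ex: names are 
> hypothetical and only for illustration):
> 
> ```turtle
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix ex:   <https://example.org/> .
> 
> # Graph 1: a country used as an instance (controller location).
> ex:Ireland a ex:Country .
> ex:SomeController ex:hasLocation ex:Ireland .
> 
> # Graph 2: the same country used as a class (server storage locations).
> ex:Ireland rdfs:subClassOf ex:Country .
> ex:DublinServer a ex:Ireland .
> ```
> 
> Merging the two graphs makes ex:Ireland both an instance and a 
> subclass of ex:Country.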
> 
>>
>> - The "vague" semantics of SKOS affects interoperability not only with 
>> respect to the use cases that involve reasoning, but with respect to 
>> all use cases.  What is vague remains vague, no matter how it is 
>> used.  The example of reasoners treating the same SKOS graph in 
>> different ways can be easily generalized to other applications.
> 
> With OWL and SKOS, we get the same interoperability in case of DPV. Both 
> are sufficient to express whether a concept is "part of" another concept 
> (as a set). That's all that is needed at the moment IMHO.
> 
> This is considering (and I posit) that not all use-cases would even 
> involve reasoning beyond this. For example, when I work with ROPA 
> (register of processing activities) or Consent requests, I need a 
> vocabulary to represent the terms, and they may not end up being 
> semantic-web in the end. The only inference/reasoning involved is as a 
> hierarchy.
> 
> The spectrum for DPV's possible application ranges from annotate/tag to 
> doing logic-based compliance checking (e.g. DAPRECO KG). It is 
> impossible to support all of it. Instead, I'm trying to make it as 
> simple as possible to reuse/convert DPV to support any of these.
> 
>>
>> - I have to insist that OWL2-PL's knowledge bases/graphs are indeed 
>> simpler than SKOS graphs because the former contain statements like:
>>      * term A is a subclass of term B
>>      * term A is an instance of term B
>>      * the domain/range of property P is term A
>>      * classes A and B have no instances in common.
>> Moreover, a user who wants to add a new personal data category or a 
>> new purpose needs almost only the subclass statement (if TRAPEZE's 
>> guidelines are followed).  Understanding and extending SKOS assertions 
>> needs more work.
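>>
>> For illustration, the four statement kinds could look like this in 
>> Turtle. This is only a sketch with hypothetical ex: names, and it has 
>> not been checked against the actual OWL2-PL profile definition:
>>
>> ```turtle
>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>> @prefix owl:  <http://www.w3.org/2002/07/owl#> .
>> @prefix dpv:  <https://w3id.org/dpv#> .
>> @prefix ex:   <https://example.org/> .
>>
>> # term A is a subclass of term B
>> ex:Email rdfs:subClassOf dpv:PersonalData .
>>
>> # term A is an instance of term B
>> ex:myAddress a ex:Email .
>>
>> # the domain/range of property P is term A
>> ex:hasEmailData a owl:ObjectProperty ;
>>     rdfs:range ex:Email .
>>
>> # classes A and B have no instances in common
>> ex:Email owl:disjointWith ex:PostalAddress .
>> ```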
> 
> I have to disagree with this. SKOS assertions are simpler than OWL or 
> TRAPEZE ones. With SKOS, we have:
> 
> term T(op level concept) is an instance of skos:ConceptScheme and 
> owl:Class .
> term A is an instance of term T
> term A is broader/narrower than some other concept (if needed)
> domain/range of property P is term T
> 
> This allows for things like:
> :MyEmail a dpv:PersonalData .
> :MyEmailUsagePatterns a dpv:PersonalData ; skos:broader :MyEmail .
> :X dpv:hasPersonalData :MyEmail, :MyEmailUsagePatterns .
> 
> If someone wants to extend this, they just need the broader/narrower 
> relations.
> 
> :MyEmailUsagePatternsOnSaturday a dpv:PersonalData ;
>      skos:broader :MyEmailUsagePatterns .
> 
> In OWL, one has to do it like this:
> :MyEmail rdfs:subClassOf dpv:PersonalData ;
>      a dpv:PersonalData .
> :MyEmailUsagePatterns a :MyEmail .
> :X dpv:hasPersonalData :MyEmail, :MyEmailUsagePatterns .
> 
> Then to extend it, one needs to assert MyEmailUsagePatterns as a class now.
> :MyEmailUsagePatterns rdfs:subClassOf :MyEmail .
> :MyEmailUsagePatternsOnSaturday a :MyEmailUsagePatterns .
> 
> So the OWL one constantly asks someone to evaluate between 
> classes/instances whereas the SKOS one doesn't. One could say use only 
> sub-classes and no instances, but then why pretend this is different 
> from the SKOS model? They are both doing the same, expressing 
> parent-child style relationships.
> 
> Additionally with OWL, you have to explain when to use subclasses or why 
> not to use instances. Because one could 'not follow' the guidelines. 
> Whereas with SKOS, all you need to say is: find a concept in the 
> hierarchy closest to the one you have, extend it using skos:broader. 
> That's it. Nothing else to think about. And there's no other way to do 
> this which makes not following "guidelines" more difficult.
> 
> And with SKOS, we get better 'vocabulary' management because I can 
> create ad-hoc taxonomies that work with both SKOS and OWL, and that maps 
> really simply with anything else that requires a taxonomy. For example, 
> creating 'Data Transfer Legal Basis' as a separate taxonomy is as 
> trivial as declaring a concept scheme and throwing all related legal 
> bases under it. With OWL, one has to craft sub-classes for new concepts, 
> and re-arrange the entire legal bases taxonomy when one concept is 
> added. This means DPV will fluctuate a lot between versions. Not 
> desirable IMO.
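> 
> A minimal sketch of such a separate taxonomy, assuming standard SKOS 
> terms (skos:inScheme) and hypothetical ex: concept names that are not 
> taken from DPV:
> 
> ```turtle
> @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
> @prefix ex:   <https://example.org/> .
> 
> # Declare the taxonomy as a concept scheme.
> ex:DataTransferLegalBasis a skos:ConceptScheme .
> 
> # Put the related legal bases under it.
> ex:AdequacyDecision a skos:Concept ;
>     skos:inScheme ex:DataTransferLegalBasis .
> ex:StandardContractualClauses a skos:Concept ;
>     skos:inScheme ex:DataTransferLegalBasis .
> ```
> 
> Adding another legal basis later is one more skos:inScheme statement, 
> and the existing concepts stay untouched.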
> 
>>
>> - Profiles are well-defined and checked syntactically, so your 
>> statement "the moment we say follow OWL2 - then the adopter is free to 
>> use any and all of OWL2 semantics" is ungrounded.  Only the assertions 
>> supported by the profile will be accepted, the others shall be treated 
>> like syntax errors. [please note that this is a fully standard 
>> approach: the OWL APIs themselves support profile definition and 
>> parsing].
> 
> Okay. First, this necessitates creating a separate profile. Who will 
> create these profile specifications and linters/checkers for their 
> expression? Then these need to be kept up to date as things change. 
> There is documentation to be created, use-cases to be written. It is a 
> lot of work!
> 
> Right now in DPVCG we don't even have a lot of involvement from people 
> for discussing and refining the concepts. It's just me working on a lot 
> of stuff, and this is after-hours work, not even as part of my regular 
> job. So I consider working on more specifications and profile checkers 
> as nice to have, but not a priority if there are no person months 
> available to get them done.
> 
>>
>> - In the light of the above point, TRAPEZE's guidelines are actually 
>> and effectively going to remove all complications.
> 
> Maybe for TRAPEZE use-cases. But I don't think they work with other 
> use-cases where the same requirements are not present. As in SPECIAL, 
> TRAPEZE has a strict set of aims for what it wants to do with DPV. But 
> there are others who also want to use DPV, and I re-iterate that this 
> does not always involve the kind of profile checking you do within 
> SPECIAL/TRAPEZE.
> 
>>
>> I am so confident about the greater usability of TRAPEZE's framework 
>> that I'm going to run user studies to prove it scientifically.  So I 
>> am looking forward to a stable SKOS proposal in order to have 
>> something to compare TRAPEZE's approach with.
> 
> I don't disagree with you about the usefulness of work in TRAPEZE. I'm 
> merely re-iterating that this is not the only possible use for DPV. And 
> that basing DPV's design only on how beneficial it is in profile 
> expression or checking (a la TRAPEZE) affects other uses.
> 
> Here are two of my publications (there are other works using DPV, but 
> these reflect personal experiences):
> 
> 1) "ODRL Profile for Expressing Consent through Granular Access Control 
> Policies in Solid" 
> https://harshp.com/research/publications/048-odrl-profile-consent-solid-acp
> 
> 2) "A Common Semantic Model of the GDPR Register of Processing 
> Activities" 
> https://harshp.com/research/publications/037-common-semantic-model-GDPR-ROPA 
> 
> 
> Neither requires 'profiles' or 'OWL2' the way TRAPEZE does. #1 needs 
> compatibility with ODRL, and #2 needs assertions about property 
> domain/ranges. But the current design of DPV as being this abstract 
> amalgamation of concepts and properties with no actual usage guidelines 
> affects both of these and introduces a *lot* of considerations that are 
> not related to the work, but instead towards figuring out what it means 
> to have DPV semantics used in these applications. This is frustrating.
> 
> If we pretend DPV was not a semantic-web vocabulary, but a list of 
> concepts (a taxonomy), then the current OWL2 design would not be the 
> right choice for either of these.
> 
>>
>> I *strongly* agree with you about the importance of adoption. We only 
>> disagree (partially) on what may foster or hinder adoption.  While we 
>> both agree that some kind of rdf links may help, we disagree on how 
>> this can be optimally implemented.
> 
> Yes. I think it all comes down to how to get most of this done with the 
> limited time people have. A lot of what you are suggesting requires 
> people who are experts in i) OWL ii) reasoning iii) writing 
> specifications and profile checkers. We don't have those. So unless you 
> or colleagues are willing to expend time in doing these, I don't see how 
> these can be done.
> 
> I would really like to have a 'formal' and 'fully formed' specification 
> like you mean in terms of OWL profiles. But the area of its application 
> is so vast, and the amount of time people are willing to spend on this 
> so little, that I don't think it's feasible to have it done with all the 
> documentation and examples and real-world concepts that we plan to have 
> in the next couple of months.
> 
> Regards,
Received on Monday, 31 January 2022 17:08:22 UTC