- From: Harshvardhan J. Pandit <me@harshp.com>
- Date: Mon, 31 Jan 2022 11:36:24 +0000
- To: Piero Bonatti <pieroandrea.bonatti@unina.it>
- Cc: "public-dpvcg@w3.org" <public-dpvcg@w3.org>
Hi Piero.

Thanks for your (also) prompt reply! My comments are inline.
btw, this email was not sent to the mailing list, but I've added it. I hope you don't mind.

tl;dr: We need to agree on some sort of compromise here, otherwise we could both keep discussing this for a long time.

I, as the chair, have to make a decision this week or next. I, as the only person who's writing documentation, also need to make the decision to write examples and use-cases. This also needs to be done this week or next. Neither of these is by my choice - no one else has volunteered. So I won't delay these any further, since my aim is to get DPV to v1 this year.

I, personally, don't have the time to do the kind of OWL profile specification and validation/profile-checker that you are indicating. I know TRAPEZE is doing this. So unless you or TRAPEZE colleagues can do this right away, I don't see how it can be done in time.

On 31/01/2022 10:03, Piero Bonatti wrote:
>
> - The use of RDFS/JSON-LD is motivated with the annotation use case. In
> this context, I said that - as an alternative to SKOS - one might have
> used a meta-version of DPV properties in order to link a resource to a
> DPV class in RDFS/JSON-LD.
> For example, using the duplication approach, hasPersonalData would
> have a corresponding meta-property hasPersonalDataClass that ranges over
> a (meta)class "PersonalDataClasses", that in turn contains (as
> instances) PersonalData and its subclasses. You can define such
> metaclasses with RDFS and JSON-LD. Now, for example, given a resource
> R, you can annotate it with - say:
> "R hasPersonaDataClass Location"
> to assert that R contains location data.
>
> The range of such meta-level properties are meta-classes (like
> PersonalDataClasses) whose instances are DPV concepts. This resolves
> contradiction and ambiguities by keeping object-level and meta-level
> cleanly separate. I don't seem that this is the same as using SKOS, as
> in SKOS the two layers (object-level and meta-level) are not cleanly
> separated, nor precisely related.

I really think it is 'ugly' to have separate properties like that. I haven't seen this done anywhere else either. It also requires explaining to the adopter what a 'class' and an 'instance' are, their nuances, and what happens if the two are mixed.

Further, there is a really good chance that what one graph treats as an instance, another will want as a class. For example, Country as an instance in one graph (controller location), and as a class in another (data storage locations for servers). We can create 'guidelines', but I think we'll be asking a lot from users if they need to re-evaluate their entire data model every time a concept is added or changed.

>
> - The "vague" semantics of SKOS affects interoperability not only with
> respect to the use cases that involve reasoning, but with respect to all
> use cases. What is vague remains vague, no matter how it is used. The
> example of reasoners treating a same SKOS graph in different ways can be
> easily generalized to other applications.

With OWL and SKOS, we get the same interoperability in the case of DPV. Both are sufficient to express whether a concept is "part of" another concept (as a set). That's all that is needed at the moment IMHO.

This is considering (and I posit) that not all use-cases would even involve reasoning beyond this. For example, when I work with ROPA (Register of Processing Activities) or consent requests, I need a vocabulary to represent the terms, and they may not end up being semantic-web in the end. The only inference/reasoning involved is over the hierarchy.

The spectrum of DPV's possible applications ranges from annotating/tagging to logic-based compliance checking (e.g. DAPRECO KG). It is impossible to support all of it. Instead, I'm trying to make it as simple as possible to reuse/convert DPV to support any of these.
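
As a rough illustration of the hierarchy-only reasoning described above, here is a minimal Python sketch. It is not part of DPV or of any tooling mentioned in this thread: the `BROADER` table, the concept names, and the functions `ancestors` and `is_part_of` are all hypothetical. It also assumes the parent links are followed transitively, which SKOS does not assert for skos:broader itself (skos:broaderTransitive exists for that purpose).

```python
from collections import deque

# Hypothetical toy data: each concept maps to its direct parents, i.e. its
# skos:broader targets (or rdfs:subClassOf targets in an OWL-style reading).
BROADER = {
    "MyEmailUsagePatternsOnSaturday": {"MyEmailUsagePatterns"},
    "MyEmailUsagePatterns": {"MyEmail"},
    "MyEmail": set(),
}


def ancestors(concept, broader):
    """Return every concept reachable from `concept` by following parent links."""
    seen = set()
    queue = deque(broader.get(concept, ()))
    while queue:
        parent = queue.popleft()
        if parent in seen:
            continue  # skip repeated parents and guard against cycles
        seen.add(parent)
        queue.extend(broader.get(parent, ()))
    return seen


def is_part_of(child, parent, broader):
    """True if `child` is `parent` itself or sits anywhere below it."""
    return child == parent or parent in ancestors(child, broader)
```

Under these assumptions, `is_part_of("MyEmailUsagePatternsOnSaturday", "MyEmail", BROADER)` holds, and the same lookup works whether the parent links came from a SKOS or an OWL serialisation.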

>
> - I have to insist that OWL2-PL's knowledge bases/graphs are indeed
> simpler than SKOS graphs because the former contain statements like:
> * term A is a subclass of term B
> * term A is an instance of term B
> * the domain/range of property P is term A
> * classes A and B have no instances in common.
> Moreover, a user who wants to add a new personal data category or a new
> purpose needs almost only the subclass statement (if TRAPEZE's
> guidelines are followed). Understanding and extending SKOS assertions
> needs more work.

I have to disagree with this. SKOS assertions are simpler than OWL or TRAPEZE ones. With SKOS, we have:

    term T (top-level concept) is an instance of skos:ConceptScheme and owl:Class
    term A is an instance of term T
    term A is broader/narrower than some other concept (if needed)
    domain/range of property P is term T

This allows for things like:

    :MyEmail a dpv:PersonalData .
    :MyEmailUsagePatterns a dpv:PersonalData ;
        skos:broader :MyEmail .
    :X dpv:hasPersonalData :MyEmail, :MyEmailUsagePatterns .

If someone wants to extend this, they just need the broader/narrower relations.

    :MyEmailUsagePatternsOnSaturday a dpv:PersonalData ;
        skos:broader :MyEmailUsagePatterns .

In OWL, one has to do it like this:

    :MyEmail rdfs:subClassOf dpv:PersonalData ;
        a dpv:PersonalData .
    :MyEmailUsagePatterns a :MyEmail .
    :X dpv:hasPersonalData :MyEmail, :MyEmailUsagePatterns .

Then to extend it, one needs to assert MyEmailUsagePatterns as a class now.

    :MyEmailUsagePatterns rdfs:subClassOf :MyEmail .
    :MyEmailUsagePatternsOnSaturday a :MyEmailUsagePatterns .

So the OWL one constantly asks someone to choose between classes and instances, whereas the SKOS one doesn't. One could say use only sub-classes and no instances, but then why pretend this is different from the SKOS model? They are both doing the same thing: expressing parent-child style relationships.

Additionally, with OWL you have to explain when to use subclasses or why not to use instances, because one could 'not follow' the guidelines. Whereas with SKOS, all you need to say is: find the concept in the hierarchy closest to the one you have, and extend it using skos:broader. That's it. Nothing else to think about. And there's no other way to do it, which makes it harder to not follow the "guidelines".

And with SKOS, we get better 'vocabulary' management, because I can create ad-hoc taxonomies that work with both SKOS and OWL, and that map really simply to anything else that requires a taxonomy. For example, creating 'Data Transfer Legal Basis' as a separate taxonomy is as trivial as declaring a concept scheme and throwing all related legal bases under it. With OWL, one has to craft sub-classes for new concepts, and re-arrange the entire legal bases taxonomy when one concept is added. This means DPV will fluctuate a lot between versions. Not desirable IMO.

>
> - Profiles are well-defined and checked syntactically, so your statement
> "the moment we say follow OWL2 - then the adopter is free to use any and
> all of OWL2 semantics" is ungrounded. Only the assertions supported by
> the profile will be accepted, the others shall be treated like syntax
> errors. [please note that this is a fully standard approach: the OWL
> APIs themselves support profile definition and parsing].

Okay. First, this necessitates creating a separate profile. Who will create these profile specifications and the linters/checkers for their expression? Then these need to be kept up to date as things change. There is documentation to be created, use-cases to be written. It is a lot of work!

Right now in DPVCG we don't even have a lot of involvement from people for discussing and refining the concepts. It's just me working on a lot of stuff, and this is after-hours work, not even part of my regular job. So I consider working on more specifications and profile checkers as nice to have, but not a priority if there are no person-months available to get them done.

>
> - In the light of the above point, TRAPEZE's guidelines are actually and
> effectively going to remove all complications.

Maybe for TRAPEZE use-cases. But I don't think they work with other use-cases where the same requirements are not present. As in SPECIAL, TRAPEZE has a strict set of aims for what it wants to do with DPV. But there are others who also want to use DPV, and I re-iterate that this does not always involve the kind of profile checking you do within SPECIAL/TRAPEZE.

>
> I am so confident about the greater usability of TRAPEZE's framework
> that I'm going to run user studies to prove it scientifically. So I am
> looking forward to a stable SKOS proposal in order to have something to
> compare TRAPEZE's approach with.

I don't disagree with you about the usefulness of the work in TRAPEZE. I'm merely re-iterating that this is not the only possible use for DPV, and that basing DPV's design only on how beneficial it is for profile expression or checking (a la TRAPEZE) affects other uses.

Here are two of my publications (there are other works using DPV, but these reflect personal experiences):

1) "ODRL Profile for Expressing Consent through Granular Access Control Policies in Solid"
https://harshp.com/research/publications/048-odrl-profile-consent-solid-acp
2) "A Common Semantic Model of the GDPR Register of Processing Activities"
https://harshp.com/research/publications/037-common-semantic-model-GDPR-ROPA

Neither requires 'profiles' or 'OWL2' the way TRAPEZE does. #1 needs compatibility with ODRL, and #2 needs assertions about property domains/ranges. But the current design of DPV, as an abstract amalgamation of concepts and properties with no actual usage guidelines, affects both of these and introduces a *lot* of considerations that are not related to the work itself, but instead go towards figuring out what it means to have DPV semantics used in these applications. This is frustrating.
If we pretend DPV was not a semantic-web vocabulary, but a list of concepts (a taxonomy), then the current OWL2 design would not be the right choice for either of these.

>
> I *strongly* agree with you about the importance of adoption. We only
> disagree (partially) on what may foster or hinder adoption. While we
> both agree that some kind of rdf links may help, we disagree on how this
> can be optimally implemented.

Yes. I think it all comes down to how to get most of this done with the limited time people have. A lot of what you are suggesting requires people who are experts in i) OWL, ii) reasoning, and iii) writing specifications and profile checkers. We don't have those. So unless you or your colleagues are willing to spend time on these, I don't see how they can be done.

I would really like to have a 'formal' and 'fully formed' specification of the kind you mean in terms of OWL profiles. But the area of its application is so vast, and the amount of time people are willing to spend on this so little, that I don't think it's feasible to have it done along with all the documentation, examples, and real-world concepts that we plan to have in the next couple of months.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/
Received on Monday, 31 January 2022 11:36:42 UTC