- From: Harshvardhan J. Pandit <me@harshp.com>
- Date: Sun, 30 Jan 2022 20:02:00 +0000
- To: "public-dpvcg@w3.org" <public-dpvcg@w3.org>
Forgot to send the below email to the mailing list address.

-------- Forwarded Message --------
Subject: Re: Feedback on DPV Primer's draft
Date: Sun, 30 Jan 2022 20:01:00 +0000
From: Harshvardhan J. Pandit <me@harshp.com>
To: Piero Bonatti <pieroandrea.bonatti@unina.it>

Hi Piero, All.

I understand and agree with your (Piero's) points. However, there are a few things to clarify wrt what DPV should mean. My replies are inline. I may have missed or misinterpreted some things. If so, please let me know.

tl;dr: I think your argument (which I agree with) is based on an adopter using DPV only as an OWL2 vocabulary. I think this presents a huge barrier to having someone 'understand' and 'use' DPV, as most real-world uses are not semantic-web aware. It also puts a lot of burden on providing good documentation and examples, which are completely lacking at the moment because most active participants are not semantic-web people.

Hence I'm trying to 'simplify' understanding of DPV concepts and how to use them by proposing use of SKOS instead of OWL as the default iteration. This is based on several meetings/calls with people who were interested in DPV but had trouble 'following RDF & OWL2' usage.

On 30/01/2022 17:47, Piero Bonatti wrote:
> On 28/01/22 10:37, Harshvardhan J. Pandit wrote:
>> I agree with your argument regarding the class/instance modelling.
>> This is also one of the reasons why I suggested SKOS would be a better
>> fit for most use-cases, since it makes it possible to always have
>> further expansions. With OWL, there is no possibility to define a
>> purpose and to later expand it like you describe. Which puts the onus
>> on the modeller to be sure about their concepts or risk changing
>> models with time.
>
>
> Such onus can be removed.
> Actually, in TRAPEZE, the guidelines for
> extending DPV (if necessary) are very simple: always add a new class,
> unless the new term being added represents a specific organization (eg
> one controller, one recipient), a specific location (i.e. a GPS
> coordinate), or other single data values (eg one specific email address,
> or one specific ID - out of the many a person may have). In case of
> doubt, make the new term a class.
>
> This rule of thumb is going to avoid most semantic issues and make the
> modeller's life very easy; moreover, it yields cleaner and more uniform
> extensions of the base vocabularies, therefore we believe that it is
> also going to improve interoperability.
>
> Thus, in my opinion, it would be advisable to give the same suggestions
> in the primer.

I agree with this suggestion for when DPV is used as an OWL2 vocabulary. However, there is still the issue of property domain/range assertions. Even with punning, we get weird semantics, such as:

    dpv:hasPersonalData rdfs:range dpv:PersonalData .
    :PDH dpv:hasPersonalData dpv:Location . # class -> instance
    :MyGPS a dpv:Location .
    :PDH dpv:hasPersonalData :MyGPS . # instance

While this is perfectly fine with punning, it makes it necessary for someone using DPV to understand the mechanics of OWL2 - which is a big ask IMO!

>
> Second, we should not forget that we are now considering SKOS and its
> complications not because it is "ontologically" important to mix classes
> and instances. The natural semantics of DPV concepts is clearly that of
> a class. We are considering SKOS because some applications want to use
> RDF no matter what, and in this rather unexpressive language, property
> values can only be instances. It is not about the meaning of terms, or
> knowledge representation, it is only about circumventing RDF's limitations.
Actually (IMHO), we're considering SKOS because it's the closest 'simple model' that someone who doesn't want RDF can still use and get something inherently intuitive when using e.g. JSON-LD. If we follow the SKOS patterns, they are simpler to grasp and easier to implement compared to the complexities possible with OWL2. Either that, or the solution would have to be another language created solely to express the required interpretation, which is what you've done in TRAPEZE.

I am hoping that using the SKOS model permits DPV to be used much like what schema.org has done for semantics, i.e. encourage usage without making it necessary to first read about RDF (or OWL), while still keeping such usage (roughly) compatible. That's actually my personal summary of DPV: to be the schema.org for data protection / privacy information.

If DPV were to be (only) used as a policy language or within semantic reasoners, then I agree that the OWL2 semantics would have been much better for enforcing a strict(-er) interpretation. However, DPV has more applications beyond the semantic web, e.g. as a vocabulary that can be used to annotate all sorts of things (text, policies, software), and as a simple language for interoperable communications (e.g. consent requests or ROPAs). And yes, this can still be done with OWL2, but this creates a very steep adoption curve.

>
> One could have solved this problem simply by duplicating DPV's
> properties, giving an "object level" version usable in OWL2 policies,
> and a corresponding "metalevel version" usable in RDF policies, so as
> to avoid the paradoxes discussed time ago by giving different ranges to
> the two versions. With this approach, it would also be possible to
> define a clean and coherent formal semantics for all policies (OWL and
> RDF).

This is indeed the proposal, i.e. to have the SKOS and OWL versions under separate namespaces so one has to explicitly choose the OWL2 semantics in their data.
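To make the "intuitive JSON-LD" point above a little more concrete, here is a rough Python sketch of a record that uses DPV terms as plain JSON keys and values, with a tiny helper that expands compact IRIs. The namespace IRI, the record shape, the `urn:example:pdh-1` identifier and the helper names `expand` and `personal_data` are all assumptions made for illustration; nothing here is prescribed by DPV, and a real deployment would more likely use a JSON-LD library.

```python
import json

# Assumption: the DPV namespace IRI. The record shape below is illustrative only.
DPV = "https://w3id.org/dpv#"

record = {
    "@context": {"dpv": DPV},
    "@id": "urn:example:pdh-1",  # hypothetical identifier
    "dpv:hasPersonalData": [{"@id": "dpv:Location"}],
}


def expand(term: str, context: dict) -> str:
    """Expand a compact IRI such as 'dpv:Location' using the prefix map."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term  # unknown prefix or already a full IRI: leave untouched


def personal_data(rec: dict) -> list:
    """List the expanded IRIs of the personal data categories in a record."""
    ctx = rec.get("@context", {})
    return [expand(item["@id"], ctx) for item in rec.get("dpv:hasPersonalData", [])]


print(json.dumps(personal_data(record)))
```

The consumer only needs to know that `dpv:Location` is a concept identifier; whether it is a class or an instance never comes up.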
>
> SKOS avoids such duplication, but the price to be payed is that the
> semantic issues related to the confusion between classes and instances
> are still under the carpet. Syntactically, instances can be refined
> using "narrower", "broader", and related match relations, but this is
> possible only because these relations have no formal meaning (they can
> be any relations). The downside is that it is not clear what policies
> mean (a reliability and interoperability issue), and it is impossible to
> prove that the compliance checking algorithms return no false positives
> or negatives.

Yes, this is again by intention (mine). Not all possible uses of DPV may need such strict 'policy' like interpretations. What DPV gains when using SKOS is simplicity and interoperability (e.g. between concepts across two data graphs). What it loses is semantics (class vs instance) and easy access to reasoning.

>
> In particular, concerning interoperability, we should not forget that
> all compliant OWL2 reasoners must treat a given OWL2 ontology in the
> same way, while each application may treat a SKOS ontology more or less
> as it pleases (because "narrow" etc. do not have a semantics).

Yes, this is precisely why SKOS is a better option than OWL for most use-cases, unless one *knows they want OWL2*. I'm optimising for maximising adoption of DPV rather than semantic web reasoning here ;-)

>
> One should also consider the additional burden in mastering SKOS (due to
> its additional meta-concepts, and its many different but partially
> related relations...). Compared with SKOS, the OWL2 profile adopted by
> TRAPEZE (OWL2-PL) is much simpler, with only 2 kinds of relations
> (SubclassOf and instanceOf) and no boolean operators nor quantifiers.

I disagree that OWL2 or TRAPEZE's profile is 'simpler' than SKOS. Both (OWL2 ones) have a lot of complexity hidden away behind the possibility to use all sorts of complex OWL2 stuff.
Even if we have 'guidelines', the moment we say 'follow OWL2', the adopter is free to use any and all of OWL2's semantics. This makes ensuring interoperability, or even providing a simple guideline, a very complicated and difficult task. It means we'll need to write a 'formal specification' for what DPV (in OWL) should or should not contain, and keep it updated as concepts are added. That's a LOT of work, almost an H2020 project :-D

By contrast, the SKOS model's semantics are so simple and abstract that they minimise the possibilities for someone using DPV in some weird and non-compatible way. There are only two relations (narrower/broader) to express, and no meta-modelling to worry about since everything is an instance (a skos:Concept). This makes it trivial for someone to take a DPV hierarchy and use it however they want - whether just as a list of concepts, or plugged into their own vocabulary, or even mapped to OWL2 interpretations.

In many of the calls I've had in the past two years with someone reaching out because they saw DPV and thought it was interesting, I've had trouble getting them (usually it's an industry person) to understand the semantics of DPV. They understand the basics (classes and subclasses) but get really confused when we get to instances and OWL2 logic. Then there were 'complaints' that the OWL2 interpretation prevented tooling from properly using DPV because it blew up when presented with punning. And finally there were discussions on how to use DPV "just like JSON", i.e. they didn't care about the semantic web, but wanted the DPV basics.

So the goal here is to satisfy such requirements and to get DPV actually used in more places. It's easier to 'sell' complex semantics and reasoner tooling that does cool stuff like checking compliance if someone is 'already using the vocabulary'. But it's really difficult to convince someone to use DPV if figuring out how to integrate it into their stuff is a challenge.
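As a rough illustration of treating a SKOS hierarchy as plain data, the Python sketch below stores each concept's broader parents in a dictionary and walks them to decide whether one concept falls under another. The mini-hierarchy (including the `ex:GPSCoordinate` concept) and the function names are hypothetical, chosen only for this example. Note also that following broader links transitively is an application-level choice in this sketch: SKOS itself does not declare skos:broader to be transitive.

```python
# Hypothetical mini-hierarchy: each concept maps to its skos:broader parents.
BROADER = {
    "ex:GPSCoordinate": ["dpv:Location"],
    "dpv:Location": ["dpv:PersonalData"],
    "dpv:PersonalData": [],
}


def broader_transitive(concept, broader=BROADER):
    """Return every ancestor reachable via broader links, nearest first."""
    seen = []
    queue = list(broader.get(concept, []))
    while queue:
        parent = queue.pop(0)
        if parent not in seen:  # also guards against cycles
            seen.append(parent)
            queue.extend(broader.get(parent, []))
    return seen


def matches(concept, target, broader=BROADER):
    """True if `concept` is `target` or sits somewhere beneath it."""
    return concept == target or target in broader_transitive(concept, broader)


print(broader_transitive("ex:GPSCoordinate"))
print(matches("ex:GPSCoordinate", "dpv:PersonalData"))
```

Nothing in this lookup depends on whether a concept is a class or an instance, which is the simplification the SKOS model buys; the cost, as discussed above, is that two applications may legitimately walk the hierarchy differently.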
All this being said, it is my wish that whatever DPV ends up being should be backwards compatible with SPECIAL and TRAPEZE, i.e. usable as OWL vocabularies. Hence the proposal for parallel SKOS & OWL versions.

I hope this makes it clearer why I'm pushing for SKOS while advocating for OWL at the same time.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/
Received on Sunday, 30 January 2022 20:02:19 UTC