Re: Feedback on DPV Primer's draft

Thank you for your reply, Harsh. A few more observations inline.

On 28/01/22 10:37, Harshvardhan J. Pandit wrote:
> I agree with your argument regarding the class/instance modelling. This 
> is also one of the reasons why I suggested SKOS would be a better fit 
> for most use-cases, since it makes it possible to always have further 
> expansions. With OWL, there is no possibility to define a purpose and to 
> later expand it like you describe. Which puts the onus on the modeller 
> to be sure about their concepts or risk changing models with time.


Such onus can be removed. Actually, in TRAPEZE, the guidelines for 
extending DPV (if necessary) are very simple: always add a new class, 
unless the new term being added represents a specific organization (e.g. 
one controller, one recipient), a specific location (i.e. a GPS 
coordinate), or another single data value (e.g. one specific email address, 
or one specific ID out of the many a person may have).  In case of 
doubt, make the new term a class.
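
To illustrate the rule, here is a minimal Python sketch (not TRAPEZE 
code). The Vocabulary helper and the terms GucciChristmas2021Offer and 
GoogleIreland are hypothetical names made up for this example; the other 
terms come from the Primer discussion. It only shows that a term added 
as a class can still be narrowed later, while a term added as an 
instance cannot.

```python
class Vocabulary:
    """Toy vocabulary: classes can be refined, instances cannot."""

    def __init__(self):
        self.superclasses = {}  # class name -> set of direct superclasses
        self.instance_of = {}   # instance name -> class name

    def add_class(self, name, parent=None):
        # A class may only specialise another class, never an instance.
        if parent is not None and parent not in self.superclasses:
            raise ValueError(f"{parent!r} is not a class, so it cannot be refined")
        self.superclasses.setdefault(name, set())
        if parent is not None:
            self.superclasses[name].add(parent)

    def add_instance(self, name, cls):
        if cls not in self.superclasses:
            raise ValueError(f"{cls!r} is not a class")
        self.instance_of[name] = cls


vocab = Vocabulary()
vocab.add_class("Purpose")
vocab.add_class("Marketing", "Purpose")
vocab.add_class("MarketingSeasonalOffer", "Marketing")
# In case of doubt the new term is a class, so it can be narrowed later.
vocab.add_class("GucciChristmas2021Offer", "MarketingSeasonalOffer")

vocab.add_class("Recipient")
# A single organization is one of the few natural instances.
vocab.add_instance("Google", "Recipient")

try:
    # Hypothetical attempt to refine an instance: rejected.
    vocab.add_class("GoogleIreland", "Google")
    refined = True
except ValueError:
    refined = False
```

In this sketch refined ends up False: once Google has been introduced as 
an instance, nothing can be placed below it.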

This rule of thumb is going to avoid most semantic issues and make the 
modeller's life very easy; moreover, it yields cleaner and more uniform 
extensions of the base vocabularies, therefore we believe that it is 
also going to improve interoperability.

Thus, in my opinion, it would be advisable to give the same suggestions 
in the primer.

Second, we should not forget that we are now considering SKOS and its 
complications not because it is "ontologically" important to mix classes 
and instances.  The natural semantics of DPV concepts is clearly that of 
a class. We are considering SKOS because some applications want to use 
RDF no matter what, and in this rather unexpressive language, property 
values can only be instances.  It is not about the meaning of terms or 
knowledge representation; it is only about circumventing RDF's limitations.

One could have solved this problem simply by duplicating DPV's 
properties, giving an "object level" version usable in OWL2 policies 
and a corresponding "metalevel" version usable in RDF policies. Giving 
the two versions different ranges would avoid the paradoxes discussed 
some time ago. With this approach, it would also be possible to 
define a clean and coherent formal semantics for all policies (OWL and RDF).

SKOS avoids such duplication, but the price to be paid is that the 
semantic issues related to the confusion between classes and instances 
are still swept under the carpet.  Syntactically, instances can be refined 
using "narrower", "broader", and related match relations, but this is 
possible only because these relations have no formal meaning (they can 
be any relations).  The downside is that it is not clear what policies 
mean (a reliability and interoperability issue), and it is impossible to 
prove that the compliance checking algorithms return no false positives 
or false negatives.
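
By contrast, when refinement is expressed with a subclass relation that 
has a fixed meaning, checking whether a request is covered by a consent 
reduces to a subsumption test. A minimal Python sketch, assuming 
purposes form a simple subclass hierarchy; it is not the TRAPEZE 
compliance checker, and GucciChristmas2021Offer, SUPERCLASSES, 
is_subclass_of and complies are hypothetical names:

```python
# Direct-superclass links of a toy purpose hierarchy (illustrative only).
SUPERCLASSES = {
    "Marketing": {"Purpose"},
    "MarketingSeasonalOffer": {"Marketing"},
    "GucciChristmas2021Offer": {"MarketingSeasonalOffer"},
}


def is_subclass_of(superclasses, sub, sup):
    """Reflexive, transitive subclass test over direct-superclass links."""
    seen, stack = set(), [sub]
    while stack:
        current = stack.pop()
        if current == sup:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(superclasses.get(current, ()))
    return False


def complies(requested_purpose, consented_purpose):
    # A request is covered only if its purpose is subsumed by the consent.
    return is_subclass_of(SUPERCLASSES, requested_purpose, consented_purpose)


# Opt-in limited to one campaign: the campaign itself is covered ...
covered = complies("GucciChristmas2021Offer", "GucciChristmas2021Offer")
# ... but seasonal offers in general are not.
too_broad = complies("MarketingSeasonalOffer", "GucciChristmas2021Offer")
```

Any implementation of this test must give the same answers on the same 
hierarchy, which is the kind of guarantee that is lost when the 
refinement relation has no fixed meaning.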

In particular, concerning interoperability, we should not forget that 
all compliant OWL2 reasoners must treat a given OWL2 ontology in the 
same way, while each application may treat a SKOS ontology more or less 
as it pleases (because "narrower" etc. have no formal semantics).

One should also consider the additional burden of mastering SKOS (due to 
its additional meta-concepts, and its many different but partially 
related relations...).  Compared with SKOS, the OWL2 profile adopted by 
TRAPEZE (OWL2-PL) is much simpler, with only 2 kinds of relations 
(SubclassOf and instanceOf) and neither boolean operators nor quantifiers.

Best,

Piero

> 
> I think we can add a note to this effect in the Primer, as you suggest, 
> to make it clear by further explaining that when defining concepts, one 
> should try to foresee possible expansions in the future. So for your 
> example, modelling SeasonalOffers as a class and Christmas2021 as an 
> instance of it.
> 
> I was thinking that, for examples, the Primer can explain with multiple 
> options so one could see the differences better. For this, options such as: 
> modelling with SKOS or OWL; and serialisations in Turtle and JSON-LD. 
> Similar to other sem-web documents (e.g. 
> https://www.w3.org/TR/owl2-primer/#OWL_Syntaxes), there can be a button 
> at the top to toggle chosen option on/off.
> 
> So in the above example, Christmas2021 could be further expanded into 
> Christmas2021NewSubscribers and Christmas2021ExistingSubscribers, 
> whereas in OWL, it would necessitate having Christmas2021 as a class 
> first. This would help people who are not familiar with sem-web to 
> understand the implications better for how they represent their model.
> 
> Regards,
> Harsh
> 
> On 27/01/2022 15:52, Piero Bonatti wrote:
>> Consider Example 2 (Extending Market concept to represent granular and 
>> accurate Purposes).  This example could still be debatable, since the 
>> term "MarketingSeasonalOffer" itself could be made even more specific 
>> - say - by restricting it to a specific marketing campaign by a single 
>> company (eg Gucci) and for a specific season (eg Christmas 2021).  In 
>> other words, MarketingSeasonalOffer still looks like a class.
> 
>>
>> Now suppose DPV is used for expressing consent in this context.  If 
>> MarketingSeasonalOffer is an instance, then it can't be refined to 
>> refer *only* to Gucci's 2021 Christmas campaign. Therefore, a data 
>> subject cannot opt in *only* for this particular marketing campaign: 
>> the options are either all MarketingSeasonalOffers hereafter (by all 
>> companies), or nothing.
>>
>> If the above example of an instance (Gucci's campaign, autumn 2021) 
>> looks too complicated, then this example may be replaced with one 
>> about the class Recipient, of which Google (or any other company) can 
>> be an instance.  In my opinion, recipients and controllers cover most 
>> of the very few cases in which you may really happen to want an 
>> instance instead of a class.
> 
>>
>> There are other examples in the primer where instances are probably 
>> better modelled as classes:  for example TVServiceOptimisation (could 
>> be restricted to specific services); CombinedNewPurpose in Example 5 
>> (could be restricted to specific kinds of good delivery or specific 
>> companies), LimitAccess in Ex. 6 (different policies determine 
>> different limitations, eg. only working hours, no weekends, no access 
>> at all), and so on....
>>
>> By the way: In the primer, it could be useful to remind somewhere that 
>> if a term is an instance, then it can't be further 
>> refined/subclassed/restricted by another term.  Thus, it is a sort of 
>> point-of-no-return.  Use instances with care!  Instances jeopardize 
>> extensibility and interoperability as illustrated in sections 4.2 and 
>> 4.3.
> 

Received on Sunday, 30 January 2022 17:48:12 UTC