- From: Harshvardhan J. Pandit <me@harshp.com>
- Date: Tue, 18 Jan 2022 15:59:44 +0000
- To: "Hoekstra, Rinke (ELS-AMS)" <r.hoekstra@elsevier.com>, "public-dpvcg@w3.org" <public-dpvcg@w3.org>
Hi Rinke, All.

Thanks for bringing this up. It's timely and important to drive this discussion to a conclusion given that we're moving towards having examples in documentation. My replies are inline.

tl;dr: I agree that SKOS is a much better and more consistent model for expressing what DPV wants to provide. For OWL or RDFS, a separate file can provide alternate serialisations.

The answer to why things are like this is 'technical debt'. We started with ontologies developed as part of the SPECIAL project, which were in OWL2 and used classes to define policies. Moving forward, we're trying to broaden the applicability, which is IMHO what SKOS is best suited to do.

So to conclude: there's a proposal on the table to move to SKOS, I support/lead it, and we will also provide RDFS and OWL separately as alternatives to keep existing adopters/users happy. Please share your thoughts, support, and alternatives here on the mailing list or on the GitHub issue (https://github.com/w3c/dpv/issues/8).

On 18/01/2022 15:12, Hoekstra, Rinke (ELS-AMS) wrote:
> One of the things that puzzles me about the vocabulary is the choice for
> using RDFS classes and RDF properties for representing the vocabulary,
> and in particular on the different categories of personal data.
>
> First of all, why choose rdfs:Class and rdf:Property vs owl:Class and
> owl:ObjectProperty/owl:DatatypeProperty. The latter give you more
> finegrained control over what the intended/expected range for the
> properties are. There is one good reason not to differentiate, and that
> is that you don’t want to impose a specific way of modeling the data.
> This comes at a cost of reusability.

Because we don't foresee such strict limitations on what the domain/range of those properties should be. They're free-form because what may be an object in someone's use-case could be a datatype/literal in someone else's. See example below.

> Secondly, I do not understand the choice to model all of the categories
> as classes.
> What are the intended instances of these classes?

This is tricky to answer and to explain. In the 'real world', there may never be instances. For example, a policy operating only on 'data categories' would have only 'classes'. Sure, we could argue such categories should be represented as instances, but in OWL instances are kind of final in that you cannot further expand them within the same taxonomy (i.e. via subclasses).

So we want a way to do all three of the following:

ex:A dpv:hasPersonalData dpv:EmailAddress .
ex:A dpv:hasPersonalData ex:MyEmailAddress .
ex:A dpv:hasPersonalData "myemail@example.com" .

So the range of this property becomes classes AND instances, which is weird under OWL unless you do convoluted expressions stating a union of subclasses and instances, which even then won't be complete.

The third example, the one with a literal, is the problematic one. Blank nodes will inevitably be created when mapping or aligning, e.g. from a database to RDF, if the range of the property is an instance. So we can "suggest" never to use literals and to pack literals into arbitrary instances, which would make many people unhappy because that's how they specify their data.

Using SKOS, it gets a little easier, because the range is now an instance of one concept, and even if it still can't specify literals, it can arbitrarily specify what would have been classes and instances in OWL. Example:

ex:A rdfs:subClassOf ex:B .
ex:A skos:broader ex:B .

ex:M a ex:N .
ex:M skos:broader ex:N .

> I can see a discussion related to this topic took place at the Nov 2020
> meeting [2], but the outcome seemed to be more around removing
> domain/range restrictions so that the solution around the issue above,
> as proposed by Victor (:wave:) in e.g. [3] gets hidden under the carpet
> (Victor suggested that the range of e.g. dpv:hasProcessing is a blank
> node that is an instance of dpv:Collect).
> Yes, that’s ugly [4], and I
> agree with Rob’s suggestion here to use SKOS or instances and enumerated
> classes. I think Harsh also supports this in his emails [5].
>
> The arguments against this appear to be around inferencing, but I don’t
> see what inferencing task is served by modeling these categories as classes.

That was a band-aid solution so that the vocabulary can be used while we 'discuss' a better way to go ahead (re. SKOS). I do support SKOS for precisely these reasons. Though even using SKOS is not straightforward, so there has to be some discussion on the exact mechanics of what concepts to use from SKOS.

See https://github.com/w3c/dpv/issues/8 for using SKOS.
See https://harshp.com/dpv-x/primer/#classes-hierarchies-and-instances for text about semantics and extensibility DPV must provide.

Regards,
--
---
Harshvardhan J. Pandit, Ph.D
Research Fellow
ADAPT Centre, Trinity College Dublin
https://harshp.com/
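
The range problem discussed in the replies above (a vocabulary concept, a user-defined resource, and a literal all appearing as objects of dpv:hasPersonalData) can be sketched roughly in plain Python. This is only an illustration under stated assumptions, not how DPV or any RDF library implements it: triples are modelled as tuples, the Literal wrapper and both helper functions are hypothetical, and the skos:broader link from ex:MyEmailAddress to dpv:EmailAddress is an assumed modelling choice in the spirit of the "ex:M skos:broader ex:N" example.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Hypothetical wrapper so literals can be told apart from IRIs."""
    value: str


# Triples as plain (subject, predicate, object) tuples.
TRIPLES = [
    # The three usages of dpv:hasPersonalData from the email.
    ("ex:A", "dpv:hasPersonalData", "dpv:EmailAddress"),
    ("ex:A", "dpv:hasPersonalData", "ex:MyEmailAddress"),
    ("ex:A", "dpv:hasPersonalData", Literal("myemail@example.com")),
    # Assumed SKOS link: the specific address sits under the DPV concept.
    ("ex:MyEmailAddress", "skos:broader", "dpv:EmailAddress"),
]


def broader_closure(triples, term):
    """Return every concept reachable from `term` by following skos:broader."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "skos:broader" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def describe_objects(triples, subject, predicate):
    """Label each object of (subject, predicate) as 'literal' or 'concept'."""
    return [
        ("literal" if isinstance(o, Literal) else "concept", o)
        for s, p, o in triples
        if s == subject and p == predicate
    ]
```

With SKOS-style modelling, the first two objects are both just concepts linked by skos:broader, so one lookup handles what OWL would split into a class and an instance. The literal remains the odd one out, which matches the point that SKOS still cannot cover literals.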
Received on Tuesday, 18 January 2022 16:00:00 UTC