Re: Global property constraints from Dimitris Kontokostas on 2014-12-05 (public-data-shapes-wg@w3.org from December 2014)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Fri, 5 Dec 2014 11:19:10 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a0Lh685ta5kRRZX9JbEQC8h=RzZkoq=vRvJqRXoOVpYhg@mail.gmail.com>
It's great that you see the need for that as well. To make this work better
we'd need some extra *syntactic sugar* to support it.
SPARQL (if supported) will of course be an option for conversion but it
would be easier for people to work with similar concepts / properties.

I see the focus of most threads on instance/subject-based validation so I
thought I'd raise this concern early.

Best,
Dimitris

On Fri, Dec 5, 2014 at 10:54 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

>  Ok, we seem to be on the same page here. I absolutely agree that some
> backwards compatibility with existing OWL/RDFS models is important. And I
> am confident we can deliver that without making a new language *depend* on
> OWL in any way. We can handle existing rdfs:range/domain statements using
> interpretation approaches like global constraints, or constraints attached
> to the properties. This can be handled by a compatibility adapter that may
> be its own deliverable.
>
> Once a shapes language has been published, and people in certain domains
> like it, we will hopefully witness more models that use shapes directly on
> the public web too. Both languages can and will co-exist. With SPIN it is
> common that people add constraints to existing OWL models, to cover the
> closed world scenarios and expressivity that is not well handled by OWL.
> The main reason why many people use (or mis-use) OWL and domain/ranges for
> remaining aspects is that there is no widely accepted alternative (yet).
>
> Holger
>
>
>
> On 12/5/14, 5:52 PM, Dimitris Kontokostas wrote:
>
> Hi Holger,
>
>  I +1 your modeling approach but this is not about "proper" modeling. For
> instance, I don't understand why the (bio)medical domain chooses different
> datatypes for the same property depending on context, but I am sure they
> have their reasons.
>
>  My rationale for this suggestion comes from research-based LOD
> publishing, where people use ontologies as a guide. In almost all of the use
> cases where RDFUnit was used, it helped maintainers find violations of
> OWL/RDFS axioms under the closed-world assumption (CWA), which they found
> very useful and which helped them improve their datasets. We provide
> additional manual constraints for some ontologies that are preloaded in the
> tool and provide more fine-grained validation. However, only in a few cases
> did people invest time to create their own custom constraints.
> Defining constraints is a very time-consuming job and things might be
> easier in industry where someone is paid to do that.
>
>  Existing RDFS/OWL axioms can provide a baseline for quality for most LD
> datasets, and it is very easy to support this by redefining (some of) these
> axioms with explicit semantics.
> One year ago, when LOV was smaller and RDFUnit did not support as many
> axioms, we generated 32K constraints for all LOV ontologies / vocabularies
> [1]; now the number should be much higher.
>
>  The way I think about this is:
> 1) A validation engine can use an axiom directly wherever it knows how to
> interpret it
> 2) People can automatically generate constraints from axioms and then
> manually curate them.
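
A minimal sketch of point 2, assuming triples are plain (subject, predicate, object) string tuples and ignoring subclass reasoning. The names `Constraint`, `constraints_from_axioms` and `violations` are hypothetical and are not taken from RDFUnit or any other tool; the idea is only that each rdfs:domain / rdfs:range axiom becomes an explicit constraint that is then checked with closed-world semantics.

```python
from typing import NamedTuple

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


class Constraint(NamedTuple):
    kind: str  # "domain" or "range"
    prop: str  # the property the axiom was stated on
    cls: str   # the class required by the axiom


def constraints_from_axioms(schema):
    """Turn every rdfs:domain / rdfs:range axiom into an explicit constraint."""
    kinds = {RDFS_DOMAIN: "domain", RDFS_RANGE: "range"}
    return [Constraint(kinds[p], s, o) for s, p, o in schema if p in kinds]


def violations(data, constraints):
    """Closed-world check: the subject (domain) or object (range) of each
    use of the property must be explicitly typed with the required class."""
    types = {}
    for s, p, o in data:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    found = []
    for c in constraints:
        for s, p, o in data:
            if p != c.prop:
                continue
            node = s if c.kind == "domain" else o
            if c.cls not in types.get(node, set()):
                found.append((c, node))
    return found


# Illustrative data only.
SCHEMA = [("foaf:name", RDFS_DOMAIN, "foaf:Person")]
DATA = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:acme", RDF_TYPE, "foaf:Organization"),
    ("ex:acme", "foaf:name", "Acme"),
]

generated = constraints_from_axioms(SCHEMA)
for constraint, node in violations(DATA, generated):
    print(f"{node} violates {constraint.kind} {constraint.cls} of {constraint.prop}")
```

The generated list is ordinary data, so a maintainer could drop or tighten individual entries before running the check, which is the manual curation step.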
>
>  Both approaches increase the adoption of our deliverables with no or
> minimal effort from the maintainers. Of course something like this should
> be optional or overridable from the validation engine to support all cases.
>
>  Best,
> Dimitris
>
>  [1] http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf
>
> On Fri, Dec 5, 2014 at 1:09 AM, Holger Knublauch <holger@topquadrant.com>
> wrote:
>
>>  On 12/5/2014 0:12, Dimitris Kontokostas wrote:
>>
>>  I am not against defining a per-shape domain and range but IMHO global
>> property constraints should also be possible. Whether, and to what extent,
>> is for the WG to decide.
>> SPIN also allows this by attaching constraints to owl:Thing /
>> rdf:Property so global constraints are not a bad thing per se.
>>
>>  I believe without them, duplication will be needed for the constraint
>> definition and duplication leads to confusion and increases maintenance
>> cost.
>> For example:
>> - foaf:mbox should always be a string  (datatype range), match an email
>> regex pattern and be unique in my RDF graph, perhaps I don't care about the
>> class it is used in
>>   - Later on I may define a class shape W3CPerson that only requires a
>> regex in the form "*@w3.org" and a FooPerson with "*@foo.bar"
>> (both constraints should be validated, the global one and the local one)
>> - foaf:name should always be used in a foaf:Person (domain). A violation
>> occurs when foaf:name is used on an instance of any other class
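
A rough Python sketch of the foaf:mbox case, assuming mailbox values are plain strings as in the example and that triples are (subject, predicate, object) tuples. The regular expressions, the `ex:` class names and the `validate` function are placeholders invented for illustration. It shows one possible reading of "both constraints should be validated": the global pattern and uniqueness checks apply to every use of the property, and any pattern attached to the subject's class is checked in addition.

```python
import re
from collections import Counter

# Global constraint: applies to foaf:mbox wherever it is used.
GLOBAL_PATTERNS = {"foaf:mbox": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")}

# Local constraints: apply only to instances of the given class.
LOCAL_PATTERNS = {
    ("ex:W3CPerson", "foaf:mbox"): re.compile(r".+@w3\.org"),
    ("ex:FooPerson", "foaf:mbox"): re.compile(r".+@foo\.bar"),
}


def validate(triples):
    """Return (subject, property, value, reason) for every violation."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    # Count how often each (property, value) pair occurs, for uniqueness.
    counts = Counter((p, o) for s, p, o in triples if p in GLOBAL_PATTERNS)
    errors = []
    for s, p, o in triples:
        if p not in GLOBAL_PATTERNS:
            continue
        is_string = isinstance(o, str)
        if not is_string or not GLOBAL_PATTERNS[p].fullmatch(o):
            errors.append((s, p, o, "global pattern"))
        if counts[(p, o)] > 1:
            errors.append((s, p, o, "not unique"))
        for cls in sorted(types.get(s, ())):
            local = LOCAL_PATTERNS.get((cls, p))
            if local and not (is_string and local.fullmatch(o)):
                errors.append((s, p, o, f"local pattern of {cls}"))
    return errors


# Illustrative data only.
DATA = [
    ("ex:tim", "rdf:type", "ex:W3CPerson"),
    ("ex:tim", "foaf:mbox", "tim@w3.org"),
    ("ex:bob", "rdf:type", "ex:W3CPerson"),
    ("ex:bob", "foaf:mbox", "bob@example.org"),
    ("ex:eve", "foaf:mbox", "not-an-email"),
]

for error in validate(DATA):
    print(error)
```

Note that ex:eve has no class at all and is still checked, which is the case a purely class-based design would miss unless owl:Thing or a similar workaround is used.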
>>
>>  Closed shapes, oslc:Property sub-classing or using owl:Thing as a class
>> can be workarounds to this but should we tackle only high level constraints
>> in terms of Shapes/Classes or should we allow constraints on the property
>> level as well?
>>
>>
>> Ok, so you see global property constraints as a useful shorthand syntax,
>> "syntactic sugar" to avoid having to repeat the same statements over and
>> over again. I can certainly agree that this is desirable.
>>
>> But let's look at specific examples. You mention foaf:mbox, allow me to
>> use schema:email for now.
>>
>>     http://schema.org/email
>>
>> Its domain includes ContactPoint, Organization and Person.
>>
>> Solution 1: Yes, we could introduce global equivalents of all local
>> restrictions, e.g.
>>
>>     schema:email
>>         a rdf:Property ;
>>         :valueType xsd:string ;
>>         :pattern "someRegex" ;
>>         :domainIncludes schema:Person .
>>
>>     schema:Person
>>         a rdfs:Class .
>>
>>     schema:W3CPerson
>>         a rdfs:Class ;
>>         rdfs:subClassOf schema:Person ;
>>         :property [
>>             :predicate schema:email ;
>>             :pattern "W3Cregex" ;
>>         ] .
>>
>> Solution 2: Alternatively, let's do what most OO systems do, and
>> introduce a "-able" class:
>>
>>     schema:Emailable
>>         a rdfs:Class ;
>>         :property [
>>             :predicate schema:email ;
>>             :valueType xsd:string ;
>>             :pattern "someRegex" ;
>>         ] .
>>
>>     schema:Person
>>         a rdfs:Class ;
>>         rdfs:subClassOf schema:Emailable .   # = domainIncludes
>>
>>     schema:W3CPerson
>>         a rdfs:Class ;
>>         rdfs:subClassOf schema:Person ;
>>         :property [
>>             :predicate schema:email ;
>>             :pattern "W3Cregex" ;
>>         ] .
>>
>> which inherits the property and its constraints. If you compare those two
>> approaches you will see that Solution 2 only requires a single, consistent
>> mechanism for resolving constraints: walking up the class hierarchy, while
>> Solution 1 would require looking at two different things - local and global
>> constraints. These two mechanisms would have to be implemented, explained and
>> understood.
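
The single mechanism of Solution 2 could look roughly like the following Python sketch. The dictionaries mirror the Emailable / Person / W3CPerson example; `collect_constraints` is a hypothetical helper, and the sketch assumes that inherited constraints accumulate (both patterns apply to a W3CPerson) rather than the subclass overriding its parent.

```python
# Class hierarchy from the example: W3CPerson -> Person -> Emailable.
SUBCLASS_OF = {
    "schema:W3CPerson": ["schema:Person"],
    "schema:Person": ["schema:Emailable"],
    "schema:Emailable": [],
}

# Property constraints attached locally to each class,
# as (predicate, facet, value) tuples.
CONSTRAINTS = {
    "schema:Emailable": [
        ("schema:email", "valueType", "xsd:string"),
        ("schema:email", "pattern", "someRegex"),
    ],
    "schema:W3CPerson": [
        ("schema:email", "pattern", "W3Cregex"),
    ],
}


def collect_constraints(cls, seen=None):
    """Walk up the class hierarchy and gather every constraint on the way.

    `seen` guards against visiting a class twice (diamonds or cycles).
    """
    if seen is None:
        seen = set()
    if cls in seen:
        return []
    seen.add(cls)
    found = list(CONSTRAINTS.get(cls, []))
    for parent in SUBCLASS_OF.get(cls, []):
        found.extend(collect_constraints(parent, seen))
    return found


for constraint in collect_constraints("schema:W3CPerson"):
    print(constraint)
```

Under Solution 1 the same lookup would need a second step that also consults constraints attached to the property itself, which is the extra mechanism referred to above.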
>>
>> Equally important: no other mainstream technology uses global properties
>> - OO and XML have local attributes, fields, properties and that's what most
>> people are familiar with. This also means it's easier to move back and forth
>> between those worlds.
>>
>> In which cases would solution 2 not work?
>>
>> Holger
>>
>>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
Received on Friday, 5 December 2014 09:20:05 UTC