Re: Global property constraints from Dimitris Kontokostas on 2014-12-05 (public-data-shapes-wg@w3.org from December 2014)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Fri, 5 Dec 2014 09:52:31 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a3z_JSOQ7rje3-h-fLphpXB3TvVyfVbdeeyf19u3-hDBg@mail.gmail.com>
Hi Holger,

I +1 your modeling approach but this is not about "proper" modeling. For
instance, I don't understand why the (bio)medical domain chooses different
datatypes in a property per context but I am sure they have their reasons.

My rationale for this suggestion comes from research-based LOD publishing,
where people use ontologies as a guide. In almost all of the use cases
where RDFUnit was used, it helped maintainers find violations of OWL/RDFS
axioms under the closed-world assumption (CWA), which they found very
useful and which helped them improve their datasets. We provide additional
manual constraints for some ontologies that are preloaded in the tool and
provide more fine-grained validation.
However, in only a few cases did people invest time to create their own
custom constraints.
Defining constraints is a very time-consuming job and things might be
easier in industry where someone is paid to do that.

Existing RDFS/OWL axioms can provide a baseline for quality for most LD
datasets and it is very easy to support this by redefining (some of) these
axioms with explicit semantics.
One year ago, when LOV was smaller and RDFUnit did not support so many
axioms, we generated 32K constraints for all LOV ontologies / vocabularies
[1]; now the number should be much higher.

The way I think about this is:
1) a validation engine can use an axiom directly where it will know how to
interpret it
2) People can automatically generate constraints from axioms and then
manually curate them.
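
To make point 2 a bit more concrete, here is a rough sketch of what "generate
constraints from axioms" could look like. It is not how RDFUnit does it; the
function names, the constraint record layout and the example data are all
made up for illustration, triples are plain Python tuples, and there is no
subclass reasoning. The foaf:name domain axiom is the hypothetical one from
the example quoted below, not necessarily what FOAF itself declares.

```python
# Hypothetical sketch: turn rdfs:domain / rdfs:range axioms into explicit
# constraints and check them under a closed-world reading.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def constraints_from_axioms(ontology):
    """Emit one constraint record per domain/range axiom (layout is assumed)."""
    constraints = []
    for s, p, o in ontology:
        if p == RDFS_DOMAIN:
            constraints.append({"property": s, "kind": "domain", "cls": o})
        elif p == RDFS_RANGE:
            constraints.append({"property": s, "kind": "range", "cls": o})
    return constraints


def validate(data, constraints):
    """Report nodes that lack the required explicit rdf:type (no inference)."""
    types = {}
    for s, p, o in data:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    violations = []
    for c in constraints:
        for s, p, o in data:
            if p != c["property"]:
                continue
            # Domain axioms constrain the subject, range axioms the object.
            node = s if c["kind"] == "domain" else o
            if c["cls"] not in types.get(node, set()):
                violations.append((node, c["property"], c["kind"], c["cls"]))
    return violations


# Illustrative axioms and data only.
ontology = [
    ("foaf:name", RDFS_DOMAIN, "foaf:Person"),
    ("foaf:knows", RDFS_RANGE, "foaf:Person"),
]
data = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:acme", RDF_TYPE, "foaf:Organization"),
    ("ex:acme", "foaf:name", "Acme"),
    ("ex:alice", "foaf:knows", "ex:acme"),
]

constraints = constraints_from_axioms(ontology)
violations = validate(data, constraints)
```

The generated records are ordinary data, so a maintainer could review, drop
or tighten them by hand before running the validation.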

Both approaches increase the adoption of our deliverables with no or
minimal effort from the maintainers. Of course something like this should
be optional or overridable from the validation engine to support all cases.

Best,
Dimitris

[1] http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf

On Fri, Dec 5, 2014 at 1:09 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

>  On 12/5/2014 0:12, Dimitris Kontokostas wrote:
>
>  I am not against defining a per shape domain and range but IMHO global
> property constraints should also be possible. Whether and to what extent,
> the WG will decide.
> SPIN also allows this by attaching constraints to owl:Thing / rdf:Property
> so global constraints are not a bad thing per se.
>
>  I believe without them, duplication will be needed for the constraint
> definition and duplication leads to confusion and increases maintenance
> cost.
> For example:
> - foaf:mbox should always be a string  (datatype range), match an email
> regex pattern and be unique in my RDF graph, perhaps I don't care about the
> class it is used in
>   - Later on I may define a class shape W3CPerson that only requires a
> regex in the form "*@w3.org" and a FooPerson with "*@foo.bar"
> (both constraints should be validated, the global one and the local one)
> - foaf:name should always be used in a foaf:Person (domain). Violation
> occurs when foaf:name is found in any other class
>
>  Closed shapes, oslc:Property sub-classing or using owl:Thing as a class
> can be workarounds to this but should we tackle only high level constraints
> in terms of Shapes/Classes or should we allow constraints on the property
> level as well?
>
>
> Ok, so you see global property constraints as a useful shorthand syntax,
> "syntactic sugar" to avoid having to repeat the same statements over and
> over again. I can certainly agree that this is desirable.
>
> But let's look at specific examples. You mention foaf:mbox; allow me to
> use schema:email for now.
>
>     http://schema.org/email
>
> Its domain includes ContactPoint, Organization and Person.
>
> Solution 1: Yes, we could introduce global equivalents of all local
> restrictions, e.g.
>
>     schema:email
>         a rdf:Property ;
>         :valueType xsd:string ;
>         :pattern "someRegex" ;
>         :domainIncludes schema:Person .
>
>     schema:Person
>         a rdfs:Class .
>
>     schema:W3CPerson
>         a rdfs:Class ;
>         rdfs:subClassOf schema:Person ;
>         :property [
>             :predicate schema:email ;
>             :pattern "W3Cregex" ;
>         ] .
>
> Solution 2: Alternatively, let's do what most OO systems do, and introduce
> a "-able" class:
>
>     schema:Emailable
>         a rdfs:Class ;
>         :property [
>             :predicate schema:email ;
>             :valueType xsd:string ;
>             :pattern "someRegex" ;
>         ] .
>
>     schema:Person
>         a rdfs:Class ;
>         rdfs:subClassOf schema:Emailable .   # = domainIncludes
>
>     schema:W3CPerson
>         a rdfs:Class ;
>         rdfs:subClassOf schema:Person ;
>         :property [
>             :predicate schema:email ;
>             :pattern "W3Cregex" ;
>         ] .
>
> which inherits the property and its constraints. If you compare those two
> approaches you will see that Solution 2 only requires a single, consistent
> mechanism for resolving constraints: walking up the class hierarchy, while
> Solution 1 would require looking at two different things - local and global
> constraints. These two mechanisms would have to be implemented, explained
> and understood.
>
> Equally important: no other mainstream technology uses global properties -
> OO and XML have local attributes, fields, properties and that's what most
> people are familiar with. This also means it's easier to move back and
> forth between those worlds.
>
> In which cases would solution 2 not work?
>
> Holger
>
>
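
For reference, the single resolution mechanism described for Solution 2
(walking up the class hierarchy) can be sketched roughly as below. This is
only an illustration: the dictionaries mirror the schema:Emailable example
from the quoted text, and the function name and data layout are assumptions,
not part of any proposal.

```python
# Hypothetical sketch of "Solution 2": collect the constraints that apply to
# a class by walking up rdfs:subClassOf links.
SUBCLASS_OF = {
    "schema:W3CPerson": ["schema:Person"],
    "schema:Person": ["schema:Emailable"],
    "schema:Emailable": [],
}

# Constraints attached locally to each class: (predicate, facet, value).
LOCAL_CONSTRAINTS = {
    "schema:Emailable": [
        ("schema:email", "valueType", "xsd:string"),
        ("schema:email", "pattern", "someRegex"),
    ],
    "schema:W3CPerson": [
        ("schema:email", "pattern", "W3Cregex"),
    ],
}


def collect_constraints(cls, seen=None):
    """Return local constraints of cls followed by those of its ancestors."""
    if seen is None:
        seen = set()
    if cls in seen:  # guard against cycles and repeated superclasses
        return []
    seen.add(cls)
    found = list(LOCAL_CONSTRAINTS.get(cls, []))
    for parent in SUBCLASS_OF.get(cls, []):
        found.extend(collect_constraints(parent, seen))
    return found


w3c_constraints = collect_constraints("schema:W3CPerson")
person_constraints = collect_constraints("schema:Person")
```

Under this reading a W3CPerson ends up with both patterns (the inherited one
and its own), while a class that never subclasses schema:Emailable gets no
email constraints at all, which is the case global constraints would cover.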


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas
Received on Friday, 5 December 2014 07:53:27 UTC