RE: Returning to OWL and application profiles from Jon Phipps on 2010-10-12 (public-lld@w3.org from October 2010)

From: Jon Phipps <jonp@jesandco.org>
Date: Tue, 12 Oct 2010 09:38:35 -0400
To: "Mikael Nilsson" <mikael@nilsson.name>, "Antoine Isaac" <aisaac@few.vu.nl>
Cc: "public-lld" <public-lld@w3.org>
Message-ID: <2B028C2287DEBA4BA9CA380696BF40E402D82A44@BE43.exg3.exghost.com>

Hi Mikael, Antoine,

Mikael, the use case for syntactic constraints that occurs to me has more to do with treating an RDF graph as an expression of the metadata rather than the source.

It seems to me that our usual notions of data validation are turned on their head, given the inherent Open World assumption of the RDF data model and the fact that the model is more concerned with 'consistency' than validity. An AP is by definition domain-specific and is intended to communicate a domain-local understanding of the classes of resources and their properties. In that sense there would seem to be two useful metadata patterns that comprise an AP -- a set of domain-local validation patterns that looks at each statement with the XML-ish notion of pass/fail within the domain only, and a global-commitment consistency pattern that seeks to ensure that the data is both internally and globally consistent, i.e. does it 'make sense' out in the RDF-ish global web of data.
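
The contrast between the two readings can be sketched in a few lines of Python. This is only an illustration, not a reasoner: the function names and the toy triples are assumptions made for the example. It shows how the same missing value is a failure under a closed, validation-style reading and merely unknown under an open-world reading.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object); plain strings keep the sketch small.
Triple = Tuple[str, str, str]


def has_value(triples: List[Triple], subject: str, predicate: str) -> bool:
    """True if the graph asserts at least one value for subject/predicate."""
    return any(s == subject and p == predicate for s, p, _ in triples)


def closed_world_reading(triples: List[Triple], subject: str, predicate: str) -> str:
    """Validation-style reading: a required value that is absent is an error."""
    return "valid" if has_value(triples, subject, predicate) else "invalid"


def open_world_reading(triples: List[Triple], subject: str, predicate: str) -> str:
    """Open-world reading: an absent value is unknown, not an inconsistency."""
    return "asserted" if has_value(triples, subject, predicate) else "unknown"


# Hypothetical data: a person with a name but no title.
graph = [("ex:alice", "foaf:name", "Alice")]

print(closed_world_reading(graph, "ex:alice", "dc:title"))  # invalid
print(open_world_reading(graph, "ex:alice", "dc:title"))    # unknown
```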

For instance it's perfectly reasonable for me to define dc:title as a property of mydomain:person and constrain its values to 'Mr., Ms., Dr.'. This clearly happens all the time, and I can document this effectively and produce a very nice schema that will invalidate a value of 'Miss'. But that same data ceases to 'make sense', given the defined semantics of dc:title, when it becomes part of the open world. The same is true of other structural and semantic constraints.
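
As a minimal sketch of that domain-local pass/fail check, assuming a hypothetical rule set and function name (neither comes from any schema language discussed here):

```python
# Hypothetical domain-local rule: within mydomain, dc:title on a
# mydomain:person must be one of a fixed set of honorifics.
ALLOWED_PERSON_TITLES = {"Mr.", "Ms.", "Dr."}


def validate_person_title(value: str) -> bool:
    """Return True if the value passes the domain-local check."""
    return value in ALLOWED_PERSON_TITLES


print(validate_person_title("Dr."))   # True
print(validate_person_title("Miss"))  # False
```

The check says nothing about whether 'Dr.' is a sensible dc:title in the open world; it only enforces the local pattern.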

It's also useful to remember that an AP, as defined in the Singapore Framework, is primarily a set of documents that formally communicate domain-specific knowledge. The Description Set Profile is where that knowledge begins to be expressed as model-specific patterns. Personally I would very much like DC to investigate (as Alistair did) how the abstract patterns represented by the proposed DSP constraint language might translate into OWL2 ontologies for communicating domain semantics and XML Schema for domain-specific validation.

Jon Phipps

-----Original Message-----
From: public-lld-request@w3.org [mailto:public-lld-request@w3.org] On Behalf Of Mikael Nilsson
Sent: Tuesday, October 12, 2010 7:50 AM
To: Antoine Isaac
Cc: public-lld
Subject: Re: Returning to OWL and application profiles

Hi Antoine,

On Tue, 2010-10-12 at 13:11 +0200, Antoine Isaac wrote:
> Hi Mikael,
> 
> Thanks for starting this interesting thread :-)

Thanks for an interesting reply :)

> 
> Well, I have what I think is a quite traditional SW background, and 
> I'd be tempted to turn the argument the other way round ;-) If the instances of one class in an AP have one title and the instances of that class in another AP have two titles, then it is perhaps that these two APs are thinking of two different classes, really. Possibly they could be two subclasses of a common superclass, but two different classes, still.
> I think this is quite in line with Jeff's message suggesting you could 
> do things like this baz:Widget rdfs:subClassOf foo:Widget.
> to make the commitment of your various APs a bit clearer.

Yes, I do believe that conceptually, most APs can be described using the trick of the creation of subclasses, even though the subclass may well consist of the same individuals (!)

> 
> I'm also a bit puzzled by your "APs define domain-specific structural constraints while OWL adds semantics to existing classes." Again, I am pretty new to the APs as practiced in the DC realm, but why wouldn't you consider classes (and APs built on top of them) from a (SW) semantic perspective?
> I understand that OWL and RDF will fail for arrangement/presentation of data (order of XML elements, e.g.), which is not really about semantics. But to me--and to many in the SW community--cardinality belongs to semantics of classes and properties.

Let me give a simple example. Let's assume I am designing a simple REST service for retrieving metadata records from a library repository (never mind the 303s etc).

I decide to return Turtle representations of the form

myrepo:book123 rdf:type lib:Book ;
               dct:title "Moby Dick" ;
               dct:creator myrepo:author345 .


I can define this pattern using OWL restrictions on the lib:Book class.

Later on, I want to define an extended API for use by library partners.
This API returns records of the form

myrepo:book123 rdf:type lib:Book ;
               dct:title "Moby Dick" ;
               dct:creator myrepo:author345 ;
               lib:numCopies "5" .


This record describes the exact same individual but using a different application profile. 

To solve this with the subclass method I would need to define a subclass lib:ExtendedBook that captures the additional property, but has the same extension.

My point is that application profiles describe metadata *patterns*, independently of the semantics. None of the above properties are likely required (cardinality > 0) for the Book class in any case. And there may well be multiple such patterns for the same class or set of things.
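
One way to picture "patterns independent of the semantics" is a purely syntactic check over the triples of a record. The sketch below is an assumption-laden illustration, not the DSP constraint language: the profile format (predicate mapped to minimum and maximum occurrences), the closed-record rule, and all names are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Profile = Dict[str, Tuple[int, int]]  # predicate -> (min, max) occurrences

BASIC_RECORD: List[Triple] = [
    ("myrepo:book123", "rdf:type", "lib:Book"),
    ("myrepo:book123", "dct:title", "Moby Dick"),
    ("myrepo:book123", "dct:creator", "myrepo:author345"),
]
EXTENDED_RECORD: List[Triple] = BASIC_RECORD + [
    ("myrepo:book123", "lib:numCopies", "5"),
]

# Two hypothetical profiles for records about the same class of things.
BASIC_PROFILE: Profile = {
    "rdf:type": (1, 1),
    "dct:title": (1, 1),
    "dct:creator": (1, 1),
}
EXTENDED_PROFILE: Profile = {**BASIC_PROFILE, "lib:numCopies": (1, 1)}


def matches(record: List[Triple], subject: str, profile: Profile) -> bool:
    """Check the record's shape only; no inference, no class semantics."""
    counts = Counter(p for s, p, _ in record if s == subject)
    # Assumed rule: a record may not use predicates outside the profile.
    if any(p not in profile for p in counts):
        return False
    return all(lo <= counts[p] <= hi for p, (lo, hi) in profile.items())


book = "myrepo:book123"
print(matches(BASIC_RECORD, book, BASIC_PROFILE))        # True
print(matches(BASIC_RECORD, book, EXTENDED_PROFILE))     # False
print(matches(EXTENDED_RECORD, book, EXTENDED_PROFILE))  # True
print(matches(EXTENDED_RECORD, book, BASIC_PROFILE))     # False
```

Both records describe the same individual, and neither profile states anything about lib:Book itself; each only accepts or rejects a record.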

In the DC world, the use of application profiles is often framed in this way - an application profile does not describe a class, it describes the metadata records emitted or accepted by a system (hence the term "application").

> > But that's exactly it - the semantics is "alternative" and on its 
> > face, the semantics of the published OWL file is something else entirely.
> >
> > As a concrete example, if I serve an OWL file from my web server, 
> > using application/rdf+xml as suggested by the OWL specs [1], the 
> > interpretation will be as RDF triples using the RDF and OWL built-in 
> > semantics, thus resulting in the generation of new triples, 
> > potential contradictions with other ontologies, and not in validation as expected.
> 
> 
> Well, contradiction (aka inconsistency) is what is used for data 
> validation in the RDF/OWL world. So if we can detect contradictions I 
> wouldn't be unhappy, at least from the perspective of us being able to 
> validate some data :-)

My point is that the two ontologies needed to describe the two uses of lib:Book above would be contradictory. One would then be left wondering, "what is the semantics of the lib:Book class, is it as defined by ontology A or B?" and that's not good. The truth is that the semantics of lib:Book is less constrained (all the above properties optional), and the constraints appear only to describe certain *records*, not the lib:Book class itself.

> 
> It's really a situation where the cons for an OWL approach (and being 
> awkward to manipulate is not the least of OWL's flaws, sure) could be 
> balanced by some strong pros, and we should not discard them too 
> quickly!


I agree that OWL has advantages, but we need to develop an understanding of what kinds of validation we can achieve.

Maybe we arrive at the conclusion that we need to accept a different set of use cases for OWL-based ontologies. I'm not very happy with that thought, however.

I therefore lean towards a solution based on a true syntactic constraint language on RDF graphs. Alistair Miles did some experiments in that direction, but they seem to have disappeared from the net.

It would be extremely interesting to try to figure out whether there are significant use cases for purely syntactic constraints in the LD domain.

/Mikael
Received on Tuesday, 12 October 2010 13:39:25 UTC