Re: Returning to OWL and application profiles from Mark van Assem on 2010-10-13 (public-lld@w3.org from October 2010)

From: Mark van Assem <mark@cs.vu.nl>
Date: Wed, 13 Oct 2010 10:41:05 +0200
To: Karen Coyle <kcoyle@kcoyle.net>
CC: Antoine Isaac <aisaac@few.vu.nl>, public-lld <public-lld@w3.org>
Message-ID: <4CB570A1.4070301@cs.vu.nl>
  Hi,

> Yes, I agree with this. I do still worry that having every combination 
> of constraints be a new class will become un-usable. So a usage 
> restriction of 3 plus a value vocabulary v. usage restriction of 3 and 
> no stated vocabulary; a restriction of 2 plus that same value 
> vocabulary; a restriction of 5 and vocabulary or no vocabulary -- it 
> gets out of hand pretty easily. (Note: in one system I recall having 
> to limit author names to 99 because there were records that exceeded 
> that!). It's not a matter of whether it CAN be done, 

I don't get your point. The new class is just a placeholder, stating 
what the constraints on its superclass should be *within this particular 
application*. Such a placeholder is needed in any situation where you 
want to reuse existing classes and properties, but constrain them 
"locally". It's an application-specific version of the generic class, 
only to be used within that application.
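
To make the placeholder idea concrete, here is a minimal sketch in Turtle.
The lib: and my: namespace URIs are made up for illustration, and the limit
of 3 creators simply echoes the "usage restriction of 3" in the quoted
example; it is not taken from any real application profile.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix lib:  <http://example.org/lib#> .      # hypothetical shared vocabulary
@prefix my:   <http://example.org/my-app#> .   # hypothetical application namespace

# my:Book is the application-specific placeholder: every my:Book is a
# lib:Book, and within this application it has at most 3 dct:creator values.
my:Book a owl:Class ;
    rdfs:subClassOf lib:Book ,
        [ a owl:Restriction ;
          owl:onProperty dct:creator ;
          owl:maxCardinality "3"^^xsd:nonNegativeInteger ] .
```

Nothing is asserted about lib:Book itself, so another application remains
free to define its own placeholder with different constraints.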

Your example concerning having to limit author names is, I think, not
related to the issue. If your database can have that many names, and
that is allowed, then that's simply the way it is. That's a feature of
your dataset that you apparently condone, so then there's no point in
putting a restriction on it anyway.

Mark.


> but whether doing it this way becomes a hindrance to metadata creation.
>
> kc
>
>
>
>>
>>
>> Note that in relation to your last sentence I've used "ontology" for
>> both the domain vocabulary and the one for AP1. The SW stack assigns no
>> intrinsic commitment wrt. a given application level for ontologies:
>> ontologies are meant to define constraints or data production rules,
>> irrespective of whether these will be used for an entire domain or a
>> specific application.
>>
>> Antoine
>>
>>
>>>
>>> Quoting Emmanuelle Bermes <manue.fig@gmail.com>:
>>>
>>>> On Tue, Oct 12, 2010 at 4:08 PM, Mark van Assem <mark@cs.vu.nl> wrote:
>>>>> Hi Mikael,
>>>>>
>>>>> I just defended my PhD thesis [1] last week and it contains a section
>>>>> 7.5 devoted to APs specified in OWL (also referring to your AP
>>>>> constraint language). I suggest you have a read :-)
>>>>>
>>>>> What I propose is to create subclasses of a particular class such as
>>>>> lib:Book, e.g. my:Book and constrain property values on my:Book so
>>>>> that no inconsistency arises with his:Book (which may have entirely
>>>>> different constraints).
>>>>
>>>> This approach seems very good from the modelling point of view, but
>>>> I'd like to ask whether it is realistic in a pragmatic Linked Data
>>>> world.
>>>>
>>>> If I want to add specific constraints on, for instance, dct:creator,
>>>> and I create my:creator, I will reach interoperability with others
>>>> using dct:creator only through inferencing. This doesn't seem very
>>>> straightforward to me. We are lacking actual use of Linked Data today,
>>>> and I feel that adding more complexity in the data model is likely to
>>>> create more barriers.
>>>>
>>>> As librarians sitting on a big mass of data, our immediate need is to
>>>> make our data understandable in the global Linked Data world, and
>>>> I'm not sure that encouraging us to systematically create our own
>>>> classes and properties rather than reusing existing ones, for the sake
>>>> of expressing patterns, is the right way to go. I see a risk of
>>>> encouraging the creation of a lot of redundant, slightly different but
>>>> almost similar vocabularies (we already have 3 flavours of FRBR out
>>>> there...)
>>>>
>>>> What I like in the application profile approach is the idea that I can
>>>> reuse *existing* classes & properties, and at the same time, express
>>>> the pattern that my community should reuse for describing similar
>>>> resources.
>>>> It's actually 2 separate needs :
>>>> - a need for visibility and interoperability, that can be fulfilled by
>>>> existing vocabularies with their OWL semantics that are consistent
>>>> globally
>>>> - a need for describing domain-specific patterns.
>>>>
>>>>
>>>>> We call this "collection-specific value ranges" (with collections
>>>>> referring to cultural heritage collections), but cardinality
>>>>> constraints are basically the same story.
>>>>>
>>>>> The only thing that then has to be added is a way to classify every
>>>>> book as either my:Book or something else (the "something elses"
>>>>> representing wrongly specified books, i.e. violations of the AP
>>>>> constraints). This is probably abusing OWL a bit, but it can probably
>>>>> be done.
>>>>>
>>>>> An alternative is to just accept OWL as a "syntax" for the AP
>>>>> constraints, and implement your own checker on top of that. This
>>>>> removes the need to develop your own language (and parser) which will
>>>>> contain almost the same syntactical elements anyway.
>>>>
>>>> I like the idea that the AP should be something that could be
>>>> implemented following different syntaxes, maybe including OWL, but not
>>>> excluding other approaches that wouldn't make it mandatory to declare
>>>> local properties and classes systematically when additional semantics
>>>> or constraints are needed.
>>>>
>>>> Emmanuelle
>>>>>
>>>>> (As far as I can tell this is also what DC-APs are intended to do. I
>>>>> had discussions with Tom Baker about this, and he was part of the
>>>>> committee that accepted my thesis last week so apparently I didn't
>>>>> write rubbish :-)
>>>>>
>>>>> Best,
>>>>> Mark
>>>>> [1]http://www.cs.vu.nl/~mark/papers/thesis-mfjvanassem.pdf
>>>>>
>>>>> On 12/10/2010 13:50, Mikael Nilsson wrote:
>>>>>>
>>>>>> Hi Antoine,
>>>>>>
>>>>>> tis 2010-10-12 klockan 13:11 +0200 skrev Antoine Isaac:
>>>>>>>
>>>>>>> Hi Mikael,
>>>>>>>
>>>>>>> Thanks for starting this interesting thread :-)
>>>>>>
>>>>>> Thanks for an interesting reply :)
>>>>>>
>>>>>>>
>>>>>>> Well, I have what I think is a quite traditional SW background, and
>>>>>>> I'd be tempted to turn the argument the other way round ;-)
>>>>>>> If the instances of one class in an AP have one title and the
>>>>>>> instances of that class in another AP have two titles, then it is
>>>>>>> perhaps that these two APs are thinking of two different classes,
>>>>>>> really. Possibly they could be two subclasses of a common
>>>>>>> superclass, but two different classes, still.
>>>>>>> I think this is quite in line with Jeff's message suggesting you
>>>>>>> could do things like this
>>>>>>> baz:Widget rdfs:subClassOf foo:Widget.
>>>>>>> to make the commitment of your various APs a bit clearer.
>>>>>>
>>>>>> Yes, I do believe that conceptually, most APs can be described using
>>>>>> the trick of the creation of subclasses, even though the subclass
>>>>>> may well consist of the same individuals (!)
>>>>>>
>>>>>>>
>>>>>>> I'm also a bit puzzled by your "APs define domain-specific
>>>>>>> structural constraints while OWL adds semantics to existing
>>>>>>> classes." Again, I am pretty new to the APs as practiced in the DC
>>>>>>> realm, but why wouldn't you consider classes (and APs built on top
>>>>>>> of them) from a (SW) semantic perspective?
>>>>>>> I understand that OWL and RDF will fail for arrangement/presentation
>>>>>>> of data (order of XML elements, e.g.), which is not really about
>>>>>>> semantics. But to me--and to many in the SW community--cardinality
>>>>>>> belongs to semantics of classes and properties.
>>>>>>
>>>>>> Let me give a simple example. Let's assume I am designing a simple
>>>>>> REST service for retrieving metadata records from a library
>>>>>> repository (never mind the 303s etc).
>>>>>>
>>>>>> I decide to return Turtle representations of the form
>>>>>>
>>>>>> myrepo:book123 rdf:type lib:Book ;
>>>>>>               dct:title "Moby Dick" ;
>>>>>>               dct:creator myrepo:author345 .
>>>>>>
>>>>>>
>>>>>> I can define this pattern using OWL restrictions on the lib:Book
>>>>>> class.
>>>>>>
>>>>>> Later on, I want to define an extended API for use by library
>>>>>> partners. This API returns records of the form
>>>>>>
>>>>>> myrepo:book123 rdf:type lib:Book ;
>>>>>>               dct:title "Moby Dick" ;
>>>>>>               dct:creator myrepo:author345 ;
>>>>>>               lib:numCopies "5" .
>>>>>>
>>>>>>
>>>>>> This record describes the exact same individual but using a
>>>>>> different application profile.
>>>>>>
>>>>>> To solve this with the subclass method I would need to define a
>>>>>> subclass lib:ExtendedBook that captures the additional property, but
>>>>>> has the same extension.
>>>>>>
>>>>>> My point is that application profiles describe metadata *patterns*,
>>>>>> independently of the semantics. None of the above properties are
>>>>>> likely required (cardinality > 0) for the Book class in any case.
>>>>>> And there may well be multiple such patterns for the same class or
>>>>>> set of things.
>>>>>>
>>>>>> In the DC world, the use of application profiles is often framed in
>>>>>> this way - an application profile does not describe a class, it
>>>>>> describes the metadata records emitted or accepted by a system
>>>>>> (hence the term "application").
>>>>>>
>>>>>>>> But that's exactly it - the semantics is "alternative" and on its
>>>>>>>> face, the semantics of the published OWL file is something else
>>>>>>>> entirely.
>>>>>>>>
>>>>>>>> As a concrete example, if I serve an OWL file from my web server,
>>>>>>>> using application/rdf+xml as suggested by the OWL specs [1], the
>>>>>>>> interpretation will be as RDF triples using the RDF and OWL
>>>>>>>> built-in semantics, thus resulting in the generation of new
>>>>>>>> triples, potentially in contradiction with other ontologies, and
>>>>>>>> not in validation as expected.
>>>>>>>
>>>>>>>
>>>>>>> Well, contradiction (aka inconsistency) is what is used for data
>>>>>>> validation in the RDF/OWL world. So if we can detect contradictions
>>>>>>> I wouldn't be unhappy, at least from the perspective of us being
>>>>>>> able to validate some data :-)
>>>>>>
>>>>>> My point is that the two ontologies needed to describe the two uses
>>>>>> of lib:Book above would be contradictory. One would then be left
>>>>>> wondering, "what is the semantics of the lib:Book class, is it as
>>>>>> defined by ontology A or B?" and that's not good. The truth is that
>>>>>> the semantics of lib:Book is less constrained (all the above
>>>>>> properties optional), and the constraints appear only to describe
>>>>>> certain *records*, not the lib:Book class itself.
>>>>>>
>>>>>>>
>>>>>>> It's really a situation where the cons for an OWL approach (and
>>>>>>> being awkward to manipulate is not the least of OWL's flaws, sure)
>>>>>>> could be balanced by some strong pros, and we should not discard
>>>>>>> them too quickly!
>>>>>>
>>>>>>
>>>>>> I agree that OWL has advantages, but we need to develop an
>>>>>> understanding of what kinds of validation we can achieve.
>>>>>>
>>>>>> Maybe we arrive at the conclusion that we need to accept a different
>>>>>> set of use cases for OWL-based ontologies. I'm not very happy with
>>>>>> that thought, however.
>>>>>>
>>>>>> I therefore lean towards a solution based on a true syntactic
>>>>>> constraint language on RDF graphs. Alistair Miles did some
>>>>>> experiments in that direction, but they seem to have disappeared
>>>>>> from the net.
>>>>>>
>>>>>> It would be extremely interesting to try to figure out whether there
>>>>>> are significant use cases for purely syntactic constraints in the LD
>>>>>> domain.
>>>>>>
>>>>>> /Mikael
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>
>
>
Received on Wednesday, 13 October 2010 08:42:12 UTC