- From: Pierre-Antoine Champin <swlists-040405@champin.net>
- Date: Thu, 20 Nov 2008 12:48:01 +0000
- To: Sandro Hawke <sandro@w3.org>, Semantic Web <semantic-web@w3.org>
The problem with using OWL for #2 (i.e. the data model) is the open world
assumption. Cardinality axioms in OWL are even trickier than domain and
range, for

  :me a x:HumanRecord ;
      x:father :my_father .   # no explicit mother

would not be inconsistent with a (= 1 x:mother) cardinality "constraint"
on x:HumanRecord.
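Spelled out in full Turtle (the prefixes below are toy ones, added purely
for illustration), the situation is:

  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
  @prefix x:    <http://example.org/record-vocab#> .
  @prefix :     <http://example.org/data#> .

  # The intended "constraint": every x:HumanRecord has exactly one x:mother.
  x:HumanRecord rdfs:subClassOf
      [ a owl:Restriction ;
        owl:onProperty x:mother ;
        owl:cardinality "1"^^xsd:nonNegativeInteger ] .

  # The data: a record with an explicit father but no explicit mother.
  :me a x:HumanRecord ;
      x:father :my_father .

  # An OWL reasoner finds no inconsistency here: under the open world
  # assumption it simply concludes that :me has one x:mother that the
  # explicit triples do not happen to name.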
How would you suggest using OWL to check integrity constraints in the
*explicit* triples only? I know of an article by Motik et al. [1] about
that, but it is not standard OWL...

  pa

[1] http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-258/paper11.pdf

Sandro Hawke a écrit :
>> Pierre-Antoine Champin wrote:
>>> Dan Brickley a écrit :
>>>> I do recommend against using RDFS/OWL to express application/dataset
>>>> constraints, while recognising that there's a real need for recording
>>>> them in machine-friendly form. In the Dublin Core world, this topic is
>>>> often discussed in terms of "application profiles", meaning that we
>>>> want to say things about likely and expected data patterns, rather
>>>> than doing what RDFS/OWL does and merely offering machine dictionary
>>>> definitions of terms.
>>>
>>> Why would you recommend against it?
>>>
>>> Would it not be good practice to simply separate the axioms into two
>>> RDF graphs:
>>> - "intensional" axioms, those representing the meaning of the terms,
>>>   which should be assumed by people reusing the vocabulary;
>>> - "extensional" axioms, those representing properties/constraints of
>>>   the dataset, which should be used to check its
>>>   consistency/completeness.
>>>
>>> Depending on their needs, people would import only the first graph, or
>>> both of them...
>>
>> I guess primarily because it is clearer for everyone if 'domain' and
>> 'range' have their conventional meaning, rather than sometimes meaning
>> what the W3C groups intended, and sometimes meaning something quite
>> different. Since RDF is designed to mix and to flow, keeping the
>> dataset-oriented usages separate is likely to be quite hard.
>>
>> Also I expect dataset-checking applications to have different
>> requirements (e.g. around optionals, co-occurrence constraints,
>> datatype values) that simply don't map tidily onto RDFS/OWL constructs.
>> Building on SPARQL there has some promise, I think - e.g. see
>> http://isegserv.itd.rl.ac.uk/schemarama/
>> http://swordfish.rdfweb.org/discovery/2001/01/schemarama/
>>
>> On the dataset-characterisation front, there are also efforts like
>> http://semwiq.faw.uni-linz.ac.at/node/9 that are worth exploring, also
>> http://esw.w3.org/topic/SparqlEndpointDescription2 ... which are
>> connected with scenarios around distributed SPARQL query. Again, I
>> don't see RDFS/OWL's property-description constructs as being
>> particularly attuned to this problem.
>
> I think it's possible to use OWL to do both these things:
>
> 1. To describe real-world stuff. A human is born with exactly one
>    biological mother (another human) and exactly one biological father
>    (another human). This isn't a perfect description; it's an ontology,
>    a particular written-down conceptualization of some real-world
>    stuff. We could have different ontologies about human births because
>    we think about them differently and make different generalizations
>    about them.
>
> 2. To describe the data model required at some computer interface.
>    Each data-record about a human birth includes zero or one
>    identifiers for the biological mother (another data-record about a
>    human) and zero or one identifiers for the biological father, etc.
>    This probably can be a perfect description; it's using the ontology
>    language to describe something that's already abstract.
>
> The essential difference is in selecting the domain of discourse. What
> are the things you're talking about? Are they flesh-and-blood, or
> computer abstractions? This is surprisingly hard to do, because those
> computer abstractions are intended to represent the flesh-and-blood
> entities. System designers have learned to (sometimes) use one in place
> of the other in their reasoning.
>
> I use this example -- with the cardinality of "mother" -- because it's
> a pretty crisp test of which world you're in. In the real world (give
> or take origin-of-life issues) every human has exactly one biological
> mother. In a data model definition, if you say every person record must
> have a valid pointer to another person record representing the mother,
> you're going to have a real problem. You'll never be able to construct
> a valid data set, database, document, whatever.
>
> If you're going to use OWL for both #1 and #2, I think it's essential
> to keep this distinction clear. Formally, you should have different
> classes, something like my:Human and my:HumanRecord. Or, more
> practically, x:Human and y:Human, where ontology x is clearly about
> people and ontology y is a system interface definition, a data model.
>
> Once we're clear about this distinction, we can get a better sense of
> whether or not OWL is a good language for #2, or whether people should
> stick to XML Schema, UML, database schema systems, etc.
>
> (The reason we're drawn to using OWL for expressing data models, of
> course, is that we're exchanging data in RDF, so the fit is pretty darn
> good, and none of the other tools work well. The problem may just be
> this confusion about whether we're modeling the world or the data
> structures, or it may be that OWL really isn't suitable for this. Once
> we're past the confusion, maybe we can find out.)
>
> -- Sandro
>
> (Disclaimer: I am the W3C staff contact for the OWL Working Group, but
> I am in no way speaking on behalf of that group -- I don't recall the
> group ever talking about this issue, and I'm certainly not representing
> them in this thread.)
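To make the #1 / #2 split above concrete, the two readings could live in
two separate vocabularies along the following lines (the x: / y: names and
prefixes are only illustrative, not from any published ontology):

  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
  @prefix x:    <http://example.org/people#> .    # ontology about the world (#1)
  @prefix y:    <http://example.org/records#> .   # data model / interface (#2)

  # #1: every real-world human has exactly one biological mother.
  x:Human rdfs:subClassOf
      [ a owl:Restriction ;
        owl:onProperty x:biologicalMother ;
        owl:cardinality "1"^^xsd:nonNegativeInteger ] .

  # #2: a record carries zero or one pointer to the mother's record.
  y:HumanRecord rdfs:subClassOf
      [ a owl:Restriction ;
        owl:onProperty y:mother ;
        owl:maxCardinality "1"^^xsd:nonNegativeInteger ] .

The data-model class never forces a mother pointer to exist, so valid
datasets remain constructible; but even then, under the open world
assumption, OWL will not flag a record that omits a parent, which is the
kind of check on the explicit triples asked about above.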
Received on Thursday, 20 November 2008 12:48:51 UTC