- From: Frank Manola <fmanola@mitre.org>
- Date: Thu, 29 Aug 2002 08:44:27 -0400
- To: graham wideman <graham@wideman-one.com>
- CC: Brian McBride <bwm@hplb.hpl.hp.com>, www-rdf-comments@w3.org, phayes@ai.uwf.edu
Graham--

Thanks for the comments. Responses below.

graham wideman wrote:

> Frank:
>
> Thanks for your detailed comments. You've at least not alerted me to something that I had completely missed! Now to the two issues under consideration:
>

I'm glad to hear that!

> 1. The "Property implies class" Problem
> ----------------------------------------
> Much of what you (and various RDF docs) say about the impact of rdfs:domain is along the lines of:
>
>>we're not *forcing* resources to become instances of classes by making
>>rdfs:domain declarations;
>>
>
> ... suggesting that my private app could regard the rdfs:domain statements any way it likes, in particular that it can employ the rdfs:domain and related features to support the constraints that I outline.
>
> But your assertion ("not forcing") seems incompatible with what the docs say. The latest Primer, I now see, is in line with the Model Theory (2002-04-29) doc section 4.2, which provides rule rdfs2:
>
> if E contains:
>   xxx aaa yyy
>   aaa [rdfs:domain] zzz
>
> then add:
>   xxx [rdf:type] zzz

Certainly one of the points of the Primer rewrite was to have the words be more clearly consistent with the Model Theory.

>
> To me this says that if I claim that my app is using ("conforming to") RDFS, then it has to abide by this entailment. So, when I get to your example:
>

Not quite. It might DO the entailment, but it doesn't have to "abide by" it. More on this below.

>
>>ex:author rdfs:domain ex:Book
>>
>>ex:Garfinkle rdf:type ex:Cat
>>ex:Garfinkle ex:author "Fred Smith"
>>
>>[... must the app conclude the following?...]
>>ex:Garfinkle rdf:type ex:Book
>>
>
>>your processing application could say a number of things:
>>
>>1. The instance data must be wrong: Garfinkle must be a Book rather
>>than a Cat
>>2. The instance data must be wrong: we'll say that it's invalid
>>3. The schema must be wrong: Cats can clearly have authors too
>>4. Everyone's right: Garfinkle is both a Book and a Cat
>>
>
> ...
> I believe that the RDFS entailment above requires my app to decide 4, "Garfinkle is both a Book and a Cat" if it wants to claim compliance with RDFS.
>
> If you are saying that my supposedly RDFS-complying app is free *not* to abide by the entailment above, then what is the distinction between RDFS rules that are required, and RDFS rules that are not? Perhaps my app could freely ignore subClassOf and rdf:type as well and still be an OK RDFS app?
>

Pat may want to put this differently (or even contradict it), and maybe I didn't put this well to start with, but here's my take on this:

The closure rules defined in the Model Theory (one of which you've quoted above) effectively say "here's the additional information that you can infer based on (a) the information in the schema triples and (b) the instance data (RDF graph) you're considering." The result of doing these inferences is an expanded graph that effectively contains all the information you then have available.

However, those rules don't require that your processor "abide by" those inferences in preference to other information in the resulting graph. Graphs can contain information that appears odd (or just plain wrong) to an application. In this case, the rules say that, based on the combination of schema information and instance data, you (at least conceptually) have

[schema]        ex:author rdfs:domain ex:Book
[instance data] ex:Garfinkle rdf:type ex:Cat
[instance data] ex:Garfinkle ex:author "Fred Smith"
[inference]     ex:Garfinkle rdf:type ex:Book

But the rules don't say what you must do with this information. In particular, the rules don't say your application must ignore that its intention is to treat the schema information as a constraint, and go with the inference instead. All the rules say is, in effect, "this is what the combination of the schema and your instance data imply."

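As a rough illustration of that last point (a minimal sketch, not anything defined by the Model Theory itself), rule rdfs2 can be read as a function from a graph to an expanded graph. Everything below is an assumption made for this example: triples are plain Python tuples of strings, and the function name apply_rdfs2 is hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def apply_rdfs2(triples):
    """Return the input triples plus everything rule rdfs2 adds.

    rdfs2: if the graph contains (xxx aaa yyy) and (aaa rdfs:domain zzz),
    then add (xxx rdf:type zzz). The loop repeats until nothing new appears.
    """
    graph = set(triples)
    while True:
        # Every (property, class) pair declared with rdfs:domain.
        domains = [(s, o) for (s, p, o) in graph if p == RDFS_DOMAIN]
        inferred = {
            (s, RDF_TYPE, cls)
            for (s, p, _o) in graph
            for (prop, cls) in domains
            if p == prop
        }
        if inferred <= graph:
            return graph
        graph |= inferred


schema = {("ex:author", RDFS_DOMAIN, "ex:Book")}
instance_data = {
    ("ex:Garfinkle", RDF_TYPE, "ex:Cat"),
    ("ex:Garfinkle", "ex:author", "Fred Smith"),
}

expanded = apply_rdfs2(schema | instance_data)
```

The function only adds triples. It never removes the Cat assertion and never reports an error; what to make of the expanded graph is left entirely to the caller.
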
The situation here, as far as the application is concerned, is that it's got what it considers an anomaly: that Garfinkle is both a Cat and a Book. We can't declare that Cats and Books are disjoint in RDFS (we don't have the language for it), but the application, written to enforce that constraint, can look at this information and raise an error.

A lot of the schema constraint checking capability that you (I think) have in mind involves a processor making a lot of additional assumptions that, in processing data from potentially multiple sources on the Web, we don't want to build into the definition of RDFS. For example, lots of type checking systems would say that if you define a particular property as applying to a given type (like authors to Books), a Book would be an illegal instance if it appeared in instance data without an author property. But that's an *additional* assumption, not one that necessarily follows from saying that authors apply to Books. Similarly, lots of type checking systems would say that if you define a type with a specific set of properties, and an instance appears with a property not defined in that set, it's an illegal instance. But that involves the additional assumption that you've specified (and that the processor has found on the Web) *only* the properties that you intend to apply to that type. Many of these sorts of assumptions presume more of a closed world than we can afford to assume in RDFS (although a particular application may well enforce those constraints).

What we've done in RDFS is define a fairly minimal facility for indicating what properties people intend to use for what types, leaving it up to applications to decide how they want to use that information (and how to enforce the consequences of those intentions). However, really the basis of all constraint checking is first determining that there's an anomaly: the schema says this, and the instance data says that, and there's an apparent conflict.

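To illustrate what detecting such an anomaly might look like inside an application, here is a minimal sketch in Python. The disjointness list is hypothetical application policy (RDFS itself has no way to state it), and the function name find_type_conflicts and the tuple representation of triples are likewise assumptions for this example.

```python
RDF_TYPE = "rdf:type"

# Hypothetical application-level policy; not expressible in RDFS.
DISJOINT_CLASSES = [("ex:Cat", "ex:Book")]


def find_type_conflicts(graph, disjoint_pairs):
    """List (resource, class_a, class_b) where a resource has both types."""
    types = {}
    for s, p, o in graph:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return sorted(
        (s, a, b)
        for s, classes in types.items()
        for (a, b) in disjoint_pairs
        if a in classes and b in classes
    )


# The expanded graph from the example, including the inferred triple.
expanded_graph = {
    ("ex:author", "rdfs:domain", "ex:Book"),
    ("ex:Garfinkle", RDF_TYPE, "ex:Cat"),
    ("ex:Garfinkle", "ex:author", "Fred Smith"),
    ("ex:Garfinkle", RDF_TYPE, "ex:Book"),
}

conflicts = find_type_conflicts(expanded_graph, DISJOINT_CLASSES)
```

The check only reports that a conflict exists. Whether the application then blames the instance data, blames the schema, or accepts both types is a separate decision that this sketch deliberately leaves out.
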
What we're doing is leaving the conflict resolution to the application, rather than building it in (partly because we don't always have a way in RDFS to even state what constitutes a conflict, e.g., that Cats can't be Books).

> 2. The Multi-Class Domain Problem
> ----------------------------------
> Thanks, you provided a nice explanation of how this arises as a result of:
>
> a) The above "Property implies Class" semantics, and
> b) Multi rdfs:domain statements are individual assertions which must all be individually satisfied.
>
> I continue to regard the result as a fatal flaw, but it now strikes me as a secondary problem.
>
> FWIW, if the only problem were the need to be able to spec that a Property can apply to instances of *any* of a list of classes rather than *all* of a list of classes, surely RDF has available syntax that the Schema spec could employ for this? (Maybe ALT fits in here... I haven't thought it through other than to hope that RDF is capable enough that this simple matter would be trivial to cast in RDF(S)...)
>
> 3. "Property implies class": Revisited
> --------------------------------------
>
> In the real world we often (usually?) classify objects based on their "properties":
>
> 4-legs, furry, barks --> class = dog
> 4-legs, furry, meows --> class = cat
> flippers, furry, barks --> class = seal
>
> It is a *special case* where we can determine class membership based on only a single property.
>
> Hence, IMO, RDFS's prescription that each single property determines class membership independent of other properties is significantly at odds with the real world objects that RDF is designed to talk about.
>
> I'm increasingly convinced that this leads to several consequences (already noted) that prevent use of rdfs:domain and related features as the basis of any useful functionality at all.
>
> Here's what I'd need to convince me otherwise -- and I'm hoping that somebody can point me to docs where these were already thought through in the process of devising the rdfs:domain etc features:
>
> a) rdfs:domain Applied Usefully: An example where rdfs:domain does record the specs necessary to support *any* useful non-trivial constraining of properties to instances of particular classes.
>
> b) An Important Example Implemented: An example where constraints for a couple of tables (in the database sense), are specified by rdfs:domain statements. Particularly where the tables have a field/property in common.
>
> Such use cases would really be proof of the pudding... or proof that there's at least some pudding!
>

I'm not sure what you're after here (particularly about (b)). There are a number of existing applications that use (I assume "usefully") RDFS to define their structures, and I imagine they use the RDFS specifications, at least to some extent, to specify constraints just as you wish. The point isn't that you *can't* use RDFS schemas in that way, it's just that we want the applications to make the decisions about how (and to what extent) to resolve these issues, and I assume they do in a way that suits their intended purposes. Certainly it's possible (and not a "violation" of RDFS) to build an application that would use the example information we've been discussing to conclude that what someone's said about Garfinkle having an author must be wrong.

Is this helping?

--Frank

--
Frank Manola
The MITRE Corporation
202 Burlington Road, MS A345
Bedford, MA 01730-1420
mailto:fmanola@mitre.org
voice: 781-271-8147
FAX: 781-271-875
Received on Thursday, 29 August 2002 08:34:00 UTC