- From: Frank Manola <fmanola@mitre.org>
- Date: Thu, 29 Aug 2002 08:44:27 -0400
- To: graham wideman <graham@wideman-one.com>
- CC: Brian McBride <bwm@hplb.hpl.hp.com>, www-rdf-comments@w3.org, phayes@ai.uwf.edu
Graham--
Thanks for the comments. Responses below.
graham wideman wrote:
> Frank:
>
> Thanks for your detailed comments. You've at least not alerted me to something that I had completely missed! Now to the two issues under consideration:
>
I'm glad to hear that!
> 1. The "Property implies class" Problem
> ----------------------------------------
> Much of what you (and various RDF docs) say about the impact of rdfs:domain is along the lines of:
>
>
>>we're not *forcing* resources to become instances of classes by making
>>rdfs:domain declarations;
>>
>
> ... suggesting that my private app could regard the rdfs:domain statements any way it likes, in particular that it can employ the rdfs:domain and related features to support the constraints that I outline.
>
> But your assertion ("not forcing") seems incompatible with what the docs say. The latest Primer, I now see, is in line with the Model Theory (2002-04-29) doc section 4.2, which provides rule rdfs2:
>
> if E contains:
> xxx aaa yyy
> aaa [rdfs:domain] zzz
>
> then add:
> xxx [rdf:type] zzz
Certainly one of the points of the Primer rewrite was to have the words
be more clearly consistent with the Model Theory.
>
> To me this says that if I claim that my app is using ("conforming to") RDFS, then it has to abide by this entailment. So, when I get to your example:
>
Not quite. It might DO the entailment, but it doesn't have to "abide
by" it. More on this below.
>
>>ex:author rdfs:domain ex:Book
>>
>>ex:Garfinkle rdf:type ex:Cat
>>ex:Garfinkle ex:author "Fred Smith"
>>
>>[... must the app conclude the following?...]
>>ex:Garfinkle rdf:type ex:Book
>>
>
>>your processing application could say a number of things:
>>
>>1. The instance data must be wrong: Garfinkle must be a Book rather
>>than a Cat
>>2. The instance data must be wrong: we'll say that it's invalid
>>3. The schema must be wrong: Cats can clearly have authors too
>>4. Everyone's right: Garfinkle is both a Book and a Cat
>>
>
> ... I believe that the RDFS entailment above requires my app to decide 4, "Garfinkle is both a Book and a Cat" if it wants to claim compliance with RDFS.
>
> If you are saying that my supposedly RDFS-complying app is free *not* to abide by the entailment above, then what is the distinction between RDFS rules that are required, and RDFS rules that are not? Perhaps my app could freely ignore subClassOf and rdf:type as well and still be an OK RDFS app?
>
Pat may want to put this differently (or even contradict it), and maybe
I didn't put this well to start with, but here's my take on this: The
closure rules defined in the Model Theory (one of which you've quoted
above) effectively say "here's the additional information that you can
infer based on (a) the information in the schema triples and (b) the
instance data (RDF graph) you're considering." The result of doing
these inferences is an expanded graph that effectively contains all the
information you then have available. However, those rules don't require
that your processor "abide by" those inferences in preference to other
information in the resulting graph. Graphs can contain information that
appears odd (or just plain wrong) to an application. In this case, the
rules say that, based on the combination of schema information and
instance data, you (at least conceptually) have
  [schema]         ex:author rdfs:domain ex:Book
  [instance data]  ex:Garfinkle rdf:type ex:Cat
  [instance data]  ex:Garfinkle ex:author "Fred Smith"
  [inference]      ex:Garfinkle rdf:type ex:Book
But the rules don't say what you must do with this information. In
particular, the rules don't say your application must ignore that its
intention is to treat the schema information as a constraint, and go
with the inference instead. All the rules say is, in effect, "this is
what the combination of the schema and your instance data imply." The
situation here, as far as the application is concerned, is that it's got
what it considers an anomaly: that Garfinkle is both a Cat and a Book.
We can't declare that Cats and Books are disjoint in RDFS (we don't
have the language for it), but the application, written to enforce that
constraint, can look at this information and raise an error.
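
To make the distinction concrete, here is a minimal sketch (in Python, purely
for illustration) of the two separate steps: computing the rdfs2 inferences,
and then letting the application decide what counts as an anomaly. The
representation of triples as tuples, the function names, and the
application-supplied list of disjoint classes are all assumptions made for
this sketch; none of them is part of RDFS or of any particular RDF toolkit.

```python
# Hypothetical sketch: triples are plain (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def apply_rdfs2(triples):
    """Expand a graph with rule rdfs2 until nothing new can be added.

    rdfs2: if the graph contains (xxx aaa yyy) and (aaa rdfs:domain zzz),
    then add (xxx rdf:type zzz).
    """
    graph = set(triples)
    while True:
        domains = [(s, o) for (s, p, o) in graph if p == RDFS_DOMAIN]
        inferred = {
            (s, RDF_TYPE, cls)
            for (prop, cls) in domains
            for (s, p, _o) in graph
            if p == prop
        }
        if inferred <= graph:
            return graph
        graph |= inferred


def find_disjointness_anomalies(graph, disjoint_pairs):
    """Application-level check, NOT part of RDFS.

    disjoint_pairs is knowledge the application brings itself, since RDFS
    has no way to state that two classes are disjoint.
    """
    types = {}
    for s, p, o in graph:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return sorted(
        (s, a, b)
        for s, classes in types.items()
        for (a, b) in disjoint_pairs
        if a in classes and b in classes
    )


triples = {
    ("ex:author", RDFS_DOMAIN, "ex:Book"),
    ("ex:Garfinkle", RDF_TYPE, "ex:Cat"),
    ("ex:Garfinkle", "ex:author", "Fred Smith"),
}

closure = apply_rdfs2(triples)
anomalies = find_disjointness_anomalies(closure, [("ex:Cat", "ex:Book")])
for subject, a, b in anomalies:
    print(f"anomaly: {subject} is typed as both {a} and {b}")
```

The first function only adds information to the graph; what to do about the
reported anomaly (reject the data, question the schema, or accept both types)
stays with the application.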
A lot of the schema constraint checking capability that you (I think)
have in mind involves a processor making a lot of additional assumptions
that, in processing data from potentially multiple sources on the Web,
we don't want to build into the definition of RDFS. For example, lots
of type checking systems would say that if you define a particular
property as applying to a given type (like authors to Books), a Book
would be an illegal instance if it appeared in instance data without an
author property. But that's an *additional* assumption, not one that
necessarily follows from saying that the author property applies to Books.
Similarly, lots of type checking systems would say that if you define a
type with a specific set of properties, if an instance appears with a
property not defined in that set, it's an illegal instance. But that
involves the additional assumption that you've specified (and that the
processor has found on the Web) *only* the properties that you intend to
apply to that type. Many of these sorts of assumptions assume more of a
closed world than we can afford to assume in RDFS (although a particular
application may well enforce those constraints). What we've done in
RDFS is define a fairly minimal facility for indicating what properties
people intend to use for what types, leaving it up to applications to
decide how they want to use that information (and how to enforce the
consequences of those intentions).
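
As a rough illustration of the two "additional assumptions" described above,
the following Python sketch shows how a particular application might layer
closed-world checks on top of rdfs:domain declarations. Everything here is
hypothetical: the function name, the tuple representation of triples, the
example resource ex:MobyDick, and the property ex:pageCount are invented for
the sketch, and the checks use only explicitly asserted types. RDFS itself
mandates neither check.

```python
# Hypothetical sketch of application-specific, closed-world style checks.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def closed_world_report(graph):
    """Report 'missing' and 'unexpected' properties for typed instances.

    Assumption 1: every property declared for a class must be present.
    Assumption 2: only properties declared for the class may be present.
    Both go beyond what rdfs:domain says.
    """
    declared = {}  # class -> properties whose rdfs:domain is that class
    for s, p, o in graph:
        if p == RDFS_DOMAIN:
            declared.setdefault(o, set()).add(s)

    types = {}  # instance -> explicitly asserted classes
    used = {}   # instance -> properties actually used on it
    for s, p, o in graph:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
        elif p != RDFS_DOMAIN:
            used.setdefault(s, set()).add(p)

    problems = []
    for inst, classes in types.items():
        allowed = set().union(*(declared.get(c, set()) for c in classes))
        present = used.get(inst, set())
        for prop in allowed - present:
            problems.append((inst, "missing", prop))
        for prop in present - allowed:
            problems.append((inst, "unexpected", prop))
    return sorted(problems)


graph = {
    ("ex:author", RDFS_DOMAIN, "ex:Book"),
    ("ex:MobyDick", RDF_TYPE, "ex:Book"),
    ("ex:MobyDick", "ex:pageCount", "635"),
}

report = closed_world_report(graph)
for inst, kind, prop in report:
    print(f"{inst}: {kind} property {prop}")
```

Both complaints depend on the application assuming it has seen the complete
schema and the complete instance data, which is exactly the assumption that
cannot be made in general for data gathered from multiple sources on the Web.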
However, really the basis of all constraint checking is first
determining that there's an anomaly: the schema says this, and the
instance data says that, and there's an apparent conflict. What we're
doing is leaving the conflict resolution to the application, rather than
building it in (partly because we don't always have a way in RDFS to
even state what constitutes a conflict, e.g., that Cats can't be Books).
> 2. The Multi-Class Domain Problem
> ----------------------------------
> Thanks, you provided a nice explanation of how this arises as a result of:
>
> a) The above "Property implies Class" semantics, and
> b) Multiple rdfs:domain statements are individual assertions which must all be individually satisfied.
>
> I continue to regard the result as a fatal flaw, but it now strikes me as a secondary problem.
>
> FWIW, if the only problem were the need to be able to spec that a Property can apply to instances of *any* of a list of classes rather than *all* of a list of classes, surely RDF has available syntax that the Schema spec could employ for this? (Maybe ALT fits in here... I haven't thought it through other than to hope that RDF is capable enough that this simple matter would be trivial to cast in RDF(S)...)
>
> 3. "Property implies class": Revisited
> --------------------------------------
>
> In the real world we often (usually?) classify objects based on their "properties":
>
> 4-legs, furry, barks --> class = dog
> 4-legs, furry, meows --> class = cat
> flippers, furry, barks --> class = seal
>
> It is a *special case* where we can determine class membership based on only a single property.
>
> Hence, IMO, RDFS's prescription that each single property determines class membership independent of other properties is significantly at odds with the real world objects that RDF is designed to talk about.
>
> I'm increasingly convinced that this leads to several consequences (already noted) that prevent use of rdfs:domain and related features as the basis of any useful functionality at all.
>
> Here's what I'd need to convince me otherwise -- and I'm hoping that somebody can point me to docs where these were already thought through in the process of devising the rdfs:domain etc features:
>
> a) rdfs:domain Applied Usefully: An example where rdfs:domain does record the specs necessary to support *any* useful non-trivial constraining of properties to instances of particular classes.
>
> b) An Important Example Implemented: An example where constraints for a couple of tables (in the database sense), are specified by rdfs:domain statements. Particularly where the tables have a field/property in common.
>
> Such use cases would really be proof of the pudding... or proof that there's at least some pudding!
>
I'm not sure what you're after here (particularly about (b)). There are
a number of existing applications that use (I assume "usefully") RDFS to
define their structures, and I imagine they use the RDFS specifications,
at least to some extent, to specify constraints just as you wish. The
point isn't that you *can't* use RDFS schemas in that way, it's just
that we want the applications to make the decisions about how (and to
what extent) to resolve these issues, and I assume they do in a way that
suits their intended purposes. Certainly it's possible (and not a
"violation" of RDFS) to build an application that would use the example
information we've been discussing to conclude that what someone's said
about Garfinkle having an author must be wrong.
Is this helping?
--Frank
--
Frank Manola The MITRE Corporation
202 Burlington Road, MS A345 Bedford, MA 01730-1420
mailto:fmanola@mitre.org voice: 781-271-8147 FAX: 781-271-875
Received on Thursday, 29 August 2002 08:34:00 UTC