- From: Dan Brickley <danbri@w3.org>
- Date: Thu, 14 Sep 2000 06:27:58 -0400 (EDT)
- To: Lee Jonas <ljonas@acm.org>
- cc: www-rdf-interest@w3.org
On Thu, 14 Sep 2000, Lee Jonas wrote:

> This touched a raw nerve for me.

Thanks for the feedback.

> It is definitely not useless.  It is an essential part of RDF-Schema that
> makes assertions about RDF model validity.

Can you walk us through an example of how rdfs:domain as currently
defined allows you to validate some RDF model?

> The argument seems to be:
> a) certainty when inferring a resource's type from the predicates it
> exhibits
> vs
> b) being able to define model validity constraints in a far more refined and
> accurate way.
>
> Arguments:
> * b) is fundamental to RDF-Schema and a) is an adjunct
> * a resource's type will be given explicitly in a lot of cases
> * In the absence of an explicit type, and more than one rdfs:domain you can
> always infer that the resource is of type rdfs:Resource.

Yes, a resource's type (or one of its types) will often be given.

The Web has a single space of resource identifiers (URIs) but allows many
different ways of carving that space up into classes. Correspondingly, RDFS
allows a resource to be described as being a member of many classes, not
necessarily from the same hierarchy. So there are likely to be more
rdf:type statements inferrable than explicitly represented when a resource
is mentioned.

> * b) is fundamental to RDF-Schema and a) is an adjunct
[...]
>
> As defining model validity is RDFS's primary purpose, and inferring types is
> a secondary concern I strongly feel that b) is the way to go.

RDFS does address this (validity; constraint checking), but the spec
doesn't make this its sole concern.

RDFS CR Abstract:

    This specification describes how to use RDF to describe RDF
    vocabularies. The specification also defines a basic vocabulary for
    this purpose, as well as an extensibility mechanism to anticipate
    future additions to RDF.

>
> >After all, somewhere out there on the web could be another
> >domain statement for foo.
> >We could make a CWA and conclude
> >that baz is an instance of Bar and if we ever find another
> >domain statement for foo, retract our original conclusion.
>
> Indeed.
>
> The point about this is that you can never enforce a single domain (or
> indeed range) constraint on every Property definition with such an open RDF
> model (where anything can be said about anything) in such an open
> environment as the Web (where anyone can say anything).

IMHO it's not about enforcing in the sense of policing and punishment
while believing any and all RDF statements found on the Web. It's more
about: if I believe the RDF statements I find in Schema s1 and Schema s2,
what can I conclude about this heap of instance data?

Another scenario that favours the conjunctive approach (while
acknowledging the inference/expressivity tradeoff):

Say you're an RDF query processor, and you get a query like

    dc:title(X,Y), s2:techAuthor(X,Y), s2:mbox(Y,Z) etc etc

If RDFS semantics allow us to conclude from
rdfs:range(s2:techAuthor,Engineer) that Y must be of type s2:Engineer, we
can use that as a query-planning hint when deciding how best to consult
the database. Say the database keeps stats based on known type
membership, e.g. that it believes 3, or 30, or 30,000 resources to be of
type Person, Engineer, Document etc. We can use rdfs:range (though not
rdfs:domain as currently defined) to help us answer queries more
efficiently. These are only hints btw; there might be some resource known
to the database that matches 'Y' in the query above without our having
any type information about it.

> >The place where the single domain/range requirement hurts
> >is when the range or domain of an arc tends to be
> >"oddly shaped", i.e., not have a single class corresponding
> >to it, but is a conjunct or disjunct of multiple classes
> >where the conjunct/disjunct does not define a
> >"natural kind" (as in Quine's use of the term "natural
> >kind").  Non-natural kinds are a problem all around.
>
> I'm not sure what you mean.  What is a "natural kind"?

Philosophy / Logic / Cognitive Science jargon. The basic idea is that our
understanding of the meaning of many natural language terms (water,
apples, people, planet etc.) is made easier because the world seems to us
to be carved up according to such categories ("natural kinds"), whereas
other categories seem more ad-hoc and artificial, i.e. artifacts of
language rather than names that point to "real world" categories.

This notion relates to schema usability in cases where schema language
semantics force vocabulary designers to define ad-hoc placeholder classes
that don't intuitively feel like they relate to anything "real". Note
that we can (thankfully) remain agnostic about the question of whether
the Universe really _does_ come pre-carved into natural kinds; the issue
is more to do with the psychology of categorisation.

To the extent that schemas define classes that key into our intuitions
about worldly categories, we can share a little meaning. But where we
find ourselves defining classes such as 'ReadableOrWearableObject', we've
moved beyond the natural kinds. Where we have a class such as 'Water',
'Lemon' etc., the class definition hooks more easily into the
unarticulated shared assumptions we all have about what defines such a
class, and into how the world is carved up.

We can mine natural language for useful categories; I started doing this
with an RDF representation of WordNet, which has since been elaborated
on. See

    http://xmlns.com/wordnet/1.6/Water
    http://xmlns.com/wordnet/1.6/Lemon
    http://xmlns.com/wordnet/1.6/Planet
    http://xmlns.com/wordnet/1.6/Person

for examples of natural kind terms projected into RDF classes (slightly
broken RDF graphs, but you get the idea).

Non-natural-kind terms are more like queries, couched in terms of more
stable and intuitive categories. e.g. ...
[[
A lovely quote by Borges is given to demonstrate the sort of system
that doesn't occur in the classification of animals: `a) those that
belong to the emperor, b) embalmed ones, c) those that are trained.....'
]]

quoting Rosch in
http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Foundations.Cognition/0058.html

Um... dunno if that helped.

> In summary:
> * objectivity about all RDF statements on the entire Web is going to be
> impossible - subjectivity (relative to a 'frame of reference') is the only
> recourse, therefore conclude stuff based on your viewpoint.

But... this notion of a viewpoint conflates 'all the RDF statements I
have to hand and that I believe' with a subjective frame of reference
that is independent of the current set of statements one knows about. We
all agree that we won't believe all possible RDF statements at once, nor
have them all loaded into a local database at once. But we want to reason
about some local collection of data while bearing in mind that there are
more facts out there that we'd believe given half the chance.

> * rdfs:domain is _very_ useful as it currently stands for asserting model
> validity (albeit based on your current frame of reference).

It would be really useful if you could walk us through your current
processing model w.r.t. rdfs:domain as currently defined.

cheers,

Dan
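
A minimal sketch of the range-as-query-hint scenario described above, in
Python. It is only an illustration under stated assumptions: the function
names (range_hints, cheapest_variable), the tuple layout for schema
statements and query patterns, and the pairing of counts with classes are
all made up for the example and are not part of RDFS or of any existing
tool. It reads every rdfs:range statement as implying a type for the
object variable (the conjunctive reading), then picks the variable whose
hinted class has the fewest known members.

```python
from collections import defaultdict

RDFS_RANGE = "rdfs:range"


def range_hints(schema, query):
    """Map query variables to the classes implied by rdfs:range.

    schema: iterable of (property, predicate, class) statements
    query:  iterable of (property, subject_var, object_var) patterns
    Both layouts are assumptions for this sketch.
    """
    ranges = defaultdict(set)
    for prop, pred, cls in schema:
        if pred == RDFS_RANGE:
            ranges[prop].add(cls)

    hints = defaultdict(set)
    for prop, _subj, obj in query:
        # Conjunctive reading: every range statement contributes a type.
        hints[obj] |= ranges.get(prop, set())
    return {var: classes for var, classes in hints.items() if classes}


def cheapest_variable(hints, type_counts):
    """Return the hinted variable with the smallest known class, or None.

    These are hints only: a matching resource may exist with no recorded
    type, so a planner must not treat the counts as a hard filter.
    """
    estimates = {}
    for var, classes in hints.items():
        known = [type_counts[c] for c in classes if c in type_counts]
        if known:
            estimates[var] = min(known)
    return min(estimates, key=estimates.get) if estimates else None


# Schema statement and query taken from the scenario above.
schema = [("s2:techAuthor", RDFS_RANGE, "s2:Engineer")]
query = [
    ("dc:title", "X", "Y"),
    ("s2:techAuthor", "X", "Y"),
    ("s2:mbox", "Y", "Z"),
]
# Hypothetical statistics; the class/count pairing is invented.
type_counts = {"s2:Person": 30000, "s2:Engineer": 30, "s2:Document": 3}

hints = range_hints(schema, query)
start = cheapest_variable(hints, type_counts)
```

Here only Y picks up a hint, so a planner following this sketch would
start from the resources believed to be of type s2:Engineer. Nothing
comparable is done for subject variables, in line with the remark above
that rdfs:domain as currently defined does not support this.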
Received on Thursday, 14 September 2000 06:27:57 UTC