- From: Sandro Hawke <sandro@w3.org>
- Date: Tue, 24 Apr 2012 22:27:43 -0400
- To: Pat Hayes <phayes@ihmc.us>
- Cc: W3C RDF WG <public-rdf-wg@w3.org>
[This is a brainstorming discussion, maybe off-topic for the WG.]

On Mon, 2012-04-23 at 00:45 -0700, Pat Hayes wrote:
> First, regrets for next Wednesday, I will be driving through Texas.

I drove across the panhandle once. Surely, you can just lash the wheel, put a rock on the gas pedal, and stretch out in the back seat to talk to us..... (not that we'll be talking about this, from what I gather.)

> Second, I have written up essentially the same proposal in a slightly different terminology which might (?) be more palatable, anyway it is there for inspection at http://www.w3.org/2011/rdf-wg/wiki/AnotherSpin

I'm just going to respond, right now, to the middle of Why Bother, because I think everything flows from there.

As background, I subscribe to the philosophy that the meaning of an RDF graph depends on the meanings of the predicates it uses. So, to extend the semantics of RDF to include equality, you just *use* owl:sameAs. To extend the semantics of RDF to include subclass reasoning, you just *use* the predicate rdfs:subClassOf. This seems very simple and elegant, although I will grant it has not been well understood or well deployed, to date, and isn't exactly how the specs are written.

I don't see a need for rdf:inherits -- I think the semantics should, essentially, automatically 'inherit' each predicate. I think this should be implemented, in the general (low performance) case, by having clients download inference rules from the predicate URLs. [1] Reasoners can be specialized to use a tableau algorithm, for example, instead of running rules, if the result will be the same (or predictably and usefully different, I guess).

Trying to understand the Why Bother section...

I like the chemistry example. As I understand the science: there was Carbon, pretty well understood, and then suddenly a hundred years ago, it started to look like there was actually Carbon 12, Carbon 13, Carbon 14, and more. Unseen, unknown, until radioactivity was understood.
So, in RDF, we can imagine there's lots of data about chem:Carbon, and then suddenly, starting around 1912, we have iso:Carbon12, iso:Carbon13, etc. Statistically, 99% of chem:Carbon is iso:Carbon12. Only 0.0000000001% [2] of the chem:Carbon is iso:Carbon14, but still that little bit is enormously useful, e.g. for carbon dating.

Chemical formulas don't care about isotopes: propane is C3H8, whichever carbon isotopes are used in it. When we're just doing normal chemistry, we want to ignore isotopes; when we're doing carbon dating, or looking at precise atomic weights, we do not.

So, how do we do this in RDF? Can we do it without contexts?

Alice is an organic chemist, and all her software uses the chem: vocabulary. Her experimental results all talk about chem:Carbon. She doesn't care about isotopes.

Bob is a radiochemist. Pretty much all of his software uses the iso: vocabulary, because he works with specific isotopes. His data includes lots of references to iso:Carbon12, and other isotopes.

Things get interesting when Bob wants to use some of Alice's data, or Alice wants to use some of Bob's data.

First attempt: use an OWL reasoner on the data, and mix in the "facts" that:

    chem:Carbon owl:sameAs iso:Carbon12, iso:Carbon13, iso:Carbon14.

This will probably work for some things, and completely break for others. The chemical properties are the same, so that sounds okay. Nearly everything Alice finds to be true for chem:Carbon is almost certainly also true for each of the carbon isotopes. However, there are some properties (atomic weights, number of neutrons, rate of radioactive decay) which are different. If some of that is given in the iso ontology, the reasoner will quickly determine (if it looks) that the combined ontology is inconsistent.

We'd like something a little more like a subclass relation.
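
To make the failure of the owl:sameAs attempt concrete, here is a minimal sketch in plain Python (no RDF library). It is not how an OWL reasoner works internally; it only merges sameAs-linked terms and then looks for a property with more than one value. The ex:atomicWeight property, the assumption that it is functional (one value per subject), and the sample weights are all illustrative, not taken from any real chem: or iso: ontology.

```python
from collections import defaultdict

# Hypothetical data: property name and values are illustrative assumptions.
ALICE = [("chem:Carbon", "ex:atomicWeight", 12.011)]
BOB = [
    ("iso:Carbon12", "ex:atomicWeight", 12.0),
    ("iso:Carbon14", "ex:atomicWeight", 14.003),
]
# The mixed-in "facts" from the first attempt.
SAME_AS = [
    ("chem:Carbon", "iso:Carbon12"),
    ("chem:Carbon", "iso:Carbon13"),
    ("chem:Carbon", "iso:Carbon14"),
]
# Assumption: atomic weight is declared to have a single value per subject.
FUNCTIONAL = {"ex:atomicWeight"}


def merge_same_as(triples, same_as):
    """Rewrite every subject to one representative per sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as:
        parent[find(b)] = find(a)
    return [(find(s), p, o) for s, p, o in triples]


def conflicts(triples, functional):
    """Return (subject, property) pairs that ended up with several values."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    return {key: vals for key, vals in values.items() if len(vals) > 1}


merged = merge_same_as(ALICE + BOB, SAME_AS)
print(conflicts(merged, FUNCTIONAL))
```

Kept apart, the two datasets show no conflict; once the sameAs links collapse everything into one term, that single carbon has three atomic weights, which is the kind of clash a reasoner would report as an inconsistency.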
If these ontologies treated substances as classes containing, as instances, items composed of that substance, the mixin 'facts' could be:

    iso:Carbon12 rdfs:subClassOf chem:Carbon

That's much closer to true. It could even be true, if chem:Carbon were designed with an understanding of isotopes, but in our scenario, it wasn't. With the new understanding of carbon, the atomic weight is defined as that of carbon-12, but before they understood about isotopes, the atomic weight ended up being the combined atomic weights of the isotopes found in the carbon the chemists worked with. So our naive chem:Carbon definition gives it a single atomic weight, which is not the same as any of the isotopes. So that subclass rule isn't quite right either.

Maybe that was Alice's experiment, measuring the atomic weight of Carbon. She doesn't know about isotopes, so her answer won't be exactly the weight of any of the isotopes; it'll be the weight of whatever combination she happened to use.

We have a world here where the ontologies line up 99% and maybe with some work we can get another .9% or maybe .99%. But never 100%. There's carbon-12 and there's "carbon", which is a mixture of the various carbon isotopes as they happened to occur for that particular observer.

Can contexts help with this? Is it better to use the same term for these two concepts? I wouldn't think so.

I think the answer here is to use "shims", such as these "facts" I've used above. I think people should publish various shims, for various purposes. The shims are web documents which say how to map from data in one vocabulary to data with a similar meaning in another vocabulary. I'm not sure how much they should be OWL vs RIF vs Javascript (using some convention not-yet-determined). I think the shims will have to be labeled in ways that help people, and sometimes machines, figure out which ones are best for their purposes and understand in what ways they are wrong/broken.
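
As a rough illustration of what one such shim might look like, here is a sketch in plain Python of a one-way mapping that lets Alice read Bob's iso: data as chem: data. The structure of the shim (a term map, a list of properties to drop, and a human-readable list of known problems) and every ex: name in it are assumptions made up for this example; no such convention exists yet.

```python
# Hypothetical shim: all field names and ex: terms are illustrative only.
SHIM = {
    "description": "View iso: isotope data as chem: element data",
    # One-way term mapping, in the spirit of the subclass reading.
    "term_map": {
        "iso:Carbon12": "chem:Carbon",
        "iso:Carbon13": "chem:Carbon",
        "iso:Carbon14": "chem:Carbon",
    },
    # Properties that differ between isotopes, so they are not carried over.
    "drop_properties": {"ex:atomicWeight", "ex:neutronCount", "ex:halfLife"},
    # Labels telling users in what ways the shim is wrong or lossy.
    "known_problems": ["isotope-specific properties are discarded"],
}


def apply_shim(triples, shim):
    """Translate triples through the shim, skipping unsafe properties."""
    out = []
    for s, p, o in triples:
        if p in shim["drop_properties"]:
            continue
        mapped = (shim["term_map"].get(s, s), p, shim["term_map"].get(o, o))
        if mapped not in out:  # several isotopes may collapse to one triple
            out.append(mapped)
    return out


BOB = [
    ("iso:Carbon12", "ex:formsCompound", "ex:Propane"),
    ("iso:Carbon14", "ex:formsCompound", "ex:Propane"),
    ("iso:Carbon14", "ex:halfLife", 5730),
]
print(apply_shim(BOB, SHIM))
```

The mapping only goes from the finer vocabulary to the coarser one. Going the other way, so that Bob can use Alice's data, is harder, because a statement about chem:Carbon does not say which isotope mixture it was observed on.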
Maybe Bob writes one that allows him to use Alice's data, then publishes it for others to use under similar circumstances.

I can see how to do that as long as we use different IRIs for the different notions of carbon. If we used the same IRI but labeled the graphs in some way, I think it would get harder.

(Change-over-time is a different use case, which I'd like to talk about separately.)

-- Sandro

[1] Pat, I think you were in the room when I gave this talk. I don't remember your reaction. http://www.w3.org/2009/Talks/1026-semrus/#%2831%29

[2] I love wikipedia.
Received on Wednesday, 25 April 2012 02:27:53 UTC