- From: Alan Ruttenberg <alanruttenberg@gmail.com>
- Date: Tue, 10 Feb 2009 02:13:51 -0500
- To: Richard Newman <rnewman@twinql.com>
- Cc: Jiri Prochazka <ojirio@gmail.com>, semantic-web@w3.org
On Tue, Feb 10, 2009 at 12:32 AM, Richard Newman <rnewman@twinql.com> wrote:
>> I find your remark very odd, and more demonstrative of a lack of
>> experience than an accurate perception of the state or vision of the
>> Semantic Web. Certainly the lack of customers using OWL becomes a
>> self-fulfilling prophecy when such a point of view is held.
>
> I was merely stating my experience: in my (... 6? Blimey.) years in the SW
> community, I'd say the ratio of organizations I've worked with using RDF for
> storage versus simple reasoning versus OWL reasoning is approximately
> 10:5:1. Granted, my recent work has been on a system that doesn't currently
> offer OWL reasoning (partly because of a lack of demand from customers:
> RDFS++ has been adequate), but we do stay in touch with a wide variety of
> people, including the RACER folks.
>
> That's not to say that OWL *vocabulary* isn't used; after all, why bother
> making up your own sameAs property?

Good. Because what I tried to bring attention to was that OWL was bringing new expressivity (i.e. new vocabulary and patterns) to the annotation space, and your response was easily read as dismissive of this.

> I'm simply saying that folks trying to
> use OWL-DL (and up) reasoners on *real datasets and systems* (as opposed to
> things like my occasional playing around with Pellet) are significantly
> outnumbered by those dumping big datasets into RDF, and tooling for
> large-scale RDF systems is more widely available than tooling for OWL
> systems on the same scale.
>
> The implication of that is that a solution in OWL 2 is not a solution for
> the majority of people: their tooling doesn't support it, or reasoning won't
> scale to their datasets, or they have to interoperate with others who aren't
> using it.

OWL 2 isn't *currently* a solution to all problems, nor would I recommend it be.
However, your previous note implied that it wasn't worth taking OWL into account, or even considering using the vocabulary, and I felt this needed to be corrected.

On your experience with 10:5:1, this doesn't sound too off, but considering the youth of OWL I think it's significant, and I don't think it can be extrapolated as being static into the future. I think it's fair to say that Science Commons, where I work, is building for the future, and we've given some thought to our choice of OWL.

>> OWL is widely deployed in the area I work on the Semantic Web for
>> science, with our own Neurocommons being a 400M triple store expressed
>> in OWL and many other projects using OWL.
>
> Can I ask what level of reasoning you apply to Neurocommons?

It's a mixture and it is evolving. Some of the reasoning is at the smaller chunk level, as validation of the conversion to OWL - inconsistencies at that level are detected and fixed in the conversion script. In other cases inferences are computed by Pellet, saved to a file, and loaded into the store. In the store itself, which uses Virtuoso, we focus on propagating subclass and part_of relations (expressed as restrictions in OWL).

While we don't represent that full OWL reasoning is done at the whole store level yet, there are nonetheless benefits in using OWL. First, the expressivity is greater and we can say, within spec, more clearly what we mean (part_of at the class level is an example). Second, as I mention, portions can be reasoned over exactly and this is used to improve quality. Finally, we don't modify our knowledge representation to suit our technology. I've seen many cases where people optimize their RDF for query performance. I think this is a loss in the long run as technology changes over time. I'd rather interact with the OWL and store developers to run better on a representation that I don't expect to have to change radically at any point.
Because this representation has a stronger chance of being stable, we feel it is more likely that we will be able to convince more and more of the scientific community that an investment in this direction won't be squandered.

>> It would make no sense for any of these projects to use RDF or even RDFS.
>
> I wasn't saying anything of the sort.
>
> If you scroll back and read what I wrote, I said:
>
> * OWL 2 has annotations of assertions (yay!)

Pardon me, I missed the "yay" in your first note :)

> * I haven't heard of a single customer who is considering using OWL 2

OWL 2 is currently only in last call, and there are only early implementations. I wouldn't expect the demand to be high yet and we're only beginning to work on education and outreach. But if you do anything in e.g. the biomedical space I would expect there to be upcoming demand.

> * I don't know of any widespread deployments of OWL (the implication being
> "OWL reasoning", not "OWL vocabulary", which I would hope is obvious).
>
> All of those things are true, and I'm not impugning OWL.

Again, glad to hear that, though I have to say that's not how I read your initial message.

> I would very much like to know about high-scale, high-traffic services being
> backed by OWL reasoning; knowledge of the industry is very interesting to
> me. Terascale reasoning would make some of my areas of interest much more
> straightforward!

The OWL 2 specification, which I encourage you to read and comment on, includes a number of profiles that are designed for scaling in different directions. You might want to review the OWL 2 QL and OWL 2 RL profiles. QL, in particular, supports implementation on top of relational databases by translation to SQL. Clark and Parsia have an open implementation called OWLgres. RL is being implemented by Oracle, and I expect it will be applied to rather large data sets. SHER, from IBM, applies a different strategy for large ABoxes. And the technology is relatively young.
Even in the last year there has been significant progress in reasoning algorithms and I expect such work to continue. So while "terascale reasoning" may not be here yet, I'd not rule it out. And there is certainly more to the Semantic Web than this sort of application. There are plenty of applications that have a greater need for correctness and consistency checking than such scale - think medicine (not billing for it), engineering, and law.

>> For one thing, the Semantic Web languages are aimed to be a set that work
>> together and
>> build on each other. OWL will offer the first specified way of doing
>> expressive annotations and it would make no sense to do other than use
>> the facilities it offers, as owl:sameAs and owl:inverseFunctional are
>> used now.
>
> I will certainly investigate it. The reason I said this was something of a
> chicken/egg situation is that I can't see customers porting their *data* to
> OWL 2 without having tools to push it around. A language is useless without
> speakers.

As a provider I would think that part of your role is to be aware of trends in need, and be a provider of solutions. I don't expect the scientists I work with to be experts in data curation or knowledge representation and I regularly push back when they offer solutions they think will work, but which have undesirable properties. I consider it my role to help solve their problems, not to hope that they figure it out for me. I would be surprised if you didn't have customers with needs for annotations, and I expect that some use of OWL 2 (most likely vocabulary in this case) would be well advised.

I'm glad that you will have a closer look at what we're working on in OWL and hope to hear back from you with constructive comments, which I'd appreciate if you could send to public-owl-comments@w3.org.

Regards,
Alan
Received on Tuesday, 10 February 2009 07:14:27 UTC