- From: Joshua Allen <joshuaa@microsoft.com>
- Date: Wed, 22 May 2002 22:40:52 -0700
- To: "Dan Brickley" <danbri@w3.org>
- Cc: <www-talk@w3.org>, <www-rdf-interest@w3.org>, "Aaron Swartz" <me@aaronsw.com>
> others... there are tools out there. And publication isn't hard. HTTP
> servers are plentiful. There are a number of query languages implemented.
> Your specific scenarios aren't too far from the Annotea work, and similar
> efforts over on www-annotations@w3.org. What do you think stands in the
> way of that work going mom'n'dad mainstream?

The server-centric model is the thing that stops it going mainstream, I think. I have thought about the particular example of annotea a lot, since identical functionality *has* propagated to Mom'n'dad mainstream in the form of Microsoft's Sharepoint Team Services "discussion server". It is possible to annotate within any HTML page, Word Document, etc. and anyone who is pointed to the same "discussion server" as you can see your annotations, annotate the annotations, and so on. It is fairly old technology, shipping for about 2 years. There are many places that offer discussion server hosting for cheap ($8/month is one I just saw).

I have used annotea+Amaya as well, and the functionality is pretty similar. Sharepoint has much higher adoption, but probably this is because Mom'n'dad don't use Amaya, and they *do* use IE. The annotea plugin for IE is a pretty poor UI compared to the Amaya one, IMO, and more "experimental" than anything. So any "normal" user today wanting to annotate documents and share with others is faced with the prospect of switching browsers, trying to install an annotea server, and so on.

But like I said, I think it is the server-centric model that limits adoption of *both* STS and Annotea, I am just pointing out that Annotea is intimidating to the average user, and that probably explains why Sharepoint is outpacing Annotea within this limited range of adoption potential.

Now, when I say "server-centric", what I mean is that you need to be using the same discussion/annotea server as me if you want to see my annotations.
This would be like saying that you have to dial up to my network if you want to read my web pages. Every annotation server becomes an island of metadata. This model only scales so far.

And annotea begs the question, what do I gain by using RDF and URIs internally? True, it makes it easy to import/export data between servers, but no easier than doing the same with Sharepoint, and an adapter that converts between sharepoint (or annotea) and any intermediate format is not too hard to write (and most developers would just use XML to do this, not RDF). And in either case, interacting with the annotation server uses some custom protocol that needs to be implemented. So as far as the typical user or sysadmin is concerned, RDF is an implementation detail and doesn't make much real-world difference to the solution.

Just to be clear, I am not saying that RDF+URI is *bad*, just that a server-centric model where you have to build bridges between every server manually *anyway* pretty much neutralizes the value of RDF and URIs. The server-centric model is completely contrary to the semantic web, IMO.

Annotea raises another question. Adoption of annotea (just like adoption of sharepoint) fragments the world into isolated and disconnected silos of annotation information. We can *claim* that we are doing something good for openness, because "it is theoretically possible one day to make all of the annotea servers share information". But the astute observer will quickly ask, "but isn't that what NNTP has been doing for years, without any XML whatsoever?" So we are pushing annotation servers as a *hypothetical* solution to a problem that has already been solved.

So to sum it up:

1) RDF is not necessary (although admittedly it is desirable) for annotations, sharepoint proves that
2) Annotation *Servers* are not necessary or even desirable for annotations, NNTP proves that

Finally, I'll address a comment that I suspect will come up, which is "the reason semantic web is having trouble is because jerks like sharepoint refuse to use RDF." The only counter I have for that is that there were plenty of non-HTML hypertext systems when Mosaic was first created, and those non-cooperative "jerks" didn't have to cooperate for the WWW to happen. They competed with WWW as hard as they could, until they finally realized that they were dealing on an entirely different level and cooperation was the only sane thing to do. URIs did not require convincing anyone in the end. They were so self-evidently superior to closed-silo systems that they swept right over the holdouts.

By itself, I do not think RDF provides that, any more than HTML would have been able to alone create the WWW. RDF is simply a serialization format for graphs of assertions. To my mind, using RDF in an Annotea server is no different than using HTML in Hypercard. It works, but most people will prefer to use Hypercard's proprietary content language if they can only talk to other hypercards anyway. Saying that RDF allows me to interop with other annotea servers is like saying that HTML lets me cut-and-paste content from Hypercard to Outlook. True, but again it is not compelling enough to make it win out over other formats.

> Aggregation is imho the key problem. Most interesting, real world RDF data
> is full of blank (URI-less) nodes. Most RDF tools don't provide much by

Yes, and I maintain that is the *only* unique problem that the semantic web is solving, and the key to it beating out one-off proprietary solutions.

> way of tools to merge these nodes together, so aggregates of RDF data can
> be annoyingly fragmented.
> Fragmentation of data pulls against the network
> effect, by lessening the value of exposing and harvesting data into larger

I agree, that is why I am so hard-core about using URIs consistently :-) It's also the main thing that concerns me about annotea.

Unless we always take the approach that *publishing* metadata is completely independent of aggregating and querying, we have little incentive to make the publishing part be as universal and accessible as possible, and we have little incentive to enable proper aggregation. Saying that *theoretically* we can publish the RDF at an http: address, and *theoretically* we could aggregate that data with other data we yank from an annotea server is not enough -- that is simply HTML. In my opinion, we make the data fragmentation problem *worse*, not better, when we deploy systems in which the publishing and querying are tightly-coupled like that.

Now, if we took annotea, and made it be an NNTP-scraper/query engine, and modified the browser plugins to just dump into NNTP, we would be onto something :-)
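
To make the publish/aggregate split a bit more concrete, here is a minimal Python sketch of an aggregator that pulls annotation triples from independent sources and answers queries keyed only on the URI of the annotated page. Everything in it is an assumption made for illustration: the predicate names (`annotates`, `author`, `body`), the two sources, and the helpers `aggregate` and `annotations_for` are hypothetical and are not Annotea's or Sharepoint's actual data model or protocol. Blank nodes are renamed per source, which is roughly how they have to be kept apart when graphs are merged; only subjects are handled here to keep the sketch short.

```python
from collections import defaultdict

# Hypothetical data: two independent publishers, each exposing
# (subject, predicate, object) triples. Subjects starting with "_:" are
# blank nodes, meaningful only inside their own source.
SOURCE_A = [
    ("_:a1", "annotates", "http://example.org/page"),
    ("_:a1", "author", "mailto:alice@example.org"),
    ("_:a1", "body", "Nice overview."),
]
SOURCE_B = [
    ("_:a1", "annotates", "http://example.org/page"),
    ("_:a1", "author", "mailto:bob@example.org"),
    ("_:a1", "body", "Section 2 is out of date."),
]


def aggregate(sources):
    """Merge triples from several named sources into one list.

    Blank-node subjects are prefixed with the source name so that "_:a1"
    from one publisher is not confused with "_:a1" from another.
    """
    merged = []
    for name, triples in sources.items():
        for s, p, o in triples:
            if s.startswith("_:"):
                s = "_:" + name + "." + s[2:]
            merged.append((s, p, o))
    return merged


def annotations_for(triples, target):
    """Return the sorted bodies of all annotations about `target`,
    regardless of which source published them."""
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        by_subject[s][p] = o
    return sorted(
        props["body"]
        for props in by_subject.values()
        if props.get("annotates") == target and "body" in props
    )


merged = aggregate({"A": SOURCE_A, "B": SOURCE_B})
print(annotations_for(merged, "http://example.org/page"))
```

The point of the sketch is only that the query side never needs to know which server an annotation came from; the shared page URI is what joins the data.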
Received on Thursday, 23 May 2002 02:10:32 UTC