- From: alan gebert <alan.gebert@gmail.com>
- Date: Sat, 12 Jan 2008 06:51:49 -0300
- To: semantic-web@w3.org
Skippr: "The RDF Navigation Server"
http://code.google.com/p/skippr/

----
DISCLAIMER: This is a "projected" project ( to be born in a couple of days ). The idea behind this initiative is to "fill a gap" that becomes evident as soon as you start pushing RDF into a user agent. The reason for publishing this early brainstorming is to get feedback and identify possible contributors for the key areas. Please read along.
----

Just like a skipper commands a vessel navigating across the seas, Skippr commands a user agent's navigation through RDF datasets and the World. It sits between the user agent and the Giant Global Graph ( GGG ), providing the cross-cutting services that are necessary for a comprehensive browsing experience. By user agent I mean not only HTML browsers, but also mobile and even multimodal agents ( e.g. voice ).

As a "Navigation Server", Skippr will provide the following basic services:

- Sessions
- Delegated Authentication
- Pluggable Trust Framework
- User Interface Engine ( Fresnel )
- Schema Inference and IFP Smushing
- Guided Navigation Engine ( Facets )
- Linked Data ( On Demand Scuttering )
- Free Text Search Federation ( Sindice )
- Provenance
- SPARQL Endpoint Management and Federation

A Skippr server can be deployed in an open web scenario, in a closed intranet scenario, or even locally, to be consumed by any platform that presents RDF data to human users. In the future, for example, a browser like Mozilla Firefox could be equipped with an integrated Skippr engine and provide a XUL UI on top of it. A cellphone with a small Java ME application showing nearby restaurants from the GGG could go through a Skippr server running somewhere on the cloud ( possibly a trusted provider configured by the user ). Skippr will take care of free text searching, smushing and facet extraction to provide the user with a smooth experience, regardless of the device's limitations.
The important thing here is not the deployment model but the consolidation of a package that can be used to quickly RDF-enable different user agents in a consistent manner.

While the amount of functionality to implement may seem prohibitive at first glance ( yikes! ), these are all problems that *currently* stand between RDF and user agents. If we solve them collaboratively, we would bring the following immediate benefits to the community:

- Alignment between different working groups.
- Increased uptake, by lowering the entry barrier and providing a higher-level framework that "makes sense" and "works".
- Standardization of vocabularies for basic concepts like agent, session, facet, scutter, free text index, etc.
- Standardization of, or at least general consensus about, possible solutions to contentious use cases, like the use of inference.
- Somewhere to plug ACLs and Trust when the time comes.
- Avoiding an explosion of different browser behaviours, UI engines, navigation engines, etc.

Bottom line: Skippr is an umbrella project that seeks to integrate existing efforts and avoid replication across different RDF user agent teams ( whose number I presume is growing exponentially ). I hereby make an open invitation to all members of the community who are currently undertaking such projects to join the discussions on how the various topics should be addressed.

BTW, I know what some of you algebrains are thinking: "SPARQL Endpoint Management and Federation"??!!! Is this guy nuts? Well, not really. The idea here is not to create a full-blown distributed SPARQL engine, but rather a small 80/20 solution that allows small queries to be distributed over a set of endpoints and a linked data subset of the GGG. Remember that this is a "navigation server" aimed at serving "directed navigational user agents". This means that queries can be restricted in size and complexity. The capacity of a session could be restricted as well.
Therefore, *scale* is an important simplifying factor to keep in mind when designing Skippr: "Skippr is intended for serving HUMAN navigation only. It cares only about a small portion of the GGG at a time".

I created a project page at Google Code. No code has been uploaded yet, as I would like suggestions on package naming: I feel this project is sufficiently important to be community driven right from the beginning. ( Is there any community domain that I could use? org.semanticweb.skippr? )

The first steps will be:

- Define a simple data model ( one SPARQL endpoint for now ).
- Integrate a Fresnel engine that operates on the data and publishes services as RESTful RDF.
- Work on a generic faceting engine that operates atop a SPARQL endpoint and provides "guided navigation" services.

I have no particular predilection for the aforementioned topics, but I happen to need them rather soon. Linked Data would be next on my list. But you can start working on the other topics if you wish. You will find notes for both these services on the wiki ( @gcode ). Other topics have been seeded as well.

Seed code will be up sometime NEXT WEEK. Don't expect much, I am not a full-time coder... I like to have a life every now and then ;) I will provide a skeleton with the basics, and I hope to integrate a Fresnel engine and lay out the code to begin working on facets.

Technicalities:

- Java 6
- Sesame 2 Final
- Restlet
- RESTful RDF ( have you heard of RDF? funny looking stuff... it stands for Really Difficult Format )

I copy the texts below as they stand today in the wiki.

Sessions

While some user agents may have unlimited memory to store data as they browse, others ( like cell phones and PDAs ) may only be able to store a few MBs of data. A browsing session on any subset of the GGG will most probably require significant memory as schema and instance data are downloaded and accumulated.
To solve this issue, Skippr should provide the agent with a transient "Session" data space that keeps and smushes data for the span of a browsing session.

Delegated Authentication

Some SPARQL endpoints will most probably be secured. It makes sense to consider adding an authentication ( SSO style ) feature to Skippr. It makes even more sense to take a look at something like OpenID and see if there is any overlap. I haven't given this much thought because, for now, I am accessing either public endpoints or private endpoints behind a firewall.

Pluggable Trust Framework

While trust on a web-wide level is far from being solved, there will most probably be more than one approach to solving the problem ( trust services, corporate whitelists, etc ). Skippr will therefore only define the contract for a trust framework and provide a hook for plugging one in. A generic whitelist/blacklist policy framework should be provided. Of course it will be RDF based... just like everything else in Skippr.

User Interface Engine ( Fresnel )

The available Fresnel engines ( SIMILE's and JFresnel ) provide only Java APIs and require developers to "glue" them into their browsers. Skippr will expose generic RESTful services that give clients the outputs of a Fresnel pipeline running on top of their live session data, in different formats ( XML and RDF ). A user agent should be able to ask "How am I supposed to present this resource?" and get an RDF or XML response from the server without much thought.

Schema Inference and IFP Smushing

Skippr should provide different built-in configurations for RDFS and OWL inference. While reasoning is definitely a complex topic, some minor inferences are necessary not only for smushing ( which is critical when handling linked data ) but also for facet generation and correct UI selection ( especially class-subclass and property-subproperty closures ).
Perhaps a backward-chainer would be enough for computing these closures when evaluating FSL expressions or computing hierarchical facets.

Guided Navigation Engine ( Facets )

Faceted browsing has proved extremely effective for dealing with mixed and unknown RDF data. Unfortunately, it is only available as part of specific browsers operating on closed RDF datasets. The goal of the Skippr/Faceter subproject is to provide a generic Faceter engine operating over the user agent's current session. This should include harvested Linked Data as well as a set of SPARQL endpoints. For now we will limit the engine to a single SPARQL endpoint.

The computation of facets over multiple datasets is something that has to be studied. It is not just a matter of "adding up" the facets computed on each dataset. It will definitely touch upon distributed querying. ( or not? )

Linked Data ( On Demand )

The "Linking Open Data" initiative's protocols are being widely deployed, and they hold great promise for the immediate uptake of the Semantic Web. Skippr should provide a built-in, configurable scutter capable of automatically harvesting RDF data as directed navigation happens. Scuttering policies should be configurable, BUT we should emphasize some generic convention. Again, delegated auth seems necessary here.

Search Federation ( Sindice et al. )

RDF indexes should provide a generic, discoverable interface in a standardized vocabulary. If this happens, Skippr will understand them and provide integrated free text search to clients.

Provenance

Skippr should manage and provide provenance data for client introspection. Ideally, it should hide "quads" and only present triples to clients, while still allowing them to perform queries like "Who states that foo:bar costs US$5??" as well as to retract complete sources. The mythical "Oh Yeah!" button should be somewhere down this road.
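
To make the provenance idea a bit more concrete, here is a minimal sketch in plain Java, not Skippr code and not the Sesame API. The class name ProvenanceStore, its methods, and the ex:price property and shop URLs in the usage note are all made up for illustration. It assumes statements are kept internally as quads ( subject, predicate, object, source ) while clients only ever see triples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory store: quads on the inside, triples on the outside. */
class ProvenanceStore {

    /** One statement plus the source that asserted it. */
    private static final class Quad {
        final String subject, predicate, object, source;

        Quad(String subject, String predicate, String object, String source) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.source = source;
        }

        boolean matches(String s, String p, String o) {
            return subject.equals(s) && predicate.equals(p) && object.equals(o);
        }
    }

    private final List<Quad> quads = new ArrayList<Quad>();

    /** Records a statement together with the source it was harvested from. */
    void add(String s, String p, String o, String source) {
        quads.add(new Quad(s, p, o, source));
    }

    /** Client view: distinct triples, with the source component hidden. */
    Set<List<String>> triples() {
        Set<List<String>> out = new LinkedHashSet<List<String>>();
        for (Quad q : quads) {
            out.add(Arrays.asList(q.subject, q.predicate, q.object));
        }
        return out;
    }

    /** Answers "who states that s p o?" with the set of asserting sources. */
    Set<String> whoStates(String s, String p, String o) {
        Set<String> sources = new LinkedHashSet<String>();
        for (Quad q : quads) {
            if (q.matches(s, p, o)) {
                sources.add(q.source);
            }
        }
        return sources;
    }

    /** Retracts everything a source asserted; returns how many quads were removed. */
    int retractSource(String source) {
        int removed = 0;
        for (Iterator<Quad> it = quads.iterator(); it.hasNext();) {
            if (it.next().source.equals(source)) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

With this sketch, if two sources both state that foo:bar has ex:price "US$5", triples() reports the statement once, whoStates(...) names both sources, and retracting one of them leaves the triple in place because the other source still asserts it. A real implementation would more likely sit on top of the store's named graphs / contexts than on a flat list.
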

SPARQL Endpoint Management and Federation

Duh, I just implemented a federated sparql engine on a lightweight platform and... it was not fun! Somebody else please.

Have a nice Weekend,
Al
Received on Saturday, 12 January 2008 09:51:57 UTC