Re: Skippr "The RDF Navigation Server"; Call for Contributors, Subject experts and Comments

On 12/01/2008, alan gebert <alan.gebert@gmail.com> wrote:
>
> Skippr: "The RDF Navigation Server"
> http://code.google.com/p/skippr/
>
> ----
> DISCLAIMER: This is a "projected" project ( to be born in a couple of days ).
> The idea behind this initiative is to "fill a gap" that becomes
> evident as soon as you start pushing RDF into a user agent.
> The reason for publishing this early brainstorming is to get feedback
> and identify possible contributors for the key areas.
> Please, read along.
> ----
>
> Just like a skipper commands a vessel navigating across the seas,
> Skippr commands a user agent's navigation through RDF datasets and the
> World. It sits between the user agent and the Giant Global Graph,
> providing cross cutting services that are necessary for a
> comprehensive browsing experience.
>
> By user agent I mean not only HTML browsers, but also mobile and even
> multimodal agents ( e.g. voice ).
>
> As a "Navigation Server", Skippr will provide the following basic services:
>
> Sessions
> Delegated Authentication
> Pluggable Trust Framework
> User Interface Engine ( Fresnel )
> Schema Inference and IFP Smushing
> Guided Navigation Engine ( Facets )
> Linked Data ( On Demand Scuttering )
> Free Text Search Federation ( Sindice )
> Provenance
> SPARQL Endpoint Management and Federation
>
> A Skippr server can be deployed in an open web scenario, a closed
> intranet scenario, or even locally to be consumed by any platform that
> presents RDF data to human users. In the future, for example, a
> browser like Mozilla Firefox could be equipped with an integrated
> Skippr engine and provide a XUL UI on top of it.
> A cellphone with a small Java ME application showing nearby
> restaurants from the GGG could go through a skippr server running
> somewhere on the cloud ( possibly a trusted provider configured by the
> user ). Skippr will take care of free text searching, smushing and
> facet extraction to provide the user with a smooth experience,
> regardless of the device's limitations.

A XUL application that performed RDF navigation and querying in a
simple manner would be a great step towards showing people that these
annotations on their pages will actually be useful!

> The important thing here is not the deployment model but the
> consolidation of a package that can be used to quickly RDF-enable
> different user agents in a consistent manner.
>
> While the amount of functionality to implement may seem prohibitive at
> first glance ( yikes! ) these are all problems that *currently* stand
> in the way between RDF and user agents. If we solve them in a
> collaborative manner we would bring the following immediate benefits
> to the community:
>
> - Alignment between different working groups
> - Increase uptake by lowering entry barrier and providing a higher
> level framework that "makes sense" and "works"
> - Standardization of vocabularies for basic concepts like agent,
> session, facet, scutter, free text index, etc.
> - Standardization or at least general consensus about possible
> solutions to conflictive use cases, like the use of inference.
> - Somewhere to plug ACLs and Trust when the time comes.
> - Avoid an explosion of different Browser behaviours, UI engines,
> navigations engines, etc.
>
> Bottom line, Skippr is an umbrella project that seeks to integrate
> existing efforts and avoid replication across different RDF user agent
> teams ( which I presume are growing exponentially ). I hereby make an
> open invitation to all members of the community that are currently
> undertaking such projects to join the discussions on how the various
> topics should be addressed.
>
> BTW, I know what some of you algebrains are thinking:
>
> "SPARQL Endpoint Management and Federation"??!!!  is this guy nuts?.
>
> Well, not really. The idea here is not to create a full blown
> distributed sparql engine, but rather a small 80/20 solution that
> allows small queries to be distributed over a set of endpoints and a
> linked data subset of the GGG. Remember that this is a "navigation
> server" aimed at serving "directed navigational user agents". This
> means that queries can be restricted in size and complexity. The
> capacity of a session could be restricted as well.

It would be necessary to be able to save sessions, of course, but that
is possibly looking too far ahead.

> Therefore, *scale* is an important simplifying factor to keep in mind
> when designing skippr:
>
> "Skippr is intended for serving HUMAN navigation only. It cares only
> about a small portion of the GGG at a time".
>
> I created a project page at Google code. No code uploaded yet, as I
> would like suggestions on package naming because I feel this project
> is sufficiently important to be community driven right from the
> beginning.
> ( Is there any community domain that I could use? org.semanticweb.skippr? )
>
> The first steps will be:
> - Define a simple data model ( one sparql endpoint for now )
> - Integrate a Fresnel Engine that operates on the data and publishes
> services as RESTful RDF
> - Work on a generic faceting engine that operates atop a SPARQL
> endpoint and provides "guided navigation" services.
>
> I have no particular predilection for the aforementioned topics, but I
> happen to need them rather soon. Linked Data would be the next on my
> list. But you can start working on the other topics if you wish.
>
> You will find notes for both these services on the wiki ( @gcode ).
> Other topics have been seeded as well.
>
> Seed code will be up sometime NEXT WEEK. Don't expect much, I am not a
> full time coder... I like to have a life every now and then ;)
> I will provide a skeleton with the basics and hope to integrate a
> fresnel engine and lay out the code to begin working on facets.
>
> Technicalities:
> - Java 6
> - Sesame 2 Final
> - Restlet
> - RESTful RDF ( have you heard of RDF?  funny looking stuff... it
> stands for Really Difficult Format )
>
>
>
>
> I copy the texts below as they stand today in the wiki
>
>
> Sessions
>
> While some user agents may have unlimited memory to store data as they
> browse, others ( like cell phones and PDAs ) may only be able to store
> a few MBs of data. A browsing session on any subset of the GGG will
> most probably require significant memory as schema and instance data
> are downloaded and accumulated.
> To solve this issue, Skippr should provide the agent with a transient
> "Session" data space that keeps and smushes data during the span of a
> browsing session.

What do you mean by smushes? I don't anticipate that the user would
actually be looking at the data set so much as at a representation of
it, with the possibility of getting to deeper information when needed.
A major problem with generic table-based RDF browsers is that they
provide all the information up front.
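
If smushing here means merging nodes that share a value for an inverse
functional property (IFP), as the "IFP Smushing" item in the list
above suggests, then a minimal sketch might look like the following.
Python is used only for brevity. The tuple representation of triples,
the IFPS set and the smush function are all hypothetical and are not
Skippr code.

```python
# Hypothetical sketch: merge subjects that share a value for an
# inverse functional property (IFP). All names are illustrative only.

IFPS = {"foaf:mbox"}  # assumed set of inverse functional properties


def smush(triples, ifps=IFPS):
    """Rewrite triples so that nodes sharing an IFP value become one node."""
    parent = {}  # union-find forest over node names

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller name as the canonical node.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    first_seen = {}  # (property, value) -> first subject seen with that pair
    for s, p, o in triples:
        if p in ifps:
            if (p, o) in first_seen:
                union(first_seen[(p, o)], s)
            else:
                first_seen[(p, o)] = s

    def canonical(node):
        # Only nodes that took part in a merge are rewritten.
        return find(node) if node in parent else node

    return {(canonical(s), p, canonical(o)) for s, p, o in triples}


triples = [
    ("_:a", "foaf:mbox", "mailto:x@example.org"),
    ("_:a", "foaf:name", "Alan"),
    ("_:b", "foaf:mbox", "mailto:x@example.org"),
    ("_:b", "foaf:knows", "_:c"),
]
merged = smush(triples)  # "_:b" is folded into "_:a"
```

With this reading, the session would hold one node per real-world
thing even when several sources describe it under different names.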

> Delegated Authentication
>
> Some SPARQL endpoints will most probably be secured. It makes sense to
> consider adding an authentication ( SSO style ) feature to Skippr. It
> makes even more sense to take a look at something like OpenID and see
> if there is any overlapping. I haven't given this much thought as for
> now I am either accessing public endpoints or private endpoints behind
> a firewall.

http://esoeproject.org/confluence/display/eu/Home

The ESOE project has been in production use at my university for a
few months. It currently hooks into OpenID and Shibboleth, as well as
a few others.

>
> Pluggable Trust Framework
>
> While trust on a web-wide level is far from being solved, there will
> most probably be more than one approach to solve the problem ( trust
> services, corporate whitelists, etc ). Skippr will therefore only
> define the contract, and provide a hook for, a trust framework.
> A generic whitelist/blacklist policy framework should be provided.
> Of course it will be RDF based... just like everything else in Skippr.

Is this Trust at the query level?

> User Interface Engine ( Fresnel )
>
> The available fresnel engines ( simile's and jfresnel ) provide only
> Java APIs and require developers to "glue" them into their browsers.
> Skippr will expose generic RESTful services that provide clients with
> different formats ( XML and RDF ) of the different outputs generated
> by the Fresnel pipeline running on top of their live session data.
>
> A user agent should be able to ask the query
> "How am I supposed to present this resource?"
> And get an RDF or XML response from the server without much thought.

What do you mean by "present this resource?"

> Schema Inference and IFP Smushing
>
> Skippr should provide different built-in configurations for RDFS and
> OWL inference.  While reasoning is definitely a complex topic, some
> minor inferences are not only necessary for smushing ( which is
> critical when handling linked data ) but also for Facet generation and
> correct UI selection ( especially class-subclass and
> property-subproperty closures ).
>
> Perhaps a backward-chainer would be enough for computing these
> closures when evaluating FSL expressions or computing hierarchical
> facets.

Are you proposing to have templates for different UIs based on
inference results?
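
To make the question concrete, one reading is that the server walks
the class-subclass closure on demand and picks the template attached
to the nearest superclass. A minimal sketch under that assumption
follows. The sub_class_of map, the templates map and the function name
are hypothetical, and this is not how Fresnel itself selects lenses.

```python
from collections import deque


def pick_template(rdf_type, templates, sub_class_of):
    """Return the template of the nearest class at or above rdf_type.

    sub_class_of maps a class to its direct superclasses; the closure is
    explored lazily (breadth first) instead of being materialised.
    """
    queue = deque([rdf_type])
    visited = {rdf_type}
    while queue:
        current = queue.popleft()
        if current in templates:
            return templates[current]
        for parent in sub_class_of.get(current, ()):
            if parent not in visited:  # guards against subclass cycles
                visited.add(parent)
                queue.append(parent)
    return None  # no template for this type or any superclass


sub_class_of = {
    "ex:Restaurant": ["ex:Business"],
    "ex:Business": ["ex:Place"],
}
templates = {"ex:Place": "place-lens"}
chosen = pick_template("ex:Restaurant", templates, sub_class_of)
```

Here a resource typed ex:Restaurant falls back to the template defined
for ex:Place because no more specific one exists.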

> Guided Navigation Engine ( Facets )
>
> Faceted browsing has proved extremely effective for dealing with mixed
> and unknown RDF data. Unfortunately it is only available as part of
> specific browsers operating on closed RDF datasets.
> The goal of the Skippr/Faceter subproject is to provide a generic
> Faceter engine operating over the user agent's current session. This
> should include harvested Linked Data as well as a set of SPARQL
> endpoints.
>
> For now we will limit the engine to a single SPARQL endpoint. The
> computation of Facets over multiple datasets is something that has to
> be studied. In fact, it is not only about "adding up" the facets
> computed on each dataset. It will definitely touch upon distributed
> querying.
>
> ( or not? )

I haven't found a user yet who wants to browse data using generic
filters, at least not with the interfaces I have shown them. It may be
a powerful generic browse mode but it is not at all intuitive.

Are you planning to have an integrated distributed SPARQL query engine
with its inevitable relevant configurations? Or have people found a
non-configuration based way to do distributed querying?
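
On the point that multi-dataset facets are not just a matter of
"adding up" per-dataset counts, a small sketch may help. It assumes
facets are simply counts of subjects per property value, and that
triples are plain tuples. compute_facets and narrow are hypothetical
names, not part of any existing faceting engine.

```python
from collections import Counter, defaultdict


def compute_facets(triples):
    """Map each property to a Counter of value -> number of subjects."""
    facets = defaultdict(Counter)
    for s, p, o in set(triples):  # set() drops statements repeated across sources
        facets[p][o] += 1
    return facets


def narrow(triples, prop, value):
    """Keep only the triples whose subject has the selected facet value."""
    selected = {s for s, p, o in triples if p == prop and o == value}
    return [t for t in triples if t[0] in selected]


dataset_a = [
    ("ex:r1", "ex:cuisine", "thai"),
    ("ex:r2", "ex:cuisine", "thai"),
]
dataset_b = [
    ("ex:r1", "ex:cuisine", "thai"),  # same statement as in dataset_a
    ("ex:r3", "ex:cuisine", "pizza"),
]

# Summing per-dataset facets counts ex:r1 twice.
summed = compute_facets(dataset_a)["ex:cuisine"] + compute_facets(dataset_b)["ex:cuisine"]

# Computing facets over the merged data counts it once.
merged = compute_facets(dataset_a + dataset_b)["ex:cuisine"]
```

The summed count for "thai" is 3 while the merged count is 2, which is
why the data has to be brought together (and smushed) before counting.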

> Linked Data ( On Demand )
>
> The "Linking Open Data" initiative protocols are being widely deployed
> and they hold great promise for the immediate uptake of the Semantic
> Web. Skippr should provide a built in and configurable scutter
> capable of automatically harvesting RDF data as directed navigation
> happens.
> Scuttering policies should be configurable BUT we should emphasize
> some generic convention.
> Again, delegated auth seems necessary here.

What does scutter mean? My guess is that it means retrieving documents
related to the current view in order to display rdfs:label or some
other property.
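
If that guess is right, on-demand scuttering could be sketched roughly
as below. This is only an illustration: fetch stands in for whatever
dereferences a URI into triples, the in-memory "web" replaces real
HTTP, and the set of link properties to follow and the depth limit are
assumed policy knobs rather than anything defined by Skippr.

```python
from collections import deque


def scutter(start_uri, fetch, max_depth=1,
            follow=("rdfs:seeAlso", "owl:sameAs")):
    """Harvest triples from start_uri and from documents it links to.

    fetch(uri) must return an iterable of (s, p, o) tuples. Links are
    followed breadth first, up to max_depth hops from the start.
    """
    harvested = set()
    visited = set()
    frontier = deque([(start_uri, 0)])
    while frontier:
        uri, depth = frontier.popleft()
        if uri in visited:
            continue
        visited.add(uri)
        for s, p, o in fetch(uri):
            harvested.add((s, p, o))
            if p in follow and depth < max_depth:
                frontier.append((o, depth + 1))
    return harvested


# A tiny fake web, so the sketch runs without any network access.
web = {
    "ex:a": [("ex:a", "rdfs:label", "A"), ("ex:a", "rdfs:seeAlso", "ex:b")],
    "ex:b": [("ex:b", "rdfs:label", "B"), ("ex:b", "rdfs:seeAlso", "ex:c")],
    "ex:c": [("ex:c", "rdfs:label", "C")],
}
data = scutter("ex:a", lambda uri: web.get(uri, []), max_depth=1)
```

With max_depth=1 the label of ex:b is harvested but ex:c is never
fetched, which is the kind of convention a scuttering policy would
have to pin down.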

> Search Federation ( Sindice et al. )
>
> RDF Indexes should provide a generic, discoverable interface in a
> standardized vocabulary.  If this happens, Skippr will understand them
> and provide integrated free text search to clients.

What do you mean by Index? If it is what I am thinking about then how
is it different from a simple SPARQL query?
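
For comparison, my reading of "index" is a map from words in literals
to the resources that carry them, along these lines. The sketch
assumes every object is a plain string literal; build_index and search
are hypothetical names and say nothing about how Sindice works.

```python
import re
from collections import defaultdict


def build_index(triples):
    """Map each lowercased word in a literal to the subjects that use it."""
    index = defaultdict(set)
    for s, p, o in triples:
        for token in re.findall(r"\w+", o.lower()):
            index[token].add(s)
    return index


def search(index, text):
    """Return the subjects whose literals contain every word in text."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return set()
    return set.intersection(*(index.get(token, set()) for token in tokens))


labels = [
    ("ex:r1", "rdfs:label", "Thai Palace Restaurant"),
    ("ex:r2", "rdfs:label", "Pizza Palace"),
]
index = build_index(labels)
hits = search(index, "thai palace")
```

A lookup like this is a dictionary access per word, whereas the
equivalent SPARQL query with a regex FILTER would typically have to
scan the literals, so maybe that is the difference being aimed at.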

> Provenance
>
> Skippr should manage and provide provenance data for client
> introspection.  Ideally, it should hide "quads" and only present
> triples to clients while still allowing them to perform queries like
> "Who states that foo:bar costs US$5??" as well as retract complete
> sources.
>
> The mythical "Oh Yeah!" button should be somewhere down this road.

"Oh Yeah!"?

I don't think it is completely necessary to hide quads, but they
probably shouldn't be shown by default.
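
Something like the following is what I have in mind: quads are kept
internally, triples are what clients see by default, and the source
is available on request. It is a rough sketch only. ProvenanceStore
and its methods are hypothetical, and the source URIs are made up.

```python
class ProvenanceStore:
    """Keeps (s, p, o, source) quads but presents plain triples by default."""

    def __init__(self):
        self._quads = set()

    def add(self, s, p, o, source):
        self._quads.add((s, p, o, source))

    def triples(self):
        """Default view: the source component is hidden."""
        return {(s, p, o) for s, p, o, _ in self._quads}

    def who_states(self, s, p, o):
        """Answer "who states that s p o?" with the set of sources."""
        return {src for qs, qp, qo, src in self._quads
                if (qs, qp, qo) == (s, p, o)}

    def retract_source(self, source):
        """Drop every statement that came from the given source."""
        self._quads = {q for q in self._quads if q[3] != source}


store = ProvenanceStore()
store.add("foo:bar", "ex:price", "US$5", "http://example.org/shop-a")
store.add("foo:bar", "ex:price", "US$5", "http://example.org/shop-b")
sources = store.who_states("foo:bar", "ex:price", "US$5")
```

The triple shows up once in the default view even though two sources
state it, and it only disappears after both sources are retracted.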

> SPARQL Endpoint Management and Federation
>
> Duh, I just implemented a federated sparql engine on a lightweight
> platform and... it was not fun!
> Somebody else please.

By this you mean the user credentials should be used by the SPARQL
engine when it is retrieving resources?

Sounds like a good idea to me.

Peter Ansell

Received on Sunday, 13 January 2008 21:33:33 UTC