- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Fri, 01 May 2009 00:27:54 -0400
- To: SPARQL Working Group <public-rdf-dawg@w3.org>
Hi everyone,

I've posted my proposal for which features the Working Group should work on to the wiki at:

http://www.w3.org/2009/sparql/wiki/User:Lee_Feigenbaum#Lee.27s_feature_proposal

I've copied it in at the end of this note; it contains the reasoning behind my suggestions.

Regarding Rec. track vs. WG Notes: I do _not_ think that we should distinguish between these when choosing what features we're going to work on at this point, and here's why. (I'm not a process expert; this is purely my understanding.)

1) A document can be developed quite far before the group needs to decide whether it should be Rec. track or not.

2) WG Notes are first published as First Public Working Drafts, at which point they carry the same IP requirements and exclusion opportunities as any document intended for Rec. track. In light of our need to re-charter with a specific list of deliverables, then, we should include in our decision any material we plan to work on, whether it ends up a Note or a Rec. track document.

3) I don't think we should view Notes as things to be churned out quickly and without much consideration or review. Rather, I think Notes are best used for material that may not be core to the language or protocol, that may represent a best or common practice (as in the JSON results format), or that is a common but difficult-to-implement extension such that the group feels that a Note would document interoperable semantics without requiring multiple implementations to move through the Rec. track.

Please note that I wrote some of this proposal before some of the more recent survey responses arrived. I've taken a look at those responses, which don't change the meat of my proposal (I made one small change to a prioritization), but note that the exact numbers in some of my reasoning (where I refer to the survey results) may no longer be fully accurate.

Lee

Lee's feature proposal

The following is a proposal for the features that the SPARQL WG should adopt.
It is an attempt to reach consensus by balancing previously stated goals, including:

* group preference
* group energy
* implementation experience
* utility to developers
* utility to end users
* extensibility
* conservatism

Constraints

On April 28 the group resolved to accept aggregates, subqueries, and update as deliverables.

Proposal

Required Features

* Aggregate functions
* Subqueries
* Update
* Project expressions
* Service description

Time-permitting Features (roughly in this order)

* SPARQL/OWL
* Property paths
* Function library
* Basic federated query
* Surface syntax

Commentary

This proposal has 5 mandatory features and 5 time/energy-permitting features. This is more than I think is desirable, but I have a hard time making the proposal narrower.

The required features include the three features identified early on as having the highest level of consensus. I've also included as required project expressions, the ability to include arbitrary expressions in a SELECT clause. The aggregate feature already requires the group to find a way to include values not explicitly mentioned in the RDF dataset in a query's results (i.e. the computed value of aggregate functions), and it seems confusing and unnecessarily limiting not to allow the same or a similar (syntactic) mechanism to introduce new scalar values into query result sets. In addition, project expressions in conjunction with the other required features enable the same capabilities as various other proposed features, including assignment and scalar expressions in CONSTRUCT. Project expressions received significant but not overwhelming WG support in our survey, with five organizations ranking it amongst their top four features, and no organizations explicitly objecting to it. Project expressions is widely implemented in existing SPARQL engines.

Finally, I suggest that service description be a required deliverable of the Working Group.
While there are various design pieces to draw on, service description would require the Working Group to do a fair bit of design work. However, I believe that this sort of leading-edge-of-the-curve design work is appropriate for the SPARQL WG in the case of a feature such as service description, which is an extensibility point and an enabler for future standardization efforts. Service description provides a standard way for extended SPARQL implementations to advertise their capabilities, and in doing so encourages similar implementations to coalesce around common syntax and semantics of extensions. It can be used to advertise entailment regimes, extended surface syntax, dataset information (including optimization hints for federation), supported functions, and much more. Service description received moderate WG support in the survey (5 organizations including it in their top 10), and no organizations explicitly objected to it. With Condorcet, service description is preferred to everything except the top 3 features and negation. (See below for more on negation.)

I've included five time-permitting features in this proposal, ranked roughly in the order in which I believe the group should pursue them. I acknowledge at the same time that some of these efforts can reasonably go on in parallel, either with other time-permitting features or with development of required features.

I believe that SPARQL/OWL is an important deliverable for this WG. The SPARQL community sees something of a divide between those using SPARQL purely to query RDF graphs and those using SPARQL in conjunction with richer semantics. The original SPARQL effort acknowledged this by providing a mechanism to define extensions that would define basic graph pattern matching for entailment regimes other than simple entailment.
This extension mechanism is key to enabling groups other than the SPARQL working group (whether formal or informal groups) to define how SPARQL queries behave in the presence of other semantic regimes. But the extension mechanism has never been formally tested, and it seems prudent to test it (a) under the auspices of the SPARQL WG, so that the results may feed back into the SPARQL BGP extension specification itself, and (b) in the context of OWL semantics, probably the most popular richer entailment regime that currently exists. There are numerous implementations that implement SPARQL/OWL already, though likely not in an interoperable fashion. And in the person of Bijan Parsia, the SPARQL WG has the expertise and energy necessary to properly specify the SPARQL/OWL basic graph pattern matching extension. SPARQL/OWL received minimal support in the survey, but seemed to have a somewhat warmer reception in the discussion on the April 28 teleconference.

I believe that property paths is an important deliverable for the WG, as it enables variable-length path queries for SPARQL developers. It has significant support within the WG, and it also covers most cases of the proposed "accessing RDF lists" feature.

I believe that surface syntax and function library represent reasonable maintenance tasks for the WG to examine, time-permitting. Accepting surface syntax as a time-permitting feature gives the WG an opportunity to examine capabilities of the SPARQL language that are particularly onerous to use and to consider specialized syntax for these features. Accepting function library allows the WG to consider extending the core set of functions that can be relied on when moving between SPARQL implementations to include things like basic string or mathematical operations.

Finally, I believe the WG should deliver a specification for basic federated query, time-permitting.
Federated query is implemented in a variety of forms in several implementations, and the feature received significant support in the survey (6 organizations including it amongst their top six choices). I believe that looking at a design for basic federated query is important for the growing Linked Data community, and the time is ripe to standardize on basic federated query as a way to encourage implementations to explore increasingly sophisticated approaches to federated query.

This proposal leaves out many good features, and I'd be remiss not to address several specific ones.

* Negation. The survey indicated strong support for providing a simpler form of asking negative queries than the current OPTIONAL/!bound construct. I've excluded this from my proposal in the hope that the design for subqueries may obviate the need for this feature.

* Full text. The survey indicated strong support for standardizing the syntax and semantics for full text search in SPARQL. While I believe that this is one of the top interoperability stumbling blocks for SPARQL, the wide-open design space (both for syntax and semantics) of the problem worries me.

* Parameterized inference. The survey indicated support from a small number of organizations for parameterized inference. The discussion during the April 28 teleconference made clear to me that some members of the WG see a need both to define what it means to query other entailment regimes (a la SPARQL/OWL) and to define how to go about doing that on a query-by-query basis. The latter is what parameterized inference is about. I have omitted parameterized inference from my proposal because of the lack of existing implementations/designs to draw on, coupled with the fact that service descriptions provide an out-of-band way for endpoints to indicate the entailment regimes or rulesets that they support.
I recognize that this does not fully address the use case of on-demand rulesets, but I believe that use case would be better served via a SPARQL protocol feature, and I do not see any mature designs yet in this space to draw upon. I believe that (1) standardizing on the semantics of SPARQL/OWL and (2) the increasing maturity and deployment of RIF will encourage SPARQL implementations to begin to explore this space more and make this an appropriate feature for a future round of standardization.
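
For readers less familiar with the OPTIONAL/!bound construct mentioned under Negation above, here is a minimal sketch of what it does. It is not taken from the proposal: the toy triples, the `people_without_mbox` helper, and the choice of `foaf:mbox` are all hypothetical, and the Python merely emulates the query's left-join-then-filter behaviour rather than running a SPARQL engine.

```python
# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "rdf:type", "foaf:Person"),
    ("bob", "rdf:type", "foaf:Person"),
    ("carol", "rdf:type", "foaf:Person"),
    ("alice", "foaf:mbox", "mailto:alice@example.org"),
    ("carol", "foaf:mbox", "mailto:carol@example.org"),
]

# The SPARQL 1.0 idiom being emulated (shown for reference; not executed here).
NEGATION_QUERY = """
SELECT ?person WHERE {
  ?person rdf:type foaf:Person .
  OPTIONAL { ?person foaf:mbox ?mbox }
  FILTER (!bound(?mbox))
}
"""


def match(triples, predicate, obj=None):
    """Return (subject, object) pairs for triples with the given predicate."""
    return [
        (s, o)
        for s, p, o in triples
        if p == predicate and (obj is None or o == obj)
    ]


def people_without_mbox(triples):
    """Emulate the query above: persons for whom ?mbox stays unbound."""
    # Required pattern: ?person rdf:type foaf:Person
    solutions = [{"person": s} for s, _ in match(triples, "rdf:type", "foaf:Person")]

    # OPTIONAL: a left join. A solution with no match is kept unextended.
    joined = []
    for sol in solutions:
        mboxes = [o for s, o in match(triples, "foaf:mbox") if s == sol["person"]]
        if mboxes:
            joined.extend({**sol, "mbox": m} for m in mboxes)
        else:
            joined.append(sol)  # ?mbox remains unbound

    # FILTER (!bound(?mbox)): keep only solutions where ?mbox is unbound.
    return sorted(sol["person"] for sol in joined if "mbox" not in sol)


result = people_without_mbox(TRIPLES)
print(result)
```

The negative question ("who has no mailbox?") has to be phrased indirectly, as an optional match followed by a test that the match failed, which is the awkwardness the survey respondents wanted a simpler form for.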
Received on Friday, 1 May 2009 04:28:36 UTC