- From: Orri Erling <erling@xs4all.nl>
- Date: Fri, 1 May 2009 09:53:30 +0200
- To: "'Lee Feigenbaum'" <lee@thefigtrees.net>, "'SPARQL Working Group'" <public-rdf-dawg@w3.org>
All
This is fine by me. Regardless of the syntax ultimately adopted, this
reaches the basic goal of SQL parity.
Orri
-----Original Message-----
From: public-rdf-dawg-request@w3.org [mailto:public-rdf-dawg-request@w3.org]
On Behalf Of Lee Feigenbaum
Sent: Friday, May 01, 2009 6:28 AM
To: SPARQL Working Group
Subject: Lee's feature proposal
Hi everyone,
I've posted my proposal for which features the Working Group should
work on to the wiki at:
http://www.w3.org/2009/sparql/wiki/User:Lee_Feigenbaum#Lee.27s_feature_proposal
I've copied it in at the end of this note; it contains my reasoning
behind my suggestions.
Regarding Rec. track vs. WG Notes: I do _not_ think that we should
distinguish between these when choosing what features we're going to
work on at this point, and here's why. (I'm not a process expert; this
is purely my understanding.)
1) A document can be developed quite far before the group needs to
decide whether it should be Rec. track or not.
2) WG Notes are first published as First Public Working Drafts, at which
point they carry the same IP requirements and exclusion opportunities as
any document intended for Rec. track - in light of our need to
re-charter with a specific list of deliverables, then, we should include
in our decision any material we plan to work on, whether it ends up a
Note or a Rec. track document.
3) I don't think we should view Notes as things to be churned out
quickly and without much consideration or review. Rather, I think Notes
are best used for material that may not be core to the language or
protocol, that may represent a best or common practice (as in the JSON
results format), or that is a common but difficult-to-implement
extension such that the group feels that a Note would document
interoperable semantics without requiring multiple implementations to
move through the Rec. track.
Please note that I wrote some of this proposal before some of the more
recent survey responses. I've taken a look at those responses, which
don't change the meat of my proposal (I made one small change to a
prioritization), but note that the exact numbers for some of my
reasoning (when I refer to the survey results) may no longer be fully
accurate.
Lee
Lee's feature proposal
The following is a proposal for the features that the SPARQL WG should
adopt. It is an attempt to reach consensus by balancing previously
stated goals including
* group preference
* group energy
* implementation experience
* utility to developers
* utility to end users
* extensibility
* conservatism
Constraints
On April 28 the group resolved to accept aggregates, subqueries, and
update as deliverables.
Proposal
Required Features
* Aggregate functions
* Subqueries
* Update
* Project expressions
* Service description
Time-permitting Features
(Roughly in this order.)
* SPARQL/OWL
* Property paths
* Function library
* Basic federated query
* Surface syntax
Commentary
This proposal has 5 mandatory features and 5 time/energy-permitting
features. This is more than I think is desirable, but I have a hard
time making the proposal narrower.
The required features include the three features (aggregates,
subqueries, and update) identified early on as having the highest level
of consensus.
I've also included project expressions, the ability to include
arbitrary expressions in a SELECT clause, as a required feature. The
aggregate feature already requires the group to find a way to include
values not explicitly mentioned in the RDF dataset in a query's results
(i.e. the computed value of aggregate functions), and it seems confusing
and unnecessarily limiting not to allow the same or a similar
(syntactic) mechanism to introduce new scalar values into query result
sets. In addition, project expressions in conjunction with the other
required features enables the same capabilities as various other
proposed features, including assignment and scalar expressions in
construct. Project expressions received significant but not overwhelming
WG support in our survey, with five organizations ranking it amongst
their top four features, and no organizations explicitly objecting to
it. Project expressions is widely implemented in existing SPARQL engines.
Finally, I suggest that service description be a required deliverable of
the Working Group. While there are various design pieces to draw on,
service description carries the challenge of the Working Group doing a
fair bit of design work. However, I believe that this sort of
leading-edge-of-the-curve design work is appropriate for the SPARQL WG
in the case of a feature such as service description that is an
extensibility point and an enabler for future standardization efforts.
Service description provides a standard way for extended SPARQL
implementations to advertise their capabilities, and in doing so
encourages similar implementations to coalesce around common syntax and
semantics of extensions. It can be used to advertise entailment regimes,
extended surface syntax, data set information (including optimization
hints for federation), supported functions, and much more. Service
description received moderate WG support in the survey (5 organizations
including it in their top 10), and no organizations explicitly objected
to it. With Condorcet, service description is preferred to everything
except the top 3 features and negation. (See below for more on negation.)
I've included five time-permitting features in this proposal, ranked
roughly in the order in which I believe the group should pursue them. I
acknowledge at the same time that some of these efforts can reasonably
go on in parallel, either with other time-permitting features or with
development of required features.
I believe that SPARQL/OWL is an important deliverable for this WG. The
SPARQL community sees somewhat of a divide between those using SPARQL
purely to query RDF graphs, and those using SPARQL in conjunction with
richer semantics. The original SPARQL effort acknowledged this by
providing a mechanism to define extensions that would define basic graph
pattern matching for entailment regimes other than simple entailment.
This extension mechanism is key to enabling groups other than the SPARQL
working group (whether formal or informal groups) to define how SPARQL
queries behave in the presence of other semantic regimes. But the
extension mechanism has never been formally tested, and it seems to be
prudent to test it (a) under the auspices of the SPARQL WG, so that the
results may feed back into the SPARQL BGP extension specification itself
and (b) in the context of OWL semantics, probably the most popular
richer entailment regime that currently exists. There are numerous
implementations that implement SPARQL/OWL already, though likely not in
an interoperable fashion. And in the person of Bijan Parsia, the
SPARQL WG has the expertise and energy necessary to properly specify the
SPARQL/OWL basic graph pattern matching extension. SPARQL/OWL received
minimal support in the survey, but seemed to have a somewhat warmer
reception in the discussion on the April 28 teleconference.
I believe that property paths is an important deliverable for the WG as
it enables variable-length path queries for SPARQL developers. It has
significant support within the WG, and it also covers most cases of the
proposed "accessing RDF lists" feature.
I believe that Surface syntax and Function library represent reasonable
maintenance tasks for the WG to examine, time-permitting. Accepting
surface syntax as a time-permitting feature gives the WG an opportunity
to examine capabilities of the SPARQL language that are particularly
onerous to use and to consider specialized syntax for these features.
Accepting function library allows the WG to consider extending the core
set of functions available when moving between SPARQL implementations to
include things like basic string or mathematical operations.
Finally, I believe the WG should deliver a specification for basic
federated query, time-permitting. Federated query is implemented in a
variety of forms in several implementations, and the feature received
significant support in the survey (6 organizations including it amongst
their top six choices). I believe that looking at a design for basic
federated query is important for the growing Linked Data community, and
the time is ripe to standardize on basic federated query as a way to
encourage implementations to explore more and more sophisticated
approaches to federated query.
This proposal leaves out many good features, and I'd be remiss not to
address several specific ones.
* Negation. The survey indicated strong support for providing a
simpler form of asking negative queries than the current OPTIONAL/!bound
construct. I've excluded this from my proposal in the hope that the
design for subqueries may obviate the need for this feature.
* Full text. The survey indicated strong support for standardizing
the syntax and semantics for full text search in SPARQL. While I believe
that this is one of the top interoperability stumbling blocks for
SPARQL, the wide-open design space (both for syntax and semantics) of
the problem worries me.
* Parameterized inference. The survey indicated support from a
small number of organizations for parameterized inference. The
discussion during the April 28 teleconference made clear to me that some
members of the WG see a need both to define what it means to query other
entailment regimes (a la SPARQL/OWL) and also how to go about doing that
on a query-by-query basis. The latter is what parameterized inference is
about. I have omitted parameterized inference from my proposal because
of the lack of existing implementations/designs to draw on, coupled with
the fact that service descriptions provide an out-of-band way for
endpoints to indicate the entailment regime or rulesets that they
service. I recognize that this does not fully address the use case of
on-demand rulesets, but I believe that this would be better served via a
SPARQL protocol feature, and I do not see any mature designs yet in this
space to draw upon. I believe that (1) standardizing on the semantics of
SPARQL/OWL and (2) the increasing maturity and deployment of RIF will
encourage SPARQL implementations to begin to explore this space more and
make this an appropriate feature for a future round of standardization.
Received on Friday, 1 May 2009 07:55:08 UTC