- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Sun, 07 Mar 2010 08:22:55 -0500
- To: Steve Harris <steve.harris@garlik.com>
- CC: "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
On 3/5/2010 6:40 AM, Steve Harris wrote:
> Hi all,
>
> Problem:
>
> There's no way to specify a separator string in the draft GROUP_CONCAT
> aggregate. I have a vague memory that we'd discussed this briefly
> somewhere, F2F2, or on a call maybe, but it's pretty hazy. This was
> brought up in Rob Vesse's recent comment.
>
> Proposal 1:
>
> Leave it as it is. Users cannot specify the separator character, it's
> fixed in the spec.
>
> Upside, very simple. Downside, might limit usefulness.
>
> Probably should make sure there's an escaping function in SPARQL 1.1
> that's compatible with the character.
>
> Proposal 2:
>
> If the GROUP_CONCAT expression list has more than one element, then the
> lexically last one is removed and used as the separator before being
> passed to the Aggregation() algebra function. e.g. GROUP_CONCAT(?x, ?y,
> "|")
>
> Upside, keeps the grammar simple. Downside makes the algebra around
> GROUP_CONCAT weird, might be surprising as the multi-expression
> behaviour will be different to other aggregates.
>
> e.g. in GROUP_CONCAT(?x, ?y) ?y will be an argument to the underlying
> function, not an expression. Would probably have to pick a value of ?y
> to random, a la SAMPLE(), as we don't require that "arguments" to
> aggregates are scalar.
>
> Proposal 3:
>
> Use MySQL syntax to specify it, i.e. GROUP_CONCAT(?x, ?y SEPARATOR "|").
>
> Upside, the same as MySQL (where GROUP_CONCAT comes from), avoids
> weirding algebra. Downside, makes the grammar more complex.
>
> Proposal 4:
>
> Like 3, but with some other explicit syntax. e.g. GROUP_CONCAT(?x,
> ?y)[SEPARATOR "|"]
>
> Upside, avoids weirding algebra. Downside, we have to think of our own
> syntax, no familiarity for MySQL users and probably makes the grammar
> more complex.
>
> ---
>
> My opinion:
>
> I'd take 3, or 1 happily, but I think 4 is a bit arbitrary, and 2 is
> really nasty.

As Andy said, thanks for this, Steve. I agree with your preferences.
In Glitter, I implement SEPARATOR as in MySQL (option 3). That said...

> There's also other useful syntax around GROUP_CONCAT, e.g. ORDER BY, so
> I expect a future SPARQL will end up with something like 3 or 4 anyway.

...if we go with Option 1 now, we'll likely get some complaints from the
community, but we'll also give implementers a chance to play with the
best approach to this?

I think we should go with Option 3 if we feel that consistency with
MySQL is valuable. If we don't feel that way, I think we should go with
Option 1 AND, in that case, we should consider whether we want to use a
name other than GROUP_CONCAT.

Lee
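
To make the difference between Proposal 2 and Proposal 3 concrete, here is a
rough Python sketch of the two calling conventions. It is only an
illustration, not how Glitter or any other engine implements the aggregate.
The function names, the dict-per-solution row model, the single-space default
separator, and the rule that every value in the group is joined with the same
separator are all assumptions made for this example; the draft's actual
algebra may differ.

```python
from typing import Dict, Sequence

Row = Dict[str, str]  # hypothetical model: one solution = variable name -> value

# Assumption: the thread does not say which character the draft fixes.
DEFAULT_SEPARATOR = " "


def concat_explicit(rows: Sequence[Row], variables: Sequence[str],
                    separator: str = DEFAULT_SEPARATOR) -> str:
    """Proposal 3 style: the separator is a distinct, named parameter."""
    values = [row[var] for row in rows for var in variables if var in row]
    return separator.join(values)


def concat_last_arg(rows: Sequence[Row], args: Sequence[str]) -> str:
    """Proposal 2 style: with more than one argument, the last one is
    removed from the expression list and used as the separator."""
    if len(args) < 2:
        return concat_explicit(rows, args)
    *variables, last = args
    if last.startswith("?"):
        # The separator is itself a variable, so one value has to be picked
        # from the group, much like SAMPLE(). This sketch takes the first
        # bound value; a real engine could pick any.
        separator = next((row[last] for row in rows if last in row),
                         DEFAULT_SEPARATOR)
    else:
        separator = last
    return concat_explicit(rows, variables, separator)


rows = [{"?x": "a", "?y": "1"}, {"?x": "b", "?y": "2"}]
explicit = concat_explicit(rows, ["?x", "?y"], separator="|")
positional = concat_last_arg(rows, ["?x", "?y", "|"])
surprising = concat_last_arg(rows, ["?x", "?y"])
```

With a literal separator the two conventions give the same string. The last
call shows the surprise described under Proposal 2: in the two-argument form
?y stops being a value to concatenate and becomes the separator.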
Received on Sunday, 7 March 2010 13:23:37 UTC