- From: Steve Harris <steve.harris@garlik.com>
- Date: Sun, 7 Mar 2010 14:46:26 +0000
- To: Lee Feigenbaum <lee@thefigtrees.net>
- Cc: "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
On 7 Mar 2010, at 13:22, Lee Feigenbaum <lee@thefigtrees.net> wrote:

> On 3/5/2010 6:40 AM, Steve Harris wrote:
>> Hi all,
>>
>> Problem:
>>
>> There's no way to specify a separator string in the draft GROUP_CONCAT
>> aggregate. I have a vague memory that we'd discussed this briefly
>> somewhere, F2F2, or on a call maybe, but it's pretty hazy. This was
>> brought up in Rob Vesse's recent comment.
>>
>> Proposal 1:
>>
>> Leave it as it is. Users cannot specify the separator character; it's
>> fixed in the spec.
>>
>> Upside: very simple. Downside: might limit usefulness.
>>
>> Probably should make sure there's an escaping function in SPARQL 1.1
>> that's compatible with the character.
>>
>> Proposal 2:
>>
>> If the GROUP_CONCAT expression list has more than one element, then the
>> lexically last one is removed and used as the separator before being
>> passed to the Aggregation() algebra function, e.g.
>> GROUP_CONCAT(?x, ?y, "|")
>>
>> Upside: keeps the grammar simple. Downside: makes the algebra around
>> GROUP_CONCAT weird, and might be surprising as the multi-expression
>> behaviour will be different to other aggregates.
>>
>> e.g. in GROUP_CONCAT(?x, ?y), ?y will be an argument to the underlying
>> function, not an expression. Would probably have to pick a value of ?y
>> at random, a la SAMPLE(), as we don't require that "arguments" to
>> aggregates are scalar.
>>
>> Proposal 3:
>>
>> Use MySQL syntax to specify it, i.e. GROUP_CONCAT(?x, ?y SEPARATOR "|").
>>
>> Upside: the same as MySQL (where GROUP_CONCAT comes from), avoids
>> weirding the algebra. Downside: makes the grammar more complex.
>>
>> Proposal 4:
>>
>> Like 3, but with some other explicit syntax, e.g.
>> GROUP_CONCAT(?x, ?y)[SEPARATOR "|"]
>>
>> Upside: avoids weirding the algebra. Downside: we have to think of our
>> own syntax, no familiarity for MySQL users, and it probably makes the
>> grammar more complex.
>>
>> ---
>>
>> My opinion:
>>
>> I'd take 3, or 1 happily, but I think 4 is a bit arbitrary, and 2 is
>> really nasty.
>
> As Andy said, thanks for this, Steve.
>
> I agree with your preferences. In Glitter, I implement SEPARATOR as
> in MySQL (option 3). That said...
>
>> There's also other useful syntax around GROUP_CONCAT, e.g. ORDER BY, so
>> I expect a future SPARQL will end up with something like 3 or 4
>> anyway.
>
> ...if we go with Option 1 now, we'll likely get some complaints from
> the community, but we'll also give implementers a chance to play
> with the best approach to this?

That can be a bit dangerous, as you can end up with syntactically legal
but semantically different approaches, e.g. Option 2 looks like an
expression list. I'd hope no one would go with that, but at least Rob was
tempted.

> I think we should go with Option 3 if we feel that consistency with
> MySQL is valuable. If we don't feel that way, I think we should go
> with Option 1 AND, in that case, we should consider whether we want
> to use a name other than GROUP_CONCAT.

Agreed. I think I'd end up implementing MySQL-style G_C regardless, as
it's so familiar to users.

- Steve
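
For illustration only, here is a rough Python sketch of how Options 2 and 3 differ in where the separator comes from. It is not taken from any SPARQL implementation or from the draft: the function names, the row-as-dict representation, and the default separator of a single space are all assumptions, and the per-row concatenation of multiple expressions follows MySQL's behaviour rather than anything the draft specifies. The Option 2 function also treats the last argument as a literal string, which sidesteps the open question of how to pick a value when it is a variable.

```python
from collections import defaultdict


def group_concat(rows, key, exprs, separator=" "):
    """Option 3 style (hypothetical): the separator is an explicit parameter.

    rows      -- list of dicts, one per solution (an assumed representation)
    key       -- name of the grouping column
    exprs     -- names of the columns to concatenate within each row
    separator -- string placed between rows of the same group (assumed default)
    """
    groups = defaultdict(list)
    for row in rows:
        # Concatenate the expression values of one row, as MySQL does,
        # then collect the result under the row's group key.
        groups[row[key]].append("".join(str(row[e]) for e in exprs))
    return {k: separator.join(v) for k, v in groups.items()}


def group_concat_option2(rows, key, *args):
    """Option 2 style (hypothetical): with more than one argument, the
    lexically last one is removed and used as the separator."""
    if len(args) > 1:
        *exprs, separator = args
        return group_concat(rows, key, exprs, separator)
    return group_concat(rows, key, list(args))


rows = [
    {"g": "a", "x": "1", "y": "2"},
    {"g": "a", "x": "3", "y": "4"},
    {"g": "b", "x": "5", "y": "6"},
]

option3 = group_concat(rows, "g", ["x", "y"], separator="|")
option2 = group_concat_option2(rows, "g", "x", "y")
```

In this sketch the Option 2 call with arguments "x" and "y" silently uses "y" as the separator instead of concatenating it, which is the kind of surprise the thread is worried about; the Option 3 call keeps the expression list and the separator visibly apart.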
Received on Sunday, 7 March 2010 14:47:45 UTC