
Re: SPARQL 1.1 Aggregates

From: Axel Polleres <axel.polleres@deri.org>
Date: Mon, 30 Nov 2009 10:54:15 +0000
Cc: <public-rdf-dawg-comments@w3.org>
Message-Id: <66654619-84C0-468E-9460-CB462FC4A2FD@deri.org>
To: "Toby Inkster" <tai@g5n.co.uk>

Hi Toby,

As for MODE/MEDIAN, this issue is being discussed. As for the others,
can you give us an indication of which current systems implement them?
The main rationale for the new features, which we try to depart from only
in exceptional cases, is that we aim to standardise what is already
implemented by systems "out there". Others can be added by means of the
standard extensibility mechanisms.

As for LONGEST/SHORTEST, that seems to be doable with MIN/MAX in combination with 
fn:string-length(), right?
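
To make that concrete, here is a minimal Python sketch (not SPARQL, and not
a syntax proposal) of the MIN/MAX-over-string-length idea. The helper names
shortest and longest are made up for illustration. Note that the aggregate
over lengths only yields a number, so a second step is needed to pick out
the value(s) of that length, and ties give more than one result:

```python
def shortest(labels):
    """Return every label whose length equals the minimum length."""
    labels = list(labels)
    if not labels:
        return []
    # Step 1: the plain aggregate, MIN over the string lengths.
    min_len = min(len(s) for s in labels)
    # Step 2: select the value(s) that have exactly that length.
    return [s for s in labels if len(s) == min_len]


def longest(labels):
    """Return every label whose length equals the maximum length."""
    labels = list(labels)
    if not labels:
        return []
    # Same two steps, with MAX instead of MIN.
    max_len = max(len(s) for s in labels)
    return [s for s in labels if len(s) == max_len]


labels = ["cat", "chat", "feline", "felis"]
print(shortest(labels))  # ['cat']
print(longest(labels))   # ['feline']
```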

best,
Axel

On 30 Nov 2009, at 09:46, Toby Inkster wrote:

> I see that the built-in set of aggregates for SPARQL 1.1 has not yet
> been decided.
> 
> The current list is quite numerically oriented. Here are some I'd like
> to see:
> 
>         CONCAT - concatenates values, with an optional second
>                 parameter to provide a joiner character. Result
>                 is a plain literal with no language.
> 
>         XML_CONCAT - Concatenates values into an XMLLiteral
>                 using a SPARQL-Results-like structure.
> 
>         LONGEST/SHORTEST - returns the longest or shortest
>                 result (in terms of character count). Optional
>                 second parameter specifies a language.
> 
>         MODE/MEDIAN - while AVG returns the mean result, these
>                 two would return other kinds of average. With
>                 named graphs, the same triple can occur
>                 multiple times, so MODE makes sense. Optional
>                 second parameter specifies a language.
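
For reference, the difference between the three kinds of average can be
shown with Python's standard statistics module. This is only an
illustration on made-up values, not a statement about how any SPARQL
engine computes them:

```python
from statistics import mean, median, mode

# Made-up numeric values; 12 is an outlier that pulls the mean upwards.
values = [1, 2, 2, 3, 12]

avg = mean(values)    # arithmetic mean, what AVG returns: 20 / 5
mid = median(values)  # middle value of the sorted list: 2
top = mode(values)    # most frequent value: 2

# MODE also makes sense for non-numeric values such as repeated labels.
common_label = mode(["cat", "chat", "cat"])  # "cat"

print(avg, mid, top, common_label)
```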
> 
> In the case where I've indicated that the second parameter specifies a
> language, the aggregate function would work like this:
> 
>         1. Do any values in the list match the specified language?
>                 (Using same definition of "match" as langMatches.)
>                 If so, then discard any results which don't match.
> 
>         2. Run the aggregate as normal.
> 
> So for example, on the following graph:
> 
>         <http://example.com/cat>
>                 rdfs:label "cat"@en, "chat"@fr, "feline"@en, "felis"@la.
> 
> This SPARQL query:
> 
>         SELECT ?resource (SHORTEST(?label,"fr") AS ?mylabel)
>         WHERE { ?resource rdfs:label ?label . }
> 
> Would return:
> 
>         resource                 | mylabel
>         -------------------------+-----------
>         <http://example.com/cat> | "chat"@fr
> 
> Because the non-French values would be discarded, with the shortest
> remaining label being selected. However, this:
> 
>         SELECT ?resource (SHORTEST(?label,"de") AS ?mylabel)
>         WHERE { ?resource rdfs:label ?label . }
> 
> Would return
> 
>         resource                 | mylabel
>         -------------------------+-----------
>         <http://example.com/cat> | "cat"@en
> 
> There was no German label in the data, so the discarding step never
> happens - thus the shortest of any language is selected.
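
The two steps and both example results above can be sketched in Python as
follows. This is a rough illustration under stated assumptions: literals
are modelled as (text, language) pairs, the names lang_matches,
prefer_language and shortest are hypothetical, and lang_matches is a
simplified stand-in for langMatches (case-insensitive prefix matching on
subtags, with "*" matching any non-empty tag):

```python
def lang_matches(tag, lang_range):
    """Simplified langMatches: case-insensitive basic filtering."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""
    # "fr" matches "fr" and "fr-CA", but not "fra".
    return tag == lang_range or tag.startswith(lang_range + "-")


def prefer_language(literals, lang_range):
    """Step 1: if anything matches the language, discard the rest."""
    matching = [lit for lit in literals if lang_matches(lit[1], lang_range)]
    # A preference, not a filter: with no match, keep everything.
    return matching if matching else list(literals)


def shortest(literals, lang_range=None):
    """Step 2: run the aggregate as normal on what survived step 1."""
    if lang_range is None:
        candidates = list(literals)
    else:
        candidates = prefer_language(literals, lang_range)
    # Raises ValueError on an empty list; ties go to the first literal.
    return min(candidates, key=lambda lit: len(lit[0]))


labels = [("cat", "en"), ("chat", "fr"), ("feline", "en"), ("felis", "la")]
print(shortest(labels, "fr"))  # ('chat', 'fr')
print(shortest(labels, "de"))  # ('cat', 'en')
```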
> 
> I think in terms of presenting views of graph data, having these
> aggregate language preferences (and they're preferences, not filters, as
> the second example illustrates) would be very useful - especially for
> "label" and "description" kinds of fields.
> 
> While I'm giving examples, I'll provide some for CONCAT and XML_CONCAT:
> 
>         SELECT
>                 ?resource
>                 (CONCAT(?label, ";") AS ?concat)
>                 (XML_CONCAT(?label) AS ?xmlconcat)
>         WHERE { ?resource rdfs:label ?label . }
>         ORDER BY ?label
> 
> ?concat would be "cat;chat;feline;felis" (the ORDER BY clause having
> been used by the aggregate function). ?xmlconcat would be:
> 
> """<literal xml:lang="en">cat</literal>
> <literal xml:lang="fr">chat</literal>
> <literal xml:lang="en">feline</literal>
> <literal xml:lang="la">felis</literal>"""^^rdf:XMLLiteral
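
As a rough sketch of how these two aggregates could behave, here is a
Python version that reproduces the two results above. The function names
concat and xml_concat, and the (text, language) pair representation, are
assumptions made for illustration; sorting the list stands in for the
ORDER BY clause:

```python
from xml.sax.saxutils import escape, quoteattr


def concat(literals, joiner=""):
    """Join the lexical forms; the result carries no language tag."""
    return joiner.join(text for text, _lang in literals)


def xml_concat(literals):
    """Wrap each literal in a <literal> element, one per line."""
    return "\n".join(
        # quoteattr adds the surrounding quotes; escape handles <, >, &.
        f"<literal xml:lang={quoteattr(lang)}>{escape(text)}</literal>"
        for text, lang in literals
    )


labels = [("felis", "la"), ("chat", "fr"), ("cat", "en"), ("feline", "en")]
# Stand-in for ORDER BY ?label: sort by lexical form before aggregating.
ordered = sorted(labels, key=lambda lit: lit[0])

print(concat(ordered, ";"))  # cat;chat;feline;felis
print(xml_concat(ordered))
```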
> 
> Perhaps the data type could be more specialised - instead of
> rdf:XMLLiteral, it could be, say, sparql:XMLResultsLiteral, which SPARQL
> libraries could recognise and automagically parse for you.
> 
> --
> Toby A Inkster
> <mailto:mail@tobyinkster.co.uk>
> <http://tobyinkster.co.uk>
> 
> 
> 
Received on Monday, 30 November 2009 10:54:51 GMT