- From: Toby Inkster <tai@g5n.co.uk>
- Date: Mon, 30 Nov 2009 09:46:24 +0000
- To: public-rdf-dawg-comments@w3.org
I see that the built-in set of aggregates for SPARQL 1.1 has not yet been decided. The current list is quite numerically oriented. Here are some I'd like to see:

CONCAT - concatenates values, with an optional second parameter to provide a joiner character. Result is a plain literal with no language.

XML_CONCAT - concatenates values into an XMLLiteral using a SPARQL-Results-like structure.

LONGEST/SHORTEST - returns the longest or shortest result (in terms of character count). Optional second parameter specifies a language.

MODE/MEDIAN - while AVG returns the mean result, these two would return other kinds of average. With named graphs, the same triple can occur multiple times, so MODE makes sense. Optional second parameter specifies a language.

In the case where I've indicated that the second parameter specifies a language, the aggregate function would work like this:

1. Do any values in the list match the specified language? (Using the same definition of "match" as langMatches.) If so, then discard any results which don't match.

2. Run the aggregate as normal.

So for example, on the following graph:

    <http://example.com/cat> rdfs:label "cat"@en, "chat"@fr, "feline"@en, "felis"@la.

This SPARQL query:

    SELECT ?resource (SHORTEST(?label,"fr") AS ?mylabel)
    WHERE {
      ?resource rdfs:label ?label .
    }

Would return:

    resource                 | mylabel
    -------------------------+-----------
    <http://example.com/cat> | "chat"@fr

Because the non-French values would be discarded, with the shortest remaining label being selected. However, this:

    SELECT ?resource (SHORTEST(?label,"de") AS ?mylabel)
    WHERE {
      ?resource rdfs:label ?label .
    }

Would return:

    resource                 | mylabel
    -------------------------+-----------
    <http://example.com/cat> | "cat"@en

There was no German label in the data, so the discarding step never happens - thus the shortest of any language is selected.
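The two-step language preference described above can be sketched outside SPARQL, here in Python. This is only an illustration under stated assumptions, not a definition of the proposed aggregates: the names Literal, lang_matches, prefer_language and shortest are hypothetical, the matching follows the basic-filtering rule that langMatches is built on, character count is taken as Python's len(), and ties go to the first value seen (the proposal does not say how ties are broken).

```python
from typing import List, NamedTuple, Optional, Sequence


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal: lexical form plus language tag."""
    value: str
    lang: Optional[str] = None


def lang_matches(tag: Optional[str], lang_range: str) -> bool:
    """Basic filtering in the style of langMatches (RFC 4647, section 3.3.1)."""
    if not tag:
        return False          # untagged literals never match a range
    if lang_range == "*":
        return True           # wildcard matches any non-empty tag
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def prefer_language(values: Sequence[Literal],
                    lang_range: Optional[str]) -> List[Literal]:
    """Step 1: keep only matching values, but only if at least one matches."""
    if lang_range is None:
        return list(values)
    matching = [v for v in values if lang_matches(v.lang, lang_range)]
    # A preference, not a filter: fall back to everything when nothing matches.
    return matching if matching else list(values)


def shortest(values: Sequence[Literal],
             lang_range: Optional[str] = None) -> Optional[Literal]:
    """Step 2: run the aggregate as normal over the surviving values."""
    candidates = prefer_language(values, lang_range)
    # Assumption: on a tie, the first value encountered wins.
    return min(candidates, key=lambda v: len(v.value)) if candidates else None


labels = [
    Literal("cat", "en"),
    Literal("chat", "fr"),
    Literal("feline", "en"),
    Literal("felis", "la"),
]

print(shortest(labels, "fr"))  # Literal(value='chat', lang='fr')
print(shortest(labels, "de"))  # Literal(value='cat', lang='en')
```

Keeping the preference step separate from the aggregate itself means the same helper could sit in front of LONGEST, MODE or MEDIAN without change.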
I think in terms of presenting views of graph data, having these aggregate language preferences (and they're preferences, not filters, as the second example illustrates) would be very useful - especially for "label" and "description" kinds of fields.

While I'm giving examples, I'll provide some for CONCAT and XML_CONCAT:

    SELECT ?resource
           (CONCAT(?label, ";") AS ?concat)
           (XML_CONCAT(?label) AS ?xmlconcat)
    WHERE {
      ?resource rdfs:label ?label .
    }
    ORDER BY ?label

?concat would be "cat;chat;feline;felis" (the ORDER BY clause having been used by the aggregate function).

?xmlconcat would be:

    """<literal xml:lang="en">cat</literal>
    <literal xml:lang="fr">chat</literal>
    <literal xml:lang="en">feline</literal>
    <literal xml:lang="la">felis</literal>"""^^rdf:XMLLiteral

Perhaps the data type could be more specialised - instead of rdf:XMLLiteral, it could be, say, sparql:XMLResultsLiteral, which SPARQL libraries could recognise and automagically parse for you.

--
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>
Received on Monday, 30 November 2009 09:47:14 UTC