Re: How to write a UNION in SPARQL.

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 16 Feb 2011 10:16:07 +0000
Cc: Semantic Web <semantic-web@w3.org>
Message-Id: <CF5783CA-9A4F-46B1-BBF3-7BA792290E93@garlik.com>
To: Olivier Rossel <datao@datao.net>
Which one is optimal will depend on which RDF store you're using. There is a wide variety of different optimisation algorithms.

A third way to write the query is:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX onto: <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>
SELECT DISTINCT ?uri ?string
WHERE
{
    ?uri rdf:type ?type .
    FILTER(?type = onto:President || ?type = yago:President)
    ?uri onto:birthDate ?date .
    FILTER regex(str(?date), "^1945") .
    ?uri rdfs:label ?string .
    FILTER (lang(?string) = "en")
}

This third form is likely to be optimal in 4store and Jena/Joseki, for example.
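
If your store implements the SPARQL 1.1 drafts, the type test can probably also be written with the IN operator. This is only a sketch of an equivalent filter; whether a given optimiser handles it as well as the || form is an assumption and has not been measured here.

```sparql
# Assumes SPARQL 1.1 support for IN; equivalent to the || test on ?type above
FILTER(?type IN (onto:President, yago:President))
```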

Also, note that the OPTIONAL doesn't actually do anything in your examples below: the FILTER on ?string sits outside the OPTIONAL block, and lang() of an unbound variable is an error, so any solution where the OPTIONAL doesn't bind is rejected by the FILTER.

You can write:

OPTIONAL { ?uri rdfs:label ?string . FILTER (lang(?string) = "en") }

This binds only @en strings to ?string but does not reject solutions that have no rdfs:label, if that's what you want.
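
Combined with the single-pattern form of the type test, the whole query might look like the sketch below. The str() around ?date is an added assumption, not part of the original query: regex() is defined over strings, and if your store types onto:birthDate values as xsd:date, a strict implementation may need the cast.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX onto: <http://dbpedia.org/ontology/>
PREFIX yago: <http://dbpedia.org/class/yago/>
SELECT DISTINCT ?uri ?string
WHERE
{
    ?uri rdf:type ?type .
    FILTER(?type = onto:President || ?type = yago:President)
    ?uri onto:birthDate ?date .
    # str() is an assumption: it turns a typed date into a plain string for regex()
    FILTER regex(str(?date), "^1945") .
    # The language test lives inside the OPTIONAL, so a missing label
    # leaves ?string unbound instead of rejecting the solution
    OPTIONAL { ?uri rdfs:label ?string . FILTER (lang(?string) = "en") }
}
```

With this form, presidents without an English label still appear in the results, with ?string left unbound.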

- Steve

On 2011-02-16, at 09:52, Olivier Rossel wrote:

> On Wed, Feb 16, 2011 at 10:49 AM, Olivier Rossel <datao@datao.net> wrote:
>> In DBPedia, I want to get a list of presidents born in 1945.
>> A president is either a "onto:President" or a "yago:President".
>> So a UNION is needed to manage both.
>> 
>> One way to write the corresponding SPARQL query is :
>> 
>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>> PREFIX onto: <http://dbpedia.org/ontology/>
>> PREFIX yago: <http://dbpedia.org/class/yago/>
>> SELECT DISTINCT ?uri ?string
>> WHERE
>> {
>>        {
>>                ?uri rdf:type onto:President .
>>                ?uri onto:birthDate ?date .
>>                FILTER regex(?date, "^1945") .
>>                OPTIONAL {?uri rdfs:label ?string .}
>>                FILTER (lang(?string) = "en")
>>        }
>>        UNION
>>        {
>>                ?uri rdf:type yago:President.
>>                ?uri onto:birthDate ?date .
>>                FILTER regex(?date, "^1945") .
>>                OPTIONAL {?uri rdfs:label ?string .}
>>                FILTER (lang(?string) = "en")
>>        }
>> }
>> 
>> Another way is :
>> 
>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>> PREFIX onto: <http://dbpedia.org/ontology/>
>> PREFIX yago: <http://dbpedia.org/class/yago/>
>> SELECT DISTINCT ?uri ?string
>> WHERE
>> {
>>        {{
>>                ?uri rdf:type onto:President .
>> 
>>        }
>>        UNION
>>        {
>>                ?uri rdf:type yago:President.
>> 
>>        }}.
>>      ?uri onto:birthDate ?date .
>>                FILTER regex(?date, "^1945") .
>>                OPTIONAL {?uri rdfs:label ?string .}
>>                FILTER (lang(?string) = "en")
>> }
>> 
> 
> (Oops, I sent the mail a bit too fast :)
> So:
> Which syntax is considered to be optimal?
> Are there some best practices when using UNIONs?
> 
> Any help is welcome.
> 
> Olivier Rossel
> --
> Datao.net
> 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Wednesday, 16 February 2011 10:16:45 GMT