Blank nodes scope in CONSTRUCT queries from Miguel on 2011-10-11 (public-sparql-dev@w3.org from October to December 2011)

From: Miguel <miguel.ceriani@gmail.com>
Date: Tue, 11 Oct 2011 13:03:23 +0200
To: public-sparql-dev@w3.org
Message-ID: <CALWU=RtmKXP0GD4O6f6Dk0bX75bXNm61h6NoMtOgjsxyr6e3Fw@mail.gmail.com>

Hello,
I have been thinking about a possible extension to SPARQL CONSTRUCT
queries. In both the SPARQL Recommendation and the SPARQL 1.1 Working
Draft, the scope of blank nodes in CONSTRUCT queries is defined as
follows:

"A template can create an RDF graph containing blank nodes. The blank
node labels are scoped to the template for each solution. If the same
label occurs twice in a template, then there will be one blank node
created for each query solution, but there will be different blank
nodes for triples generated by different query solutions."

I have modified the example in the specification to show why I propose
to increase the flexibility of the CONSTRUCT form:

CONSTRUCT { ?x vcard:N _:v .
            _:v vcard:givenName ?gname .
            _:v vcard:familyName ?fname .
            _:v vcard:email ?email }
WHERE
 {
    { ?x foaf:firstname ?gname } UNION  { ?x foaf:givenname   ?gname } .
    { ?x foaf:surname   ?fname } UNION  { ?x foaf:family_name ?fname } .
    ?x foaf:mbox ?email .
 }

However, if a person has more than one email address, the query
produces one solution per address, and therefore a separate
(duplicated) vCard blank node for each of them.
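
To make the duplication concrete, here is a minimal Python sketch of
how the template above is instantiated under the current rule. It is
only an illustration, not how any SPARQL engine is implemented: the
function name `instantiate`, the label scheme `_:b0`, `_:b1`, ... and
the data for `ex:alice` are all assumptions made for the example.

```python
from itertools import count

def instantiate(solutions):
    """Instantiate the vCard template, minting one blank node per solution."""
    fresh = count()
    triples = set()
    for s in solutions:
        v = f"_:b{next(fresh)}"  # a new blank node for every query solution
        triples |= {
            (s["x"], "vcard:N", v),
            (v, "vcard:givenName", s["gname"]),
            (v, "vcard:familyName", s["fname"]),
            (v, "vcard:email", s["email"]),
        }
    return triples

# Hypothetical data: one person with two mailboxes gives two solutions.
solutions = [
    {"x": "ex:alice", "gname": "Alice", "fname": "Hacker",
     "email": "mailto:alice@example.org"},
    {"x": "ex:alice", "gname": "Alice", "fname": "Hacker",
     "email": "mailto:alice@work.example.org"},
]

graph = instantiate(solutions)
vcard_nodes = {o for (s, p, o) in graph if p == "vcard:N"}
print(len(vcard_nodes))  # number of vCard nodes created for ex:alice
```

With two solutions for the same person, the sketch mints two vCard
nodes, each carrying the same names and a single email address.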

This is a general problem: results often need to be grouped in similar
ways. In SQL or SPARQL SELECT, this can be done by using ORDER BY to
bring related tuples together and then having the host system start a
new group whenever the key value changes.
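
The host-side grouping described here could look roughly like the
following Python sketch. The rows stand in for the result of a SELECT
query ordered by ?x; the data and the helper name `group_by_key` are
assumptions made for the example. Note that `itertools.groupby` only
merges adjacent rows, which is exactly why the ORDER BY is needed.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows, in the order a
# SELECT ?x ?gname ?fname ?email ... ORDER BY ?x query would return them.
rows = [
    ("ex:alice", "Alice", "Hacker", "mailto:alice@example.org"),
    ("ex:alice", "Alice", "Hacker", "mailto:alice@work.example.org"),
    ("ex:bob", "Bob", "Builder", "mailto:bob@example.org"),
]

def group_by_key(sorted_rows):
    """Start a new group whenever the key (the ?x column) changes."""
    cards = []
    for x, group in groupby(sorted_rows, key=itemgetter(0)):
        group = list(group)
        _, gname, fname, _ = group[0]  # names are taken from the first row
        cards.append({
            "x": x,
            "gname": gname,
            "fname": fname,
            "emails": [row[3] for row in group],
        })
    return cards

cards = group_by_key(rows)
for card in cards:
    print(card["x"], card["emails"])
```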

A similar and more general result could be achieved in CONSTRUCT
queries if we allowed the scope of each blank node in the template to
be declared explicitly.

As an example of the syntax, I add an optional SCOPE clause to
CONSTRUCT queries:

CONSTRUCT
  {
    ?x vcard:N _:v .
    _:v vcard:givenName ?gname .
    _:v vcard:familyName ?fname .
    _:v vcard:email ?email
  }
SCOPE
  {
    _:v (?x)
  }
WHERE
 {
    { ?x foaf:firstname ?gname } UNION  { ?x foaf:givenname   ?gname } .
    { ?x foaf:surname   ?fname } UNION  { ?x foaf:family_name ?fname } .
    ?x foaf:mbox ?email .
 }

Here, the SCOPE clause means that for _:v a different blank node is
created for each distinct ?x binding (and not for each
?x-?gname-?fname-?email binding).
This is a simple example in which a single blank node is linked to a
single variable, but we could have several blank nodes, each one
linked to a different subset of variables.
For a blank node not listed in SCOPE, the default scoping is as in the
existing definition (it depends on all the variables).
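
The intended semantics can be sketched in Python as follows. This is
only a model of the proposal under stated assumptions, not an
implementation of any existing SPARQL feature: the function
`instantiate`, the `scope` mapping (blank node label to the variables
it depends on), the string encoding of terms and the sample data are
all hypothetical.

```python
def instantiate(template, solutions, scope=None):
    """Instantiate a CONSTRUCT template with optional per-label scoping.

    template:  list of (s, p, o) strings; "?name" is a variable and
               "_:name" is a blank node label.
    solutions: list of dicts mapping variable names to values.
    scope:     maps a blank node label to the variable names it depends
               on (a hypothetical model of the proposed SCOPE clause).
    """
    scope = scope or {}
    minted = {}  # key -> generated blank node
    triples = set()
    for index, sol in enumerate(solutions):
        for triple in template:
            out = []
            for t in triple:
                if t.startswith("?"):
                    out.append(sol[t[1:]])
                elif t.startswith("_:"):
                    if t in scope:
                        # Scoped: same node for equal bindings of the listed variables.
                        key = (t, "scoped", tuple(sol[v] for v in scope[t]))
                    else:
                        # Default: a fresh node for every solution.
                        key = (t, "solution", index)
                    out.append(minted.setdefault(key, f"_:b{len(minted)}"))
                else:
                    out.append(t)
            triples.add(tuple(out))
    return triples

template = [
    ("?x", "vcard:N", "_:v"),
    ("_:v", "vcard:givenName", "?gname"),
    ("_:v", "vcard:familyName", "?fname"),
    ("_:v", "vcard:email", "?email"),
]

# Hypothetical data: one person with two mailboxes.
solutions = [
    {"x": "ex:alice", "gname": "Alice", "fname": "Hacker",
     "email": "mailto:alice@example.org"},
    {"x": "ex:alice", "gname": "Alice", "fname": "Hacker",
     "email": "mailto:alice@work.example.org"},
]

default_graph = instantiate(template, solutions)
scoped_graph = instantiate(template, solutions, scope={"_:v": ["x"]})

default_nodes = {o for (s, p, o) in default_graph if p == "vcard:N"}
scoped_nodes = {o for (s, p, o) in scoped_graph if p == "vcard:N"}
print(len(default_nodes), len(scoped_nodes))
```

In this model, the scoped run yields a single vCard node for ex:alice
that carries both email addresses, while the default run keeps the
existing behaviour of one node per solution.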

Please let me know what you think of this proposal, and please excuse
my poor English.

Thanks,
Miguel

Received on Tuesday, 11 October 2011 18:35:56 UTC