Inline data from Andy Seaborne on 2012-04-28 (public-rdf-dawg@w3.org from April to June 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sat, 28 Apr 2012 11:46:29 +0100
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4F9BCA85.1040700@epimorphics.com>
This is the "BINDINGS anywhere in a graph pattern" feature.

I called it "inline data" to step back from the syntax details.  The 
word "BINDINGS" gets a bit mixed up with "BIND" when they can both be in 
group graph patterns.

 Andy

== Summary

SELECT *
{
     DATA ?x { :x1 :x2 }
     ?x rdfs:label ?label .
}


SELECT *
{
     DATA (?x ?mylabel) {
         (:x1 "X1")
         (:x2 "X2")
     }
     OPTIONAL { ?x rdfs:label ?label }
}

"DATA" being a word that isn't BINDINGS.

== Syntax

The comments and the modification of use cases prompted me to check the 
syntax. BINDINGS with one variable is a bit ugly, as each term needs a () 
round it.  This really isn't necessary.  I'm guessing that one-variable 
inline data will be a common use case if we think of it as inline data, 
and the comments suggest this.  (It comes up in my work on the linked data 
API, where one query gets some candidates and a second query gets more 
information on each candidate.)

A possibility is:

# Short form - one variable, no () at all.
DATA ?var { <iri1> <iri2> 3 4 }

# Full form.  Consistently (...) around a row header or row data.
DATA (?var1 ?var2) {
   (<iri1> "a")
   (<iri2> "b")
}

Insert your favorite keyword choice here:

TABLE
    it's a bit concrete and tables don't get a mention anywhere else

DATA
    my pref, even though it's used in INSERT DATA in SPARQL Update

BINDINGS
    OK but confusion with BIND? Unnecessarily long?

The choice of delimiters is fairly free - the only requirement is an 
explicit end for variables (the "{") and an explicit end of data rows 
(the "}"). Having row grouping is very useful for the multi-variable case.

c.f.1.

BINDINGS has an un-delimited list of variables always and the data rows 
must have (...)

BINDINGS ?var { (<iri1>) (<iri2>) (3) (4) }

c.f.2.

FILTER ( ?x IN (<iri1>, <iri2>, 3, 4) )


==== Spec changes

== 10.2 BINDINGS

Rework description and examples.

== Grammar

Grammar: add to list of units in a GroupGraphPattern

[]  GraphPatternNotTriples
    ::=  GroupOrUnionGraphPattern | OptionalGraphPattern |
         MinusGraphPattern | GraphGraphPattern | ServiceGraphPattern |
         Filter | Bind | InlineDataClause

== Algebra

I suggested earlier that it should float to the end of the group, just 
before the FILTERs but that does not work out.  It needs to be joined 
into the group in the location it occurs in (just like a subquery).  It 
is like BIND in that it ends the BGP.

Worked example below.

No new operators - everything is there already.

18.2.4.3 BINDINGS
Move the text from here, which turns the BINDINGS syntax into a table, to 
just before the pattern translation step (18.2.2.6).

18.2.2.6 Translate Graph Patterns

The algebra transformation step does not have to be changed at all 
because inline data falls under the catch-all:

    If E is any other form
         Let A := Translate(E)
         G := Join(G, A)
         End
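
As a rough illustration of why position in the group matters, here is a 
minimal Python sketch of the fold over group elements.  It is not the spec 
algorithm: the tuple-based algebra nodes, the element kinds and the name 
translate_group are all hypothetical, and only OPTIONAL and the catch-all 
are modelled (FILTER, MINUS and BIND are left out).

```python
# Hypothetical, simplified model of the group translation step.
# Algebra nodes are plain tuples; none of these names come from the spec.

def translate_group(elements):
    """Fold group elements into an algebra expression, left to right."""
    g = ("unit",)  # the empty pattern, the identity for join
    for kind, payload in elements:
        if kind == "optional":
            # OPTIONAL left-joins onto everything seen so far.
            g = ("leftjoin", g, payload)
        else:
            # Catch-all: the element (here, inline data) is joined in
            # at the position where it occurs in the group.
            g = ("join", g, (kind, payload))
    return g

label_bgp = ("bgp", "?x rdfs:label ?label")
inline = ("data", ("?x", [":x1", ":x2"]))

# DATA first, then OPTIONAL
algebra1 = translate_group([inline, ("optional", label_bgp)])
# OPTIONAL first, then DATA
algebra2 = translate_group([("optional", label_bgp), inline])

print(algebra1)
print(algebra2)
```

With DATA first the outermost operator is a leftjoin; with DATA last it 
is a join, so the two orderings are different expressions.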

== Evaluation

No change.  We were already turning BINDINGS into join(..., data table) 
and this is just the same.  BINDINGS didn't have anything special by the 
time evaluation is defined.

== What to do with BINDINGS?

We can leave BINDINGS as it is ("legacy"), rename it to be the same as 
the inline data (if the name changes), or remove it.

BINDINGS happens after Grouping/Aggregates and HAVING, and before select 
expressions.  It seems unlikely to me that it will be used with 
group/aggregate - if it is removed, you'd need a subquery for the group, 
then a join with inline data.  That is, this more specialized case needs 
more syntax.

Caveat: the syntax differs from DATA.

My preference is to bite the bullet now and remove BINDINGS.  There may 
be complaints, and people would be right to complain as we have done 2 
LCs, but if we are making changes, I think doing it properly for the long 
term is better.

I am also happy for it to be left as-is as legacy.

==== Worked example

---- Data
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://example/> .

:x1 rdfs:label "foo" .
## No :x2
:x3 rdfs:label "foo" .
---- Data

---- Query 1
# Intuitively, start with some possibilities,
# and add rdfs:labels if available.
PREFIX : <http://example/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
{
     DATA ?x { :x1 :x2 }
     OPTIONAL { ?x rdfs:label ?label }
}

---- Query 1
====>
---------------
| x   | label |
===============
| :x1 | "foo" |
| :x2 |       |
---------------

Join data table with the empty BGP (this is a no-op removed by 
"simplification").

Do an optional (leftjoin) on
    ((?x=:x1), (?x=:x2))
    leftjoin ((?x=:x1 ?label="foo"), (?x=:x3 ?label="foo"))

so :x2 gets no ?label but is in the answers
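
The leftjoin above can be checked with a small Python sketch.  Solution 
mappings are modelled as dicts and the result of the OPTIONAL's BGP is 
written out by hand from the data; the helper names (compatible, join, 
leftjoin) are illustrative assumptions, not any engine's API, and filters 
in the leftjoin are ignored.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[v] == t for v, t in m1.items() if v in m2)

def join(left, right):
    """Merge every compatible pair of mappings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def leftjoin(left, right):
    """Like join, but keep left mappings that have no compatible partner."""
    out = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(matches if matches else [a])
    return out

# DATA ?x { :x1 :x2 }
data = [{"x": ":x1"}, {"x": ":x2"}]
# Matches of { ?x rdfs:label ?label } against the example data
labels = [{"x": ":x1", "label": "foo"}, {"x": ":x3", "label": "foo"}]

# Join with the empty pattern (a no-op), then the OPTIONAL.
query1 = leftjoin(join([{}], data), labels)
print(query1)  # [{'x': ':x1', 'label': 'foo'}, {'x': ':x2'}]
```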

---- Query 2
# Intuitively, do some process, restrict output
# by joining with some fixed data.

PREFIX : <http://example/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
{
     OPTIONAL { ?x rdfs:label ?label }
     DATA ?x { :x1 :x2 }
}
---- Query 2

====>
---------------
| x   | label |
===============
| :x1 | "foo" |
---------------

The OPTIONAL finds :x1 and :x3; the inline data being joined in does not 
have :x3 in it, so only :x1 is in the results.

join ((?x=:x1), (?x=:x2)) with ((?x=:x1 ?label="foo"), (?x=:x3 ?label="foo"))

not leftjoin.
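
For Query 2 the same kind of sketch shows the restriction.  Again this 
assumes solution mappings modelled as dicts, with hand-written BGP matches 
and hypothetical helper names; it is only meant to make the join step 
concrete.

```python
def compatible(m1, m2):
    """Mappings agree on every variable they share."""
    return all(m2[v] == t for v, t in m1.items() if v in m2)

def join(left, right):
    """Keep only merged pairs of compatible mappings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

# OPTIONAL on the empty pattern yields the BGP matches for the example data.
optional_result = [{"x": ":x1", "label": "foo"}, {"x": ":x3", "label": "foo"}]
# DATA ?x { :x1 :x2 }
data = [{"x": ":x1"}, {"x": ":x2"}]

# Plain join: :x3 has no partner in the data, and :x2 has no partner
# in the OPTIONAL result, so both drop out.
query2 = join(optional_result, data)
print(query2)  # [{'x': ':x1', 'label': 'foo'}]
```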
Received on Saturday, 28 April 2012 10:46:59 UTC