Set semantics for the stem graph (was: Re: Relationship between EricP's default mapping and Datalog rules approach?) from Richard Cyganiak on 2010-07-21 (public-rdb2rdf-wg@w3.org from July 2010)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 21 Jul 2010 16:44:24 +0100
To: Eric Prud'hommeaux <eric@w3.org>
Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-Id: <FD6745C2-882F-4C83-BAD2-63AEEC603CFC@cyganiak.de>

Eric,

On 18 Jul 2010, at 22:31, Eric Prud'hommeaux wrote:
> I eventually picked set semantics because of the success of "Semantics
> and Complexity of SPARQL" Pérez, Arenas, and Gutierrez
>  http://arxiv.org/pdf/cs.DB/0605124
>
> This is a good opportunity for me to proof-read and provide an English
> reading, using the definitions in the Notation section:

Hm... If the goal is to be so formal and precise that even a machine  
can understand it, then using a well-defined machine-interpretable  
formalism is perhaps best. If the goal is merely to be so formal and  
precise that a human reader can understand the intended behaviour of  
an implementation, then much less formalism and notational precision  
is required, and bits of prose could be used to avoid a lot of  
notation. In other words, writing for a human audience is not the same  
as writing for a compiler. Your proposal reads as if it was written  
for a compiler.

I had a go at writing a definition of the direct mapping (stem graph  
only) that's supposed to be first of all human-readable. It's a sketch  
and doesn't handle FKs or any of the extensions, and glosses over  
datatypes and probably some other things.

I personally, as an implementer, would like to read something in this  
general style in the spec as the definition of the direct mapping for  
the stem graph. I would prefer this over reading your set semantics  
notation, and over reading Scala code, and over reading Datalog, and  
over reading RIF.

Best,
Richard

---------

A relational database DB is a mapping from relation names to relations.

Relations have attributes, a primary key, and tuples.
Let attrs(R) be the set of attributes in relation R.
Let pk(R) be the primary key of relation R. The primary key is a list  
of attributes.
Let tuples(R) be the set of tuples in relation R.

A tuple t is a mapping from attributes to attribute values.
Let domain(t) be its set of attributes.
All tuples t in relation R have domain(t)=attrs(R).
We write t(x) for the value of attribute x in tuple t.
Values have a datatype, dtype(value).
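
As a rough illustration only, the data model above could be written down in Python along these lines. The names Relation, Row, domain, dtype and is_well_formed are made up for this sketch, a list stands in for the set of tuples (Python dicts are not hashable), and the Python type name stands in for dtype(value).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set

# A tuple t is a mapping from attribute names to attribute values.
Row = Dict[str, Any]


@dataclass
class Relation:
    attrs: Set[str]    # attrs(R)
    pk: List[str]      # pk(R): an ordered list of attributes
    tuples: List[Row]  # tuples(R); a list stands in for the set


def domain(t: Row) -> Set[str]:
    """domain(t): the set of attributes of tuple t."""
    return set(t.keys())


def dtype(value: Any) -> str:
    """dtype(value); assumption: the Python type name plays the role of the SQL datatype."""
    return type(value).__name__


def is_well_formed(r: Relation) -> bool:
    """Check domain(t) = attrs(R) for every tuple t in R.

    Assumption not spelled out in the text: the pk attributes are in attrs(R).
    """
    return set(r.pk) <= r.attrs and all(domain(t) == r.attrs for t in r.tuples)


# A database DB is a mapping from relation names to relations.
people = Relation(
    attrs={"id", "name"},
    pk=["id"],
    tuples=[{"id": 7, "name": "Bob"}, {"id": 8, "name": "Sue"}],
)
db: Dict[str, Relation] = {"People": people}
```

Here t(x) is simply people.tuples[0]["name"], and is_well_formed(people) holds because every tuple has exactly the attributes id and name.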

Given a stem URI stemURI, we define:

stemGraph(DB) = union of all stemGraphR(relname, R) where R is a  
relation named relname in DB.

stemGraphR(relname, R) = union of all stemGraphT(relname, pk(R),  
tuple) where tuple is in tuples(R).

stemGraphT(relname, pk, tuple) = set of all RDF triples <s, p, o> where
      attr is an attribute in domain(tuple),
      s = tupleURI(relname, pk, tuple),
      p = attrURI(relname, attr),
      o = rdfValue(tuple(attr))

tupleURI(relname, pk, tuple) =
      concat(stemURI, '/', escape(relname), '/', escape(tuple(pk1)),  
'/', .., '/', escape(tuple(pkn)))
      where pk = <pk1..pkn>

attrURI(relname, attr) = concat(stemURI, '/', escape(relname), '.',  
escape(attr))

rdfValue(value) = an RDF literal of type xsd:this if dtype(value) is  
THIS
rdfValue(value) = an RDF literal of type xsd:that if dtype(value) is  
THAT
etc etc
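
To check that the definitions above hang together, here is a minimal Python sketch of them. Several things in it are assumptions and not part of the definitions: the stem URI http://example.com/db is made up, escape() is taken to be percent-encoding, relations are plain dicts with "pk" and "tuples" keys, an RDF literal is modelled as a (lexical form, datatype URI) pair, and the three-entry datatype table is only a placeholder for the "etc etc" above.

```python
from urllib.parse import quote

STEM_URI = "http://example.com/db"  # hypothetical stem URI
XSD = "http://www.w3.org/2001/XMLSchema#"

# Placeholder datatype table (assumption): Python type -> XSD datatype name.
XSD_TYPES = {int: "integer", float: "double", str: "string"}


def escape(value):
    # Assumption: escape() means percent-encoding everything that is not unreserved.
    return quote(str(value), safe="")


def tuple_uri(relname, pk, row):
    # stemURI / relname / pk1 value / ... / pkn value
    return "/".join([STEM_URI, escape(relname)] + [escape(row[a]) for a in pk])


def attr_uri(relname, attr):
    return STEM_URI + "/" + escape(relname) + "." + escape(attr)


def rdf_value(value):
    # An RDF literal, modelled as (lexical form, datatype URI).
    return (str(value), XSD + XSD_TYPES[type(value)])


def stem_graph_t(relname, pk, row):
    # One triple <s, p, o> per attribute in domain(tuple).
    s = tuple_uri(relname, pk, row)
    return {(s, attr_uri(relname, attr), rdf_value(row[attr])) for attr in row}


def stem_graph_r(relname, rel):
    # Union of stemGraphT over all tuples in the relation.
    graph = set()
    for row in rel["tuples"]:
        graph |= stem_graph_t(relname, rel["pk"], row)
    return graph


def stem_graph(db):
    # Union of stemGraphR over all relations in the database.
    graph = set()
    for relname, rel in db.items():
        graph |= stem_graph_r(relname, rel)
    return graph


db = {"People": {"pk": ["id"], "tuples": [{"id": 7, "name": "Bob"}]}}
graph = stem_graph(db)
```

With this input, graph holds two triples, both with subject http://example.com/db/People/7: one for People.id and one for People.name. Because the result is a Python set, duplicate triples collapse, which matches the set reading of "union" used above.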

Received on Wednesday, 21 July 2010 15:44:59 UTC