
Re: Another attempt...

From: Andrew Newman <andrewfnewman@gmail.com>
Date: Thu, 20 Mar 2008 07:04:30 +1000
Message-ID: <2db5a5c40803191404o51ea69acmbbe69d1c0008cde0@mail.gmail.com>
To: "Seaborne, Andy" <andy.seaborne@hp.com>
Cc: "Richard Newman" <rnewman@twinql.com>, "Lee Feigenbaum" <lee@thefigtrees.net>, "Arjohn Kampman" <arjohn.kampman@aduna-software.com>, "public-rdf-dawg-comments@w3.org" <public-rdf-dawg-comments@w3.org>

On 19/03/2008, Seaborne, Andy <andy.seaborne@hp.com> wrote:
>  > -----Original Message-----
>  > From: Andrew Newman [mailto:andrewfnewman@gmail.com]
>  > Sent: 18 March 2008 23:19
>  > To: Seaborne, Andy
>  > Cc: Richard Newman; Lee Feigenbaum; Arjohn Kampman; public-rdf-dawg-
>  > comments@w3.org
>  > Subject: Re: Another attempt...
> The Join identity is { { } }, write this as 1
>  The Union identity is { }, write this as 0
>  The analogy with the integers is intended.
>  A Join 1 = A
>  A Union 0 = A
>  A Union 1 != A
>  In the same way for integers:
>  X * 1 = X
>  X + 0 = X
>  X + 1 != X

Integers don't have the same algebraic properties as boolean
algebra/set algebra/bags/relations - so I think they make a very poor
basis for SPARQL.  I assume that SPARQL is based on this because it
uses sets, bags and a boolean algebra - if it's not then the SPARQL
document has to be much bigger.

You're saying that SPARQL is basing some of its behaviour on the
algebra of integers.  Why not, instead, be consistent and use one
basis for all of SPARQL's behaviour?

You can derive p OR T = T, or A UNION 1 = 1 (the domination law), from
the base axioms of boolean algebra.  Here's one proof:
p OR T = (p OR T)  AND T (identity)
= (p OR T) AND (p OR NOT p) (inverse)
= p OR (T AND NOT p) (distributive)
= p OR NOT p (identity)
= T (inverse).

Hopefully, it's simple enough to see that integers do not behave the
same way boolean values do - you can represent true with 1 and false
with 0 but they're boolean values (binary digits) not integers.
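The proof above can be checked mechanically. What follows is only a minimal sketch in Python: the function names are made up for illustration, and the integer test simply asks whether x + 1 = 1, the integer reading of A UNION 1 = 1.

```python
BOOLS = (False, True)

def proof_steps(p):
    """Evaluate each line of the proof of p OR T = T for one value of p."""
    T = True
    return [
        p or T,                        # left-hand side
        (p or T) and T,                # identity
        (p or T) and (p or (not p)),   # inverse
        p or (T and (not p)),          # distributive
        p or (not p),                  # identity
        T,                             # inverse
    ]

def domination_holds():
    """True if every proof line has the same value, T, for every boolean p."""
    return all(set(proof_steps(p)) == {True} for p in BOOLS)

def integer_counterexamples(limit):
    """Integers x in range(limit) for which x + 1 = 1 fails."""
    return [x for x in range(limit) if x + 1 != 1]

print(domination_holds())
print(integer_counterexamples(5))
```

Every line of the proof agrees for both boolean values, while the integer version fails for every x other than 0.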

>  Your definition of U given was:
>  """
>  U - universal relation - a relation with no attributes but contains
>  all possible tuples of applicable type.
>  """
>  And you gave some characteristics:
>  A * U = A
>  A + U = U
>  For the integers under * and +, you'd have to extend to include at least one infinity to find a U such that:
>  X + U = U
>  But then
>  X * U = U
>  And it is not the identity of * or +.
>  You want some element that has the characteristic that:
>  For all A. A Join U = U
>  So wouldn't that be the multiset of all possible combinations of variables and values, with a cardinality function of infinity.  (Technically, an extension of SPARQL because the cardinality function of multiset has been changed).
>  With that U
>   A union U = U
>  as well.
>  The Join identity is not the universal relation.

So to explain the universal bag I'll explain what the universal
relation is and then show what's different in the universal bag
(that's about as complete as my understanding is at the moment).

A has a number of attributes a1...an
U has no attributes.
A has a number of tuples a1 = A1, a2 = A2...an = AN.
U has the nullary set of tuples (trivially true) - a cardinality of 1.

In a relational join you take the set union of the attributes, and
where a tuple in A matches a tuple in U (which always happens, because
the nullary tuple is trivially true) you add that tuple to the result set.

The only change for bags, AFAICT, is that the universal bag would also
return true for all multiplicity values.
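As a rough sketch of the join just described, here it is in Python. The representation is an assumption made for illustration: a tuple is a frozenset of (attribute, value) pairs and a relation is a set of such tuples. U is written as DEE, the relation with no attributes and one nullary tuple. The bag version assumes multiplicities multiply under join and gives the nullary tuple a multiplicity of 1; that is one possible reading of the universal bag, not a settled definition.

```python
from collections import Counter

def tup(**bindings):
    """Build a tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(bindings.items())

def compatible(t1, t2):
    """Two tuples are compatible if they agree on every shared attribute."""
    d2 = dict(t2)
    return all(d2[k] == v for k, v in t1 if k in d2)

def join(r1, r2):
    """Natural join of two relations (sets of tuples)."""
    return {t1 | t2 for t1 in r1 for t2 in r2 if compatible(t1, t2)}

def bag_join(b1, b2):
    """Join of two bags (tuple -> multiplicity); multiplicities multiply."""
    out = Counter()
    for t1, n1 in b1.items():
        for t2, n2 in b2.items():
            if compatible(t1, t2):
                out[t1 | t2] += n1 * n2
    return out

DEE = {frozenset()}   # no attributes, one nullary tuple
DUM = set()           # no attributes, no tuples

A = {tup(a1=1, a2="x"), tup(a1=2, a2="y")}
BAG_A = Counter({tup(a1=1): 2, tup(a1=2): 1})
BAG_DEE = Counter({frozenset(): 1})

print(join(A, DEE) == A)                  # A * U = A
print(join(A, DUM) == set())              # joining with the empty relation
print(bag_join(BAG_A, BAG_DEE) == BAG_A)  # same identity for bags
```

The nullary tuple has no attributes to disagree on, so it is compatible with every tuple of A and the join returns A unchanged.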

The best explanation I've seen for the size or cardinality of the
universal and empty sets is in one of Date's books, in the discussion
of DEE and DUM.  The reasoning was along the lines of: how many
relations are there with no attributes?  Two.  For SPARQL the
equivalent is how many sets of tuples there are without variable
bindings.  There is the relation with no tuples and the relation with
1 tuple (the nullary tuple) - in SPARQL, the empty set and the set
containing only the empty tuple.  He goes on to ask what the
properties of these relations are and why they are important - they're
important because they are identities.  In his latest book, about Logic
and Databases, he has the same identities for bags and for relations,
based along similar lines.
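The counting argument can be sketched in a few lines of Python. This assumes relations are modelled as frozensets of tuples and that the only possible tuple over no attributes is the nullary one; the helper names are hypothetical.

```python
from itertools import chain, combinations

NULLARY_TUPLE = frozenset()   # the one tuple that binds no attributes

def relations_over(possible_tuples):
    """Every relation (subset) that can be formed from the possible tuples."""
    items = list(possible_tuples)
    subsets = chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

# Subsets come out smallest first: the empty relation, then the full one.
NULLARY_RELATIONS = relations_over([NULLARY_TUPLE])
DUM, DEE = NULLARY_RELATIONS

def union(r1, r2):
    return r1 | r2

def join(r1, r2):
    # With no attributes, every pair of tuples is trivially compatible.
    return frozenset(t1 | t2 for t1 in r1 for t2 in r2)

def behaves_like_booleans():
    """Check union acts as OR and join as AND, reading DEE as T and DUM as F."""
    return all(
        bool(union(a, b)) == (bool(a) or bool(b))
        and bool(join(a, b)) == (bool(a) and bool(b))
        for a in NULLARY_RELATIONS
        for b in NULLARY_RELATIONS
    )

print(len(NULLARY_RELATIONS))
print(behaves_like_booleans())
```

Under these assumptions there are exactly two relations with no attributes, and over them union and join follow the boolean truth tables, including DEE UNION x = DEE.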

So it would seem that DEE and DUM, being based on sets and boolean
algebra, are much more suitable as identities than the integers 1
and 0.
Received on Wednesday, 19 March 2008 21:05:07 UTC
