RE: Reification of Sets (of RDF Statement, for Queries) from Peter F. Patel-Schneider on 2001-04-10 (www-rdf-interest@w3.org from April 2001)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Tue, 10 Apr 2001 12:12:22 -0400
To: danny@panlanka.net
Cc: sandro@w3.org, www-rdf-interest@w3.org
Message-Id: <20010410121222M.pfps@research.bell-labs.com>

From: "Danny Ayers" <danny@panlanka.net>
Subject: RE: Reification of Sets (of RDF Statement, for Queries)
Date: Tue, 10 Apr 2001 11:22:45 +0600

> 
> <- Issue 1: RDF M&S Does Not Provide Sets
> <-
> <- The argument against them I've heard is "we don't have an enforcement
> <- mechanism" (for duplicates, so use bags) or "we have to provide them
> <- in some order" (so use lists).
> <-
> <- I think those arguments against defining a vocabulary for
> <- communicating information about set membership are, to put it mildly,
> <- weak.
> <-
> <- On the first point, you don't need to provide an enforcement
> <- mechanism.  If someone says "X contains 3" and then "X contains 3"
> <- again, well, you know "X contains 3".  No problem.
> 
> IANAL, but don't we still need to be able to tell the difference between
> e.g. {1, 2, 3, 4} and {1, 3, 2, 3, 3, 3, 4, 3}?

There is no difference.  That is the whole point of sets: {3,4}, {4,3}, and
{3,4,4} are all precisely the same.  There is no way of distinguishing
between them.  If there were a way, then they would not be (standard) sets.
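
As a rough illustration only (Python's built-in set type stands in for
mathematical sets here; nothing in RDF M&S defines this), the two
enumerations from the question denote the same set:

```python
# Two enumerations of the same set; order and repetition are not part of the value.
a = {1, 2, 3, 4}
b = {1, 3, 2, 3, 3, 3, 4, 3}

# Equality is decided by membership alone.
same = (a == b)

# The repeated 3s leave no trace in the resulting set.
size_of_b = len(b)

# The three spellings from the paragraph above are likewise one set.
all_equal = ({3, 4} == {4, 3}) and ({4, 3} == {3, 4, 4})
```

No operation on a or b can recover which enumeration was used to write
it down.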

The situation should be exactly the same in RDF for bags.  The bag with
elements 3,4 and the bag with elements 4,3 should be exactly the same.  If
there were a way in RDF to distinguish between these two things, then
they would not be bags.  (Yes, I know that you can distinguish---thus RDF
does not really have bags.)
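
A small sketch of the mismatch, under stated assumptions: the tuples
below are a made-up stand-in for RDF statements (the container name
"b1" and the helpers as_bag and as_written are hypothetical), using the
rdf:_1, rdf:_2 membership properties that M&S containers are written
with.  Read as bags, the two descriptions agree; read as the statements
actually written, they do not.

```python
from collections import Counter

# Hypothetical (container, membership property, value) statements.
bag_a = [("b1", "rdf:_1", 3), ("b1", "rdf:_2", 4)]
bag_b = [("b1", "rdf:_1", 4), ("b1", "rdf:_2", 3)]

def as_bag(statements):
    """Bag reading: keep values and their multiplicities, drop the ordinals."""
    return Counter(value for _, _, value in statements)

def as_written(statements):
    """Literal reading: keep which ordinal property carries which value."""
    return {(prop, value) for _, prop, value in statements}

same_as_bags = as_bag(bag_a) == as_bag(bag_b)
same_as_written = as_written(bag_a) == as_written(bag_b)

# Unlike sets, bags do distinguish multiplicity.
multiplicity_matters = Counter([3, 4, 4]) != Counter([3, 4])
```

Here same_as_bags holds while same_as_written does not, which is the
sense in which the ordinals make the two "bags" distinguishable.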

> <- On the second: it doesn't matter if you have extraneous data.  In set
> <- theory, people say "x={3,4}" and they know it's the same as "x={4,3}".
> <- Yes, syntactically the elements appear as a list, but the set whose
> <- elements were enumerated by the list is the same no matter what order
> <- of enumeration is used.  The extraneous ordering information is simply
> <- ignored.
> 
> Sounds reasonable to me, but one of the popular arguments about 'why RDF is
> broke' is that without core standardisation (the discussion has been around
> representing negation, quantification etc, but I think I'm ok to generalise)
> then different systems will interpret data in different ways. Which is it to
> be - extraneous information considered safe? or considered harmful?

The problem is that there is a mismatch between the description (set or
bag) and the reality (ordered collections).  Some will argue that the order
must always be ignored, some will argue that the order can be used.  

Peter F. Patel-Schneider
Bell Labs Research

Received on Tuesday, 10 April 2001 12:14:06 UTC