RE: Reification of Sets (of RDF Statement, for Queries) from Danny Ayers on 2001-04-10 (www-rdf-interest@w3.org from April 2001)

From: Danny Ayers <danny@panlanka.net>
Date: Tue, 10 Apr 2001 11:22:45 +0600
To: "Sandro Hawke" <sandro@w3.org>, <www-rdf-interest@w3.org>
Message-ID: <EBEPLGMHCDOJJJPCFHEFIEEPDBAA.danny@panlanka.net>
<- Issue 1: RDF M&S Does Not Provide Sets
<-
<- The argument against them I've heard is "we don't have an enforcement
<- mechanism" (for duplicates, so use bags) or "we have to provide them
<- in some order" (so use lists).
<-
<- I think those arguments against defining a vocabulary for
<- communicating information about set membership are, to put it mildly,
<- weak.
<-
<- On the first point, you don't need to provide an enforcement
<- mechanism.  If someone says "X contains 3" and then "X contains 3"
<- again, well, you know "X contains 3".  No problem.

IANAL, but don't we still need to be able to tell the difference between
e.g. {1, 2, 3, 4} and {1, 3, 2, 3, 3, 3, 4, 3}?
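
A minimal Python sketch of that distinction, assuming we read the same two enumerations as a set, as a bag and as a list. The variable names are illustrative only and are not part of any RDF vocabulary.

```python
from collections import Counter

a = [1, 2, 3, 4]
b = [1, 3, 2, 3, 3, 3, 4, 3]

# Read as sets: duplicates and order are ignored, so the two are equal.
same_as_sets = set(a) == set(b)

# Read as bags (multisets): the repeated 3s count, so the two differ.
same_as_bags = Counter(a) == Counter(b)

# Read as lists: order and repetition both matter, so the two differ.
same_as_lists = a == b
```

Whether the two enumerations "are the same" therefore depends on which kind of collection the receiver is told to assume.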

<- On the second: it doesn't matter if you have extraneous data.  In set
<- theory, people say "x={3,4}" and they know it's the same as "x={4,3}".
<- Yes, syntactically the elements appear as a list, but the set whose
<- elements were enumerated by the list is the same no matter what order
<- of enumeration is used.  The extraneous ordering information is simply
<- ignored.

Sounds reasonable to me, but one of the popular arguments about 'why RDF is
broke' is that without core standardisation (the discussion has been around
representing negation, quantification etc, but I think I'm ok to generalise)
then different systems will interpret data in different ways. Which is it to
be - extraneous information considered safe? or considered harmful?


<- Issue 2: Completeness of Knowledge  ("Closed" collections)
<-
<- There is a significant difference between "The set X contains the
<- numbers 3 and 4" and "The set X contains *only* the numbers 3 and 4."
<- Given only the information in the first form, you cannot answer
<- whether 5 is in X.
<-
<- RDF collections at present only provide incomplete knowledge, so
<- people have to frame their queries as being about a different set.
<- "Is 5 in X?" cannot ever be answered negatively, so you have to ask
<- "Is 5 in the set of things you currently know to be in X?"  I think
<- this is broken.

I'm not sure about this - I can't see anything wrong with asking "Is 5 in
the set of things you currently know to be in X?", in fact it sounds
preferable to me - dealing with metadata (what you know about X) rather than data
(what is in X).
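
To make the two questions concrete, here is a rough Python sketch. `KnownSet`, its `closed` flag and the method names are hypothetical, invented for illustration; nothing like them is defined in RDF M&S.

```python
from typing import Optional


class KnownSet:
    """Hypothetical record of what we have been told about a set X."""

    def __init__(self, known_members, closed=False):
        self.known = set(known_members)
        # closed=True models "X contains *only* these members".
        self.closed = closed

    def known_to_contain(self, item) -> bool:
        """The metadata question: is item among the things we know are in X?"""
        return item in self.known

    def contains(self, item) -> Optional[bool]:
        """The data question: is item in X? None means "cannot say"."""
        if item in self.known:
            return True
        return False if self.closed else None


open_x = KnownSet([3, 4])                 # "X contains 3 and 4"
closed_x = KnownSet([3, 4], closed=True)  # "X contains only 3 and 4"
```

Under these assumptions `open_x.contains(5)` gives no answer, `closed_x.contains(5)` is a definite no, and `open_x.known_to_contain(5)` is always answerable because it only asks about what has been asserted.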


<-    4) LISP-style lists: predicates First and Rest, object TheEmptyList.
<-
<-        - a little complicated

There is definitely elegance to LISP-style lists, though personally I can't
judge how appropriate they would be in this context - the RDF model is
pretty much object-orientated (maybe transformed a little) - how well does
the mix of lists & objects work in e.g. CLOS?

I certainly don't think this approach would be any more complicated than
using _1, _2 etc - it may just be the syntax, but that looks incredibly
ugly to me.
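
For comparison, a small Python sketch of the two encodings as plain (subject, predicate, object) tuples. The predicate names First, Rest and TheEmptyList come from the quoted proposal; the `cellN` node names and the helper functions are made up for illustration.

```python
FIRST, REST, NIL = "First", "Rest", "TheEmptyList"


def list_to_triples(items, prefix="cell"):
    """Encode items LISP-style; return (head node, list of triples)."""
    triples = []
    head = NIL
    # Build from the back so each cell can point at the rest of the list.
    for i in range(len(items) - 1, -1, -1):
        node = f"{prefix}{i}"
        triples.append((node, FIRST, items[i]))
        triples.append((node, REST, head))
        head = node
    return head, triples


def triples_to_list(head, triples):
    """Walk First/Rest from head until TheEmptyList is reached."""
    lookup = {(s, p): o for s, p, o in triples}
    items = []
    node = head
    while node != NIL:
        items.append(lookup[(node, FIRST)])
        node = lookup[(node, REST)]
    return items


def container_triples(subject, items):
    """Encode items with numbered membership predicates _1, _2, ..."""
    return [(subject, f"_{n}", item) for n, item in enumerate(items, start=1)]
```

The First/Rest form costs two triples per element, but reaching TheEmptyList marks the end of the list explicitly, whereas the _1, _2 form gives no sign of whether a _3 exists somewhere else.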

<-    5) ...?  anything else?
<-
<- It's tempting to think in terms of the syntax where it looks like the
<- list is complete:
<-
<- <list id=foo>
<-   <li>a
<-   <li>b
<- </list>
<-
<- but in the abstract syntax (think of the graph) that closing
<- information is lost as the RDF parsers seem to handle it.  (As is the
<- ordering, if you don't turn the "li" predicate into "_1"...)

The parser looks like a red herring there - we know that

foo contains a
foo contains b

though we have the same problem you discussed earlier when asking whether foo
contains c.

If we get our information from the infoset then we have the order; if we get
it from the XML syntax we don't.

<- For making Sets from Lists:
<-
<-    1)  an Enumeration predicate, relating a set to a list which
<-        contains all the same elements at least once
<-
<-    2)  an ElementSet predicate, the inverse of Enumeration
<-
<-    3)  ...?  anything else?

I don't think this kind of thing is such a problem - if it can be expressed
in one form, then the others can be derived (I'm thinking like enumeration
<-> iteration <-> recursion).
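
As a rough illustration of deriving one form from another, the Python sketch below turns a First/Rest style list into its element set twice, once by iteration and once by recursion. The cons-cell representation as nested tuples and all function names are assumptions made for this example.

```python
EMPTY = None  # stands in for TheEmptyList


def cons(first, rest):
    """A list cell: (first element, rest of the list)."""
    return (first, rest)


def element_set_iterative(cell):
    """Collect the distinct elements by walking the cells in a loop."""
    members = set()
    while cell is not EMPTY:
        first, cell = cell
        members.add(first)
    return members


def element_set_recursive(cell):
    """Collect the distinct elements by recursing on the rest of the list."""
    if cell is EMPTY:
        return frozenset()
    first, rest = cell
    return frozenset({first}) | element_set_recursive(rest)


def is_enumeration_of(cell, target_set):
    """True if the list contains exactly the set's elements, each at least once."""
    return element_set_iterative(cell) == set(target_set)


lst = cons(3, cons(4, cons(3, EMPTY)))  # the list (3 4 3)
```

Both functions give the set {3, 4} for the list (3 4 3), and `is_enumeration_of` plays the role of the proposed Enumeration relation between a set and a list.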

<- To bring this together in an example, I'm trying to represent RDF
<- queries in RDF (ie to reify them).  I think the right approach looks
<- in n3 like:
<-    :myQuery q:statements { ... bunch of statements ... };
<-             q:variables ( ... list of terms in the statements
<- 	                      which are variables ...).
<- but the conversion of the "bunch of statements" and "list of terms"
<- into proper RDF Sentences is subject to a resolution to the issues
<- raised here.  My current vote is for (4) lisp-lists and (1) a set
<- enumeration predicate.   If anyone has any objection to these, I'd be
<- interested in hearing it.

Personally I'd give a vote for these, but the state of my chad would depend
on hearing from someone who has spent a lot of time juggling together lists
and objects.

-99
Received on Tuesday, 10 April 2001 01:26:17 UTC