- From: Graham Klyne <GK@ninebynine.org>
- Date: Wed, 13 Aug 2003 10:14:57 +0100
- To: Garret Wilson <garret@globalmentor.com>, www-rdf-interest@w3.org
One of the reasons for adding parsetype=collection was precisely that some people needed a way to express a "closed" list; i.e. one which could not be made into another (well-formed) list simply by the addition of extra triples to a graph.

So in many respects I would see that which you perceive as "problems" to be "features".

#g
--

At 17:09 12/08/03 -0700, Garret Wilson wrote:
>Everyone,
>
>I'm just getting around to implementing support for
>rdf:parseType="Collection", even though I've been using it for a while in
>specifications I've written.
>
>The design of rdf:List looks good in theory, but there are a few details
>that make it a pain to implement---particularly, the way rdf#nil is used:
>
>1. Empty lists cannot be modified.
>
>In the RDF world, it's natural to think that one can always add things to
>a resource. I can specify another dc:creator to a book. I can later add an
>rdfs:label to a resource. Yes, I know that RDF is designed to only be
>aware of one set of static relationships, but in the real world we need to
>modify resources at some time or another. We can modify rdf:Alt, rdf:Bag,
>and rdf:Seq, even if they are empty---we just add one or more resources to
>the collection.
>
>rdf:List is designed so that one cannot add anything to an empty rdf:List,
>because there is only one empty rdf:List, the one named rdf#nil. There are
>an infinite number of non-empty lists, with an infinite number of (usually
>anonymous/blank node) reference URIs, but *none* of those lists can be
>empty, because the empty list has its own static universal reference URI.
>Conversely, the empty list can never be added to without changing its
>reference URI. Put differently, an empty rdf:List cannot be filled---it
>can only be replaced with a filled list. (Compare this to an rdf:List with
>one element---programmatically, one can add more elements by changing the
>rdf:rest property, allowing the list to remain the same entity, identified
>by the same reference URI.
>This is impossible with an empty list.)
>
>2. It is a pain to populate an rdf:List.
>
>Even if we accept the theoretical notion of RDF as a static snapshot of
>relationships, in the real world one has to populate that directed graph
>programmatically---when parsing an RDF+XML document, for instance. With
>the old containers, that was easy: we start with an empty rdf:Alt,
>rdf:Bag, or rdf:Seq and then add elements if and only if they are present.
>
>With rdf:List, this procedure remains the same *only* after we know we
>have one element in a list. Until we have one element in the list, we
>don't know whether to create an anonymous rdf:List and populate it with
>items, or (if there are no items) to create an rdf#nil list (with its
>unique reference URI). This results in very inelegant algorithms:
>
>while(there are child elements)
>{
>   create a new rdf:List
>   if(we've already created an rdf:List)
>   {
>     add the new rdf:List to the old one
>   }
>   else
>   {
>     specify that the new list is the "root" list
>   }
>   save this new list for next time
>}
>if(we have record of finding a "root" list)
>   use the "root" list as the property value
>
>In contrast, the old containers allowed very elegant implementations,
>because they didn't distinguish conceptually between empty and filled
>containers:
>
>create new container
>while(there are child elements)
>{
>   add the element to the container
>}
>use the new container as the property value
>
>3. It is impossible to independently insert an element at the beginning of
>an rdf:List.
>
>In object-oriented programming, I'd like to have an object represent an
>rdf:List. In Java, something implementing java.util.List would be great.
>Given any MyList, it's a simple matter to insert something at index i+1
>with i>0: I just create a new rdf:List with rdf:first representing the
>inserted resource, change rdf:List(i).rdf:rest to point to the new
>rdf:List, and change rdf:List(i+1).rdf:rest to the old value of
>rdf:List(i).rdf:rest.
>
>That's all fine except when i=0. To insert at the front of the list, I
>have to know for which resource property the rdf:List is the property
>value. This is made worse by the fact that several properties (of several
>resources) might have the rdf:List as a property value. This leads to the
>following inconsistency: if resources example.com#book1,
>example.com#book2, and example.com#book3 all have a property of
>listOfComments, I can always add another comment to the end of the list
>without modifying the property value for any of the books, but if I want
>to *insert* a comment at the front of the list, I have to modify the
>property value for each of the books.
>
>In very practical terms, that means if I have the function...
>
>add(RDFList list, RDFResource resourceElement, int index)
>
>...it will work for all values of index except index==0, unless I have
>access to the entire RDF data model, walk the graph, and find all
>resources which have properties for which the list is a property value.
>
>Similarly, going back to problem #1 (above), the function...
>
>add(RDFList list, RDFResource resourceElement, int index)
>
>...cannot work with empty lists!
>
>I understand that the old collection framework had shortcomings (those
>silly indexes, for one thing) and that the new rdf:List framework looks
>nice in a pretty static graph on paper. The specific way that rdf#nil is
>used as an empty list, however, creates very inelegant implementation
>restrictions. Surely rdf:List could be modified to be better than the old
>collections, yet also usable in real life.
>
>Garret

---------------------------------
Graham Klyne
<GK@NineByNine.net>
Nine by Nine
http://www.ninebynine.net/
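
For readers following the thread, here is a minimal sketch of the list structure under discussion. It is not taken from either message: the names build_list and prepend, the blank-node labels, and the use of plain tuples for triples are all assumptions made for illustration. It assumes every member is known before the list is built, which is the "closed" case described in the reply. Under that assumption, building from the tail forward needs no special case for the empty list, and inserting at the front yields a new head node that each referring property would have to be pointed at.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF + "first"
RDF_REST = RDF + "rest"
RDF_NIL = RDF + "nil"


def build_list(items, fresh=None):
    """Return (head, triples) for an rdf:List holding `items` in order.

    `fresh` is a hypothetical iterator of blank-node labels; triples are
    plain (subject, predicate, object) tuples, not any library's type.
    """
    fresh = fresh if fresh is not None else (f"_:b{n}" for n in count())
    head = RDF_NIL  # with no items, the list is rdf:nil itself
    triples = []
    # Build back to front so each node's rdf:rest is already known.
    for item in reversed(list(items)):
        node = next(fresh)
        triples.append((node, RDF_FIRST, item))
        triples.append((node, RDF_REST, head))
        head = node
    return head, triples


def prepend(head, item, label):
    """Return (new_head, triples) for a list with `item` in front of `head`.

    The existing nodes are untouched; anything that referred to the old
    head still sees the old list unless it is repointed to `new_head`.
    """
    return label, [(label, RDF_FIRST, item), (label, RDF_REST, head)]
```

Calling build_list([]) returns rdf:nil and no triples, while prepend never alters the old head, which is the index==0 situation raised in point 3 of the quoted message.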
Received on Wednesday, 13 August 2003 05:27:56 UTC