Re: Clarifications needed for the Collection construct (with CR)

>Resending my mail, since CR got lost in the previous mail:-(

Quick response below from Pat Hayes. Others may follow later.

>
>Dear all,
>I see certain things that need to be clarified related to the collection
>construct.
>Sorry for the long mail but I wanted to point it out in detail.
>Thanks for your feedback!
>
>In the syntax specification from 8th November 2002 [1] collections were
>introduced into the RDF syntax. To create a collection the following new
>terms are added to the RDF namespace: rdf:parseType="Collection",
>rdf:nil, rdf:rest, rdf:first and rdf:List. The collection itself, when
>generated with the rdf:parseType="Collection" attribute-value pair, is
>constructed with blank nodes of the type rdf:List, which is an rdfs:Class.
>The blank nodes always have a link to the current element of the list
>connected by the property rdf:first, and a link to the rest of the list
>connected by the property rdf:rest. The end of the list is denoted by
>rdf:nil, which is an instance of the class rdf:List, so rdf:nil itself is
>a list.
>
>The default way of generating a collection in RDF is to use the
>attribute-value pair rdf:parseType="Collection". But someone could write
>their own constructs.

Not sure what you mean. There are other ways than the use of XML to 
construct an RDF graph, of course.

>As you can read in [2] (chapter 3.2.3) there are currently no
>constraints on collections. Multiple or no rdf:rest or rdf:first
>statements are allowed, which means the following set of triples would
>also be valid:

It would not be syntactically illegal as an RDF graph, but other 
software might complain about it.

>genID:1  rdf:type  rdf:List .
>genID:1  rdf:first  ex:aaa .
>genID:1  rdf:first  ex:bbb .
>genID:1  rdf:rest  ex:ccc .
>genID:1  rdf:rest  genID:2 .
>genID:2  rdf:type  rdf:List .
>genID:1  rdf:rest  rdf:nil .
>
>The question that arises is: does it make any sense?

Yes, it does. If one were to assume (as for example OWL does) that 
rdf:first and rdf:rest were functional (unique-valued) properties, 
then this graph would entail that ex:aaa = ex:bbb and that 
ex:ccc = genID:2 = rdf:nil (in OWL, this could be expressed by 
owl:sameIndividualAs).

Since neither equality nor functionality can be expressed in RDFS, 
this constraint doesn't amount to much in the RDF model theory; but 
as the spec points out, a semantic extension  (like OWL) may impose 
further conditions on the RDF collection vocabulary.
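
To make the entailment concrete, here is a minimal Python sketch. It 
assumes that both rdf:first and rdf:rest are treated as functional, 
which RDF itself does not say, and it holds triples as plain string 
tuples rather than using any RDF library. The name implied_equalities 
is made up for this illustration.

```python
from collections import defaultdict

# The triples from the example above, as (subject, predicate, object) tuples.
TRIPLES = [
    ("genID:1", "rdf:type", "rdf:List"),
    ("genID:1", "rdf:first", "ex:aaa"),
    ("genID:1", "rdf:first", "ex:bbb"),
    ("genID:1", "rdf:rest", "ex:ccc"),
    ("genID:1", "rdf:rest", "genID:2"),
    ("genID:2", "rdf:type", "rdf:List"),
    ("genID:1", "rdf:rest", "rdf:nil"),
]


def implied_equalities(triples, functional=("rdf:first", "rdf:rest")):
    """Group names that must co-denote if the given properties are functional.

    Assumption of this sketch: a functional property has at most one value
    per subject, so all objects seen for one (subject, property) pair must
    denote the same thing.
    """
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    # Only pairs with more than one distinct name force an equality.
    return [sorted(objs) for objs in values.values() if len(objs) > 1]


if __name__ == "__main__":
    for group in implied_equalities(TRIPLES):
        print(" = ".join(group))
```

For the graph above this yields two groups: ex:aaa with ex:bbb (from 
rdf:first), and ex:ccc with genID:2 and rdf:nil (from rdf:rest). 
Nothing here is an RDF entailment; the equalities only follow under 
the added functionality assumption.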

>What would it mean to
>have a collection element with different values?

They might not be different, see above. The use of different names 
does not entail that the values are different. This is one reason why 
there is little point in imposing 'wellformedness' conditions in RDF 
collections either on the syntax (they would be too strong, or else 
too complicated to be useful) or on the semantics (they would have no 
effect in RDF since they would have no expressible entailments.)

>Would it not make more
>sense to enter a rdf:Bag instead? But there is also another question: Do we
>need the collection construct at all?

It was specifically requested by the Webont working group, as a 
necessary requirement for OWL. So the answer is yes.

>Before there had been three kinds of
>containers, rdf:Bag, rdf:Seq and rdf:Alt.
>There are some differences between containers and a collection. A
>container in RDF is one resource containing all its members. The collection
>is different: there are many resources linked with each other. These
>resources are linked with their value(s) and the end of the collection is
>denoted by the empty list as the object for the rdf:rest property. Now here
>comes the main aim of this new construct: it defines a fixed finite list of
>items with a given length, terminated by rdf:nil; at least this is what we
>can read in [4] section 4.2.
>Is the goal reached? There is no restriction on the structure of lists in RDF.
>As shown, there can be more than one rdf:rest, more than one rdf:first, and
>even the existence of rdf:nil as the terminating object is nowhere enforced.

But how could it be forced? RDF graphs cannot have global conditions 
imposed on them by the spec, since they may be formed in real time, 
by rather dumb software which simply collects triples from other 
places and mixes them together. RDF does not undertake to impose any 
global syntactic wellformedness conditions on graphs: the 'largest' 
syntactic unit in RDF is the triple, and a graph is simply a set of 
triples. The intention of the 'list' vocabulary however is that *if* 
the lists are 'well-formed' *then* they denote an appropriate 
sequence of items.
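
The "if well-formed then a sequence" reading can be sketched in 
Python as follows. This is only an illustration under my own 
assumptions: triples are plain string tuples, "well-formed" is taken 
to mean exactly one rdf:first and one rdf:rest per node with the chain 
ending in rdf:nil, and list_items is a hypothetical helper name.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

WELL_FORMED = [
    ("ex:l1", RDF_FIRST, "ex:apple"),
    ("ex:l1", RDF_REST, "ex:l2"),
    ("ex:l2", RDF_FIRST, "ex:pear"),
    ("ex:l2", RDF_REST, RDF_NIL),
]

ILL_FORMED = [
    ("genID:1", RDF_FIRST, "ex:aaa"),
    ("genID:1", RDF_FIRST, "ex:bbb"),
    ("genID:1", RDF_REST, "ex:ccc"),
    ("genID:1", RDF_REST, "genID:2"),
    ("genID:1", RDF_REST, RDF_NIL),
]


def list_items(triples, head):
    """Return the item sequence for the list at `head`, or None.

    None means the structure is not well-formed in the sense assumed
    here; the graph is still a legal RDF graph.
    """
    firsts, rests = {}, {}
    for s, p, o in triples:
        if p == RDF_FIRST:
            firsts.setdefault(s, set()).add(o)
        elif p == RDF_REST:
            rests.setdefault(s, set()).add(o)

    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            return None  # the rdf:rest chain loops back on itself
        seen.add(node)
        first = firsts.get(node, set())
        rest = rests.get(node, set())
        if len(first) != 1 or len(rest) != 1:
            return None  # missing or multiple rdf:first / rdf:rest
        items.append(next(iter(first)))
        node = next(iter(rest))
    return items


if __name__ == "__main__":
    print(list_items(WELL_FORMED, "ex:l1"))
    print(list_items(ILL_FORMED, "genID:1"))
```

The point of returning None rather than raising an error is that the 
ill-formed case is not a syntax error in RDF; it is simply a graph for 
which this intended reading does not apply.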

>By default the collection is constructed with blank nodes

No, there is no such default. RDF/XML parsers will do this, but that 
is an XML matter.

>but even this can
>be changed.
>
>Example: A collection with non-blank node.
><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:ex="http://example.org/stuff/1.0/">
>   <rdf:Description rdf:about="http://example.org/basket">
>     <ex:hasFruit rdf:resource="#myCollection"/>
>   </rdf:Description>
>   <rdf:List rdf:ID="myCollection">
>         <rdf:first rdf:resource="http://example.org/apple"/>
>         <rdf:rest rdf:parseType="Collection">
>            <rdf:Description rdf:about="http://example.org/pear"/>
>         </rdf:rest>
>   </rdf:List>
></rdf:RDF>
>
>This example should generate the following triples:
>
><http://example.org/basket>  ex:hasFruit  ns1:myCollection .
>ns1:myCollection  rdf:type  rdf:List .
>ns1:myCollection  rdf:first  <http://example.org/apple> .
>ns1:myCollection  rdf:rest  genID:1 .
>genID:1  rdf:type  rdf:List .
>genID:1  rdf:first  <http://example.org/pear> .
>genID:1  rdf:rest  rdf:nil .
>
>The effect is that, by introducing a non-blank node, someone could also
>add elements to the collection construct from outside. This means that,
>without any restrictions, this construct is not fixed!

Right, it is not. Nothing is 'fixed' in this sense in RDF. Bear in 
mind - it's a centrally important point - that the RDF/XML notation is 
*only* an XML serialization syntax for the RDF graph. Any extra 
structure you might feel is 'natural' in the XML (eg the assumption 
that the listed elements of a container are the full complement of 
the members) is not significant in the RDF if it is not made explicit 
in the RDF graph itself. The relatively 'tight' syntactic form of the 
XML is potentially misleading if this point is not kept in mind.

>What about other relevant RDF constructs? In [4] the following is stated:
>A limitation of the containers is that there is no way to close them, i.e.,
>to say, "these are all the members of the container". This is because, while
>one graph may describe some of the members, there is no way to exclude
>the possibility that there is another graph somewhere that describes
>additional members.
>But we can also use blank nodes to identify the rdf:Bag itself. Blank nodes
>cannot be referred to from outside and therefore no further member can be
>added.

That is true so long as one only uses a blank node to refer to the 
container. But it is legal, and often useful, to refer to a container 
with a uriref. And in any case, the syntactic limitation is not 
itself a semantic licence to conclude that there are no other items 
in the container. In general, any RDF graph can only be expected to 
be a partial description of the domain being described, and this 
applies to containers as well as everything else.

>It even needs fewer triples and the graph is easier to read. The
>example of the fruit basket could be written as:
>
>Example: The fruit basket using the bag construct.
><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:ex="http://example.org/stuff/1.0/">
>   <rdf:Description rdf:about="http://example.org/basket">
>     <ex:hasFruit>
>          <rdf:Bag>
>             <rdf:li rdf:resource="http://example.org/apple"/>
>             <rdf:li rdf:resource="http://example.org/pear"/>
>          </rdf:Bag>
>     </ex:hasFruit>
>   </rdf:Description>
></rdf:RDF>
>
><http://example.org/basket>  ex:hasFruit  genID:1 .
>genID:1  rdf:type  rdf:Bag .
>genID:1  rdf:_1  <http://example.org/apple> .
>genID:1  rdf:_2  <http://example.org/pear> .
>
>Without restrictions on the collection construct it is just a more complex
>way of expressing things we already could express before using containers.

No, it allows you to positively assert that the collection is bounded 
(by the use of rdf:nil), which is impossible with RDF containers.

>Possible restrictions can be:
>1. Each collection in RDF must have exactly one terminating rdf:nil element.
>2. Each collection element must have exactly one connection with the
>rdf:first property.
>3. Each collection element must have exactly one connection with the
>rdf:rest property.
>4. Collection elements in RDF have to be blank nodes.
>
>It might be too restrictive to have all these restrictions

It is too restrictive, in my view, to have any of them as a global 
wellformedness condition on RDF graphs: to do so would require all 
conforming RDF engines to check these conditions every time a graph 
merge is performed.
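
A small Python sketch of why the check would have to be repeated at 
every merge: two graphs can each satisfy a restriction such as 
"exactly one rdf:first and one rdf:rest per list node" while their 
union does not. The assumptions here are mine: graphs are sets of 
string tuples, the shared list node has a URI (so a plain set union 
is enough and no blank-node renaming is needed), and violations is a 
hypothetical helper.

```python
from collections import Counter


def violations(triples):
    """List nodes that break 'exactly one rdf:first and one rdf:rest'."""
    counts = Counter(
        (s, p) for s, p, o in triples if p in ("rdf:first", "rdf:rest")
    )
    nodes = {s for (s, p) in counts}
    return sorted(
        n for n in nodes
        if counts[(n, "rdf:first")] != 1 or counts[(n, "rdf:rest")] != 1
    )


# Two graphs that each describe a one-element list at the same URI.
GRAPH_A = {
    ("ex:myCollection", "rdf:first", "ex:apple"),
    ("ex:myCollection", "rdf:rest", "rdf:nil"),
}
GRAPH_B = {
    ("ex:myCollection", "rdf:first", "ex:pear"),
    ("ex:myCollection", "rdf:rest", "rdf:nil"),
}

# With no blank nodes involved, merging is just the union of the triples.
MERGED = GRAPH_A | GRAPH_B

if __name__ == "__main__":
    print(violations(GRAPH_A), violations(GRAPH_B), violations(MERGED))
```

Each input graph passes the check, yet the merged graph has two 
rdf:first values on ex:myCollection, so a conforming engine could not 
rely on its inputs having been checked earlier.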

>and there also might be further reasons for introducing the collection
>construct.

The chief reason is that it was formally requested by another WG, so 
I suggest you take up this matter with them 
(http://lists.w3.org/Archives/Public/public-webont-comments/)

>The main difference at the moment is that a container is one resource
>containing all values, while the collection consists of different linked
>resources containing the values. In [1] we can find in appendix A.3 that
>the collection construct was also introduced to support recursive
>processing in languages such as Prolog. There should not be a special
>construct for each programming language.
>
>Additional question:
>What would be the fixed length of a collection? (Number of nodes of type
>rdf:List that are linked (minus rdf:nil nodes), the number of rdf:first
>connections?

The intended meaning is that it would be the number of non-nil nodes 
of type rdf:List.
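
As a rough Python sketch of that reading: count the nodes typed 
rdf:List that are reachable from the head of the list by following 
rdf:rest, leaving out rdf:nil. The tuple representation and the name 
list_length are assumptions of this illustration, not anything 
defined by the specs.

```python
# Triples for a two-element list, as (subject, predicate, object) tuples.
FRUIT_LIST = [
    ("ns1:myCollection", "rdf:type", "rdf:List"),
    ("ns1:myCollection", "rdf:first", "ex:apple"),
    ("ns1:myCollection", "rdf:rest", "genID:1"),
    ("genID:1", "rdf:type", "rdf:List"),
    ("genID:1", "rdf:first", "ex:pear"),
    ("genID:1", "rdf:rest", "rdf:nil"),
]


def list_length(triples, head):
    """Count non-nil rdf:List nodes reachable from `head` via rdf:rest."""
    typed = {s for s, p, o in triples if p == "rdf:type" and o == "rdf:List"}
    rests = {}
    for s, p, o in triples:
        if p == "rdf:rest":
            rests.setdefault(s, set()).add(o)

    seen, stack = set(), [head]
    while stack:
        node = stack.pop()
        if node in seen or node == "rdf:nil":
            continue
        seen.add(node)
        # Follow every rdf:rest value, since nothing forbids several.
        stack.extend(rests.get(node, ()))
    return len(seen & typed)


if __name__ == "__main__":
    print(list_length(FRUIT_LIST, "ns1:myCollection"))
```

For a well-formed list this agrees with the number of rdf:first 
triples; for a graph with several rdf:rest values on one node the two 
counts can differ, which is why the intended meaning only applies in 
the well-formed case.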

>What about multi sets in collections?)

Not sure what you mean.

Thanks for your very thorough and detailed comments, by the way.

Best wishes

Pat Hayes


>
>Best Greetings,
>Karsten Tolle
>
>Reference
>[1]  RDF/XML Syntax Specification (Revised), 8 November 2002, online at:
>     http://www.w3.org/TR/2002/WD-rdf-syntax-grammar-20021108
>[2]  RDF Semantics, W3C Working Draft 23 January 2003, online at:
>     http://www.w3.org/TR/2003/WD-rdf-mt-20030123/
>[3]  RDF Vocabulary Description Language 1.0: RDF Schema, W3C Working
>     Draft 12 November 2002, online at:
>     http://www.w3.org/TR/2002/WD-rdf-schema-20021112/
>[4]  RDF Primer, W3C Working Draft 23 January 2003, online at:
>     http://www.w3.org/TR/2003/WD-rdf-primer-20030123/
>
>
>___________________________________
>Karsten Tolle


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam

Received on Friday, 21 February 2003 18:43:48 UTC