Re: Parsing and Containers

Brian,

I think your proposal is a fine attempt at making some sense of RDFM&S, and 
can work just fine when dealing with a single RDF document.

Unfortunately, I think the underlying container concepts in RDFM&S are 
broken, for reasons I've stated elsewhere.  (Specifically, the use of 
<rdf:li> syntax and ordinal property names:  multiple documents cannot be 
parsed separately and the resulting graphs combined into a single graph 
without ugly special processing.)  I think the mechanisms fail one of RDF's 
primary goals:  scalability in an open-world environment.
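
To make the merge problem concrete, here is a minimal sketch. It assumes triples are modelled as plain (subject, predicate, object) tuples; the example.org bag URI and the helper names merge and members are hypothetical, for illustration only. Two documents parsed separately each turn their first rdf:li into rdf:_1, and a naive union of the resulting graphs leaves two different members at the same ordinal.

```python
# Illustrative only: a graph is a set of (subject, predicate, object) tuples.
def merge(*graphs):
    """Combine graphs by plain set union, with no container-specific logic."""
    merged = set()
    for g in graphs:
        merged |= set(g)
    return merged


def members(graph, container):
    """Return {ordinal: [objects]} for the rdf:_n properties of one container."""
    found = {}
    for s, p, o in graph:
        if s == container and p.startswith("rdf:_") and p[5:].isdigit():
            found.setdefault(int(p[5:]), []).append(o)
    return found


# Two documents, parsed separately, each use rdf:li on the same bag,
# so each parser independently assigns rdf:_1.
doc_a = {("http://example.org/bag", "rdf:_1", "apple")}
doc_b = {("http://example.org/bag", "rdf:_1", "pear")}

by_ordinal = members(merge(doc_a, doc_b), "http://example.org/bag")
collisions = {n: sorted(v) for n, v in by_ordinal.items() if len(v) > 1}
print(collisions)  # {1: ['apple', 'pear']}
```

Avoiding the collision would need some renumbering step that knows about containers, which is the "ugly special processing" I mean.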

My suggestion would be more radical (heretical even):  strike the container 
concepts from a revised version of RDFM&S, and then define just the 
simplest possible RDF elements needed to support other parts of that 
spec.  My rationale is that basic RDF is sufficient to describe the 
container concepts without making special extensions to support them.

(Cf. <http://www.w3.org/DesignIssues/Principles.html>).

#g
--


At 12:03 PM 12/13/00 +0000, McBride, Brian wrote:
>A number of issues have arisen with the processing of
>containers by parsers and other RDF processors.  It
>would be a good thing if most (all) parsers handled
>them the same way.
>
>It is appropriate for the RDF Interest Group to discuss
>the interpretation of the current specification and to
>document any conclusions that are made.  While the
>Interest Group does not have a charter to revise the
>specification, the result of this discussion could be
>offered for use in an errata document and could
>be provided as input to a future W3C Working Group if
>one is chartered to update the specification.
>
>We would like to get a general consensus among the
>RDF Interest Group on how parsers should handle containers.
>To kick things off, we have written a strawman proposal.
>
>The following proposal represents the views of the authors
>and is not an endorsement by the W3C.  We invite comment.
>
>Brian McBride
>Dave Beckett
>
>----------------------------------------------------------------
>
>A Proposed Interpretation of RDF Containers
>===========================================
>
>Draft 1.0
>13th December 2000
>
>
>1. Issue Statement
>    ===============
>
>The RDF formal grammar defined in the Model and Syntax
>Specification [1] is ambiguous.  Containers such as
>rdf:Bag, rdf:Seq and rdf:Alt match the container
>productions 6.25 through 6.31, but also match the
>typedNode production (6.13).
>
>The container productions attempt to restrict what the
>language can express about containers, but the ambiguity
>in the syntax effectively circumvents those restrictions.
>
>It is not clear what parsers should do if they encounter
>an rdf:li element when processing productions other than
>the container specific productions of the grammar.
>
>Sub-classes (described by rdfs:subClassOf) of the
>container classes do not match the container specific
>productions in the formal grammar.  M&S states that these
>productions should be extended to include structures that
>are an rdfs:subClassOf rdfs:Container.  Processing this
>requires parsers to process the class structure resources
>in a document, which some do not do.  It also requires that
>the class structure must be included in any XML
>serialization.
>
>Some of these issues have been raised previously and are
>recorded in the RDF issues list [3] [4].
>
>We recognise that other issues about containers have been
>raised whose resolution requires changes to the
>specification.  We consider changes to the specification
>to be beyond the remit of this document, and thus we do
>not address them here.
>
>2. What Do the Specs Say and Not Say?
>    ==================================
>
>  o M&S permits the expression of arbitrary structures
>    involving containers, though not always conveniently
>    using the container productions.
>
>  o M&S says that the rdf:li mechanism is a convenience
>    element [5] to make it easier to write lists of elements
>    without individually numbering them.  It does NOT
>    say these only work when processing specific
>    container productions.
>
>  o M&S says that container members have properties
>    starting at rdf:_1 and running contiguously through to
>    rdf:_n where n is the number of elements in the container.
>    Whilst this is true of the abstract container, M&S
>    does NOT say that an implementation cannot represent
>    a partial model of a container, and thus might not
>    contain all the properties.
>
>   o M&S and RDF Schema do NOT say that the ordinal
>     properties rdf:_1, rdf:_2, ... can only be applied to
>     containers.
>
>3. Approach of this Proposal
>    =========================
>
>The proposal, in essence, is:
>
>   o Containers and their sub-classes match the typedNode
>     production (6.13).
>
>   o Parsers MUST transform rdf:li into an instance of an
>     ordinal property (rdf:_1, rdf:_2, ...) wherever it
>     is used in the formal grammar production propName
>     (6.14).
>
>   o Ordinal properties may be attached to any resource
>
>This proposal does not change the expressive power of the
>language.  Anything that can be expressed with this
>proposal could have previously been expressed.  Anything
>that could previously have been expressed can be
>expressed under this proposal.
>
>We believe that this proposal merely relaxes some
>constraints that some parsers have imposed.  Thus the
>triples generated by existing parsers from existing
>XML serializations of RDF are unlikely to change.
>
>To conform to this proposal, some parsers will have to
>change.  We believe that such changes are
>simplifications of the parser, as the grammar and
>processing become more regular.
>
>This proposal does not conform to the original intent
>of the authors of the M&S specification.  We believe,
>however, that this proposal is appropriate given
>implementation experience and new understanding not
>available to the original authors.
>
>4. Proposal
>    ========
>
>  1) Parsers need not implement the specific productions
>     6.25-6.31.  This has no effect on the language as
>     anything that matches these productions also matches
>     other productions in the grammar.
>
>  2) rdf:li is legal wherever a propName (6.14)
>     production can be used.  The rdf:li is transformed
>     into an ordinal (rdf:_n rdf:Property) when it is used,
>     according to the rules in 3 below.
>
>  3) rdf:li processing
>
>     The processing of rdf:li is described here in
>     terms of an implementation.  Parsers are not required
>     to implement it this way, but however they implement
>     it, the effect should be the same as if it had been
>     implemented as described here.
>
>     rdf:li, when it is encountered in the propName (6.14)
>     production, is transformed to an ordinal property, i.e.
>     one of rdf:_1, rdf:_2 etc.
>
>     It is transformed to the successor of the last ordinal
>     property encountered within the current element.  If
>     this is the first ordinal property encountered within
>     the current element, then it is transformed to rdf:_1.
>     The successor of an ordinal property rdf:_n is rdf:_m
>     where m = n+1.
>
>     Attributes of an element MUST be processed before
>     sub-elements of the element.  Sub-elements are processed
>     in the order they appear in the document.
>
>     The rdf:li processing of sub-elements is independent
>     of the processing of enclosing elements.  The selection
>     of an ordinal to replace an rdf:li is not affected by
>     any ordinals encountered in sub-elements of the element.
>     The selection of an ordinal to replace an rdf:li is
>     not affected by ordinals encountered in enclosing
>     elements.
>
>     Note that XML states that the ordering of attributes is
>     not significant and that the same attribute name cannot
>     appear more than once on an element.  It is probably
>     unwise to use rdf:li as an attribute.  If it is used in
>     presence of other ordinal property attributes, the ordinal
>     property with which it will be replaced is undefined.
>
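
The rules in 3) can be sketched roughly as follows. This is one possible reading, not a reference implementation: it uses Python's xml.etree.ElementTree, handles only literal values and nodes nested inside a property element, identifies nodes by rdf:about only, and does not emit rdf:type triples for typed nodes. The function names triples, resolve and uri are my own.

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ORDINAL = re.compile(r"^%s_([1-9][0-9]*)$" % re.escape(RDF))
ABOUT = "{%s}about" % RDF


def uri(name):
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    return name[1:].replace("}", "", 1) if name.startswith("{") else name


def triples(node):
    """Yield (subject, predicate, object) for one node element and,
    recursively, for nodes nested inside its property elements."""
    subject = node.get(ABOUT)
    last = 0  # last ordinal seen within *this* element only

    def resolve(name):
        # rdf:li becomes the successor of the last ordinal seen here;
        # an explicit rdf:_n is kept as-is but updates the counter.
        nonlocal last
        prop = uri(name)
        if prop == RDF + "li":
            last += 1
            return RDF + "_%d" % last
        match = ORDINAL.match(prop)
        if match:
            last = int(match.group(1))
        return prop

    # Attributes are processed before sub-elements.
    for name, value in node.attrib.items():
        if name != ABOUT:
            yield (subject, resolve(name), value)

    # Sub-elements are processed in document order.
    for prop in node:
        predicate = resolve(prop.tag)
        if len(prop):  # the property value is a nested node
            child = prop[0]
            yield (subject, predicate, child.get(ABOUT))
            yield from triples(child)  # fresh counter, independent of parent
        else:
            yield (subject, predicate, prop.text)


doc = """<rdf:Description
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    rdf:about="http://foo">
  <rdf:li>1</rdf:li>
  <rdf:_10>10</rdf:_10>
  <rdf:li>11</rdf:li>
</rdf:Description>"""

result = list(triples(ET.fromstring(doc)))
for triple in result:
    print(triple)
```

Because each call to triples() keeps its own counter, ordinals in nested or enclosing elements never influence one another, which is the independence rule above. The result for rdf:li used as an attribute alongside other ordinal attributes depends on the order in which the XML library reports attributes, so it remains effectively undefined.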
>  4) rdf:aboutEach processing
>     ------------------------
>
>     The rdf:aboutEach attribute defines a distributive
>     referent, as described in section 3.3 of [1].
>
>     The rdf:aboutEach referent distributes over
>     all resources R for which there is a representation
>     of a triple in the XML serialization being processed
>     of the form:
>         [P, rdf:_n, R]
>       where
>         P is the resource identified by the value of the
>           rdf:aboutEach attribute
>         rdf:_n is an ordinal property.
>
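
As I read 4), the distribution amounts to something like the sketch below. It assumes the triples already extracted from the one XML serialization are held as (subject, predicate, object) tuples with prefixed names; the example.org URIs, the dc:creator statement and the function names are hypothetical.

```python
import re

# Matches ordinal property names such as rdf:_1, rdf:_2, ...
ORDINAL = re.compile(r"^rdf:_[1-9][0-9]*$")


def about_each_targets(triples, container):
    """Return every resource R for which [container, rdf:_n, R] is present."""
    return [o for s, p, o in triples if s == container and ORDINAL.match(p)]


def distribute(triples, container, predicate, value):
    """Apply one statement to each member, as rdf:aboutEach would."""
    return [(r, predicate, value)
            for r in about_each_targets(triples, container)]


graph = [
    ("http://example.org/pages", "rdf:type", "rdf:Bag"),
    ("http://example.org/pages", "rdf:_1", "http://example.org/a"),
    ("http://example.org/pages", "rdf:_3", "http://example.org/c"),
    ("http://example.org/other", "rdf:_1", "http://example.org/z"),
]

new = distribute(graph, "http://example.org/pages",
                 "dc:creator", "Example Author")
for triple in new:
    print(triple)
```

Note that the ordinals need not be contiguous for this to work, but only triples present in the serialization being processed are seen, so members asserted in other documents are not covered.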
>5. Examples
>    ========
>
>    Example 1:
>
>       <rdf:Description rdf:about="http://foo" rdf:li="1">
>         <rdf:li>2</rdf:li>
>       </rdf:Description>
>
>    would generate the triples (in subject, predicate, object order):
>
>       [http://foo, rdf:_1, "1"]
>       [http://foo, rdf:_2, "2"]
>
>    Example 2:
>
>       <rdf:Description rdf:about="http://foo">
>         <rdf:li>1</rdf:li>
>         <rdf:_10>10</rdf:_10>
>         <rdf:li>11</rdf:li>
>       </rdf:Description>
>
>     would generate the triples:
>
>       [http://foo, rdf:_1, "1"]
>       [http://foo, rdf:_10, "10"]
>       [http://foo, rdf:_11, "11"]
>
>    Example 3:
>
>       <rdf:Description rdf:about="http://foo">
>         <rdf:li>1</rdf:li>
>         <rdf:_1>1 again</rdf:_1>
>       </rdf:Description>
>
>     would generate:
>
>       [http://foo, rdf:_1, "1"]
>       [http://foo, rdf:_1, "1 again"]
>
>   Example 4:
>
>      <rdf:Description rdf:about="http://badExample" rdf:li="a" rdf:_3="b"/>
>
>      will generate:
>
>         [http://badExample, rdf:_n, "a"]
>         [http://badExample, rdf:_3, "b"]
>
>       where n is some integer greater than 0.
>
>   Example 5:
>
>       <rdf:Bag rdf:about="http://foo">
>         <rdf:li>1</rdf:li>
>           <foo:bar>
>             <rdf:Seq rdf:about="http://bar">
>               <rdf:li>1</rdf:li>
>               <rdf:li>2</rdf:li>
>             </rdf:Seq>
>           </foo:bar>
>         <rdf:li>2</rdf:li>
>       </rdf:Bag>
>
>       will generate:
>
>         [http://foo, rdf:_1, "1"]
>         [http://foo, rdf:_2, "2"]
>         [http://foo, foo:bar, http://bar]
>         [http://bar, rdf:_1, "1"]
>         [http://bar, rdf:_2, "2"]
>         [http://foo, rdf:type, rdf:Bag]
>         [http://bar, rdf:type, rdf:Seq]
>
>6. Unresolved Issues
>    =================
>
>Issue #1
>
>rdf:Seq, rdf:Bag and rdf:Alt optionally take rdf:ID
>attributes, e.g. sequence (6.25):
>
>   sequence ::= '<rdf:Seq' idAttr? '>' member* '</rdf:Seq>' |
>                '<rdf:Seq' idAttr? memberAttr* '/>'
>
>whereas typedNode (6.13) elements can take further attributes -
>rdf:about, rdf:aboutEach, rdf:aboutEachPrefix and rdf:bagID
>
>Original rule:
>   typedNode ::= '<' typeName idAboutAttr? bagIdAttr? propAttr* '/>' |
>                 '<' typeName idAboutAttr? bagIdAttr? propAttr* '>'
>                     propertyElt* '</' typeName '>'
>
>so we need to think about this, or at least describe it further.  Is
>this one of those cases where we don't know why this was originally
>restricted?
>
>7. References
>    ==========
>
>[1] http://www.w3.org/TR/REC-rdf-syntax/
>[2] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
>[3] http://www.w3.org/2000/03/rdf-tracking/#rdf-containers-syntax-vs-schema
>[4] http://www.w3.org/2000/03/rdf-tracking/#rdf-containers-otherapproaches
>[5] http://www.w3.org/TR/REC-rdf-syntax/#containers

------------
Graham Klyne
(GK@ACM.ORG)

Received on Monday, 1 January 2001 14:00:49 UTC