- From: McBride, Brian <bwm@hplb.hpl.hp.com>
- Date: Wed, 13 Dec 2000 12:03:36 -0000
- To: "RDF Interest (E-mail)" <www-rdf-interest@w3.org>
A number of issues have arisen with the processing of containers by
parsers and other RDF processors. It would be a good thing if most
(all) parsers handled them the same way.

It is appropriate for the RDF Interest Group to discuss the
interpretation of the current specification and to document any
conclusions that are reached. While the Interest Group does not have a
charter to revise the specification, the result of this discussion
could be offered for use in an errata document and could be provided
as input to a future W3C Working Group if one is chartered to update
the specification.

We would like to get a general consensus among the RDF Interest Group
on how parsers should handle containers. To kick things off, we have
written a strawman proposal.

The following proposal represents the views of the authors and is not
an endorsement by the W3C.

We invite comment.

Brian McBride
Dave Beckett

----------------------------------------------------------------

A Proposed Interpretation of RDF Containers
===========================================

Draft 1.0
13th December 2000

1. Issue Statement
   ===============

The RDF formal grammar defined in the Model and Syntax Specification
(M&S) [1] is ambiguous. Containers such as rdf:Bag, rdf:Seq and
rdf:Alt match the container productions 6.25 through 6.31, but also
match the typedNode production (6.13). The container productions
attempt to restrict what the language can express about containers,
but the ambiguity in the syntax effectively circumvents those
restrictions.

It is not clear what parsers should do if they encounter an rdf:li
element when processing productions other than the container-specific
productions of the grammar.

Sub-classes (described by rdfs:subClassOf) of the container classes do
not match the container-specific productions in the formal grammar.
M&S states that these productions should be extended to include
structures that are rdfs:subClassOf rdfs:Container.
Handling this requires parsers to process the class structure
resources in a document, which some do not do. It also requires that
the class structure be included in any XML serialization.

Some of these issues have been raised previously and are recorded in
the RDF issues list [3] [4].

We recognise that other issues about containers have been raised whose
resolution requires changes to the specification. We consider changes
to the specification to be beyond the remit of this document, and thus
we do not address them here.

2. What Do the Specs Say and Not Say?
   ==================================

o M&S permits the expression of arbitrary structures involving
  containers, though not always conveniently using the container
  productions.

o M&S says that the rdf:li mechanism is a convenience element [5] to
  make it easier to write lists of elements without individually
  numbering them. It does NOT say that this only works when processing
  the container-specific productions.

o M&S says that container members have properties starting at rdf:_1
  and running contiguously through to rdf:_n, where n is the number of
  elements in the container. Whilst this is true of the abstract
  container, M&S does NOT say that an implementation cannot represent
  a partial model of a container, which might therefore not contain
  all of the properties.

o M&S and RDF Schema do NOT say that the ordinal properties rdf:_1,
  rdf:_2, ... can only be applied to containers.

3. Approach of this Proposal
   =========================

The proposal, in essence, is:

o Containers and their sub-classes match the typedNode production
  (6.13).

o Parsers MUST transform rdf:li into an instance of an ordinal
  property (rdf:_1, rdf:_2, ...) wherever it is used in the formal
  grammar production propName (6.14).

o Ordinal properties may be attached to any resource.

This proposal does not change the expressive power of the language.
Anything that can be expressed with this proposal could have
previously been expressed.
Anything that could previously have been expressed can be expressed
under this proposal.

We believe that this proposal merely relaxes some constraints that
some parsers have imposed. Thus the triples generated by existing
parsers from existing XML serializations of RDF are unlikely to
change.

To conform to this proposal, some parsers will have to change. We
believe that such changes are simplifications of the parser, as the
grammar and processing become more regular.

This proposal does not conform to the original intent of the authors
of the M&S specification. We believe, however, that this proposal is
appropriate given implementation experience and new understanding not
available to the original authors.

4. Proposal
   ========

1) Parsers MAY choose not to implement the container-specific
   productions 6.25-6.31. This has no effect on the language, as
   anything that matches these productions also matches other
   productions in the grammar.

2) rdf:li is legal wherever a propName (6.14) production can be used.
   The rdf:li is transformed into an ordinal (rdf:_n rdf:Property)
   when it is used, according to the rules in 3 below.

3) rdf:li processing
   -----------------

   rdf:li processing is described here in terms of an implementation.
   Parsers are not required to implement it this way, but however they
   implement it, the effect should be the same as if it had been
   implemented as described here.

   rdf:li, when it is encountered in the propName (6.14) production,
   is transformed to an ordinal property, i.e. one of rdf:_1, rdf:_2
   etc. It is transformed to the successor of the last ordinal
   property encountered within the current element. If this is the
   first ordinal property encountered within the current element, then
   it is transformed to rdf:_1.

   The successor of an ordinal property rdf:_n is rdf:_m where
   m = n+1.

   Attributes of an element MUST be processed before sub-elements of
   the element. Sub-elements are processed in the order they appear in
   the document.
   The rdf:li processing of sub-elements is independent of the
   processing of enclosing elements. The selection of an ordinal to
   replace an rdf:li is not affected by any ordinals encountered in
   sub-elements of the element. The selection of an ordinal to replace
   an rdf:li is not affected by ordinals encountered in enclosing
   elements.

   Note that XML states that the ordering of attributes is not
   significant and that the same attribute name cannot appear more
   than once on an element. It is probably unwise to use rdf:li as an
   attribute. If it is used in the presence of other ordinal property
   attributes, the ordinal property with which it will be replaced is
   undefined.

4) rdf:aboutEach processing
   ------------------------

   The rdf:aboutEach attribute defines a distributive referent, as
   described in section 3.3 of [1].

   The rdf:aboutEach referent distributes over all resources R for
   which there is a representation of a triple in the XML
   serialization being processed of the form:

     [P, rdf:_n, R]

   where P is the resource identified by the value of the
   rdf:aboutEach attribute and rdf:_n is an ordinal property.

5. Examples
   ========

Example 1:

  <rdf:Description rdf:about="http://foo" rdf:li="1">
    <rdf:li>2</rdf:li>
  </rdf:Description>

would generate the triples (in subject, predicate, object order):

  [http://foo, rdf:_1, "1"]
  [http://foo, rdf:_2, "2"]

Example 2:

  <rdf:Description rdf:about="http://foo">
    <rdf:li>1</rdf:li>
    <rdf:_10>10</rdf:_10>
    <rdf:li>11</rdf:li>
  </rdf:Description>

would generate the triples:

  [http://foo, rdf:_1, "1"]
  [http://foo, rdf:_10, "10"]
  [http://foo, rdf:_11, "11"]

Example 3:

  <rdf:Description rdf:about="http://foo">
    <rdf:li>1</rdf:li>
    <rdf:_1>1 again</rdf:_1>
  </rdf:Description>

would generate:

  [http://foo, rdf:_1, "1"]
  [http://foo, rdf:_1, "1 again"]

Example 4:

  <rdf:Description rdf:about="http://badExample"
                   rdf:li="a"
                   rdf:_3="b"/>

will generate:

  [http://badExample, rdf:_n, "a"]
  [http://badExample, rdf:_3, "b"]

where n is some integer greater than 0.

Example 5:

  <rdf:Bag rdf:about="http://foo">
    <rdf:li>1</rdf:li>
    <foo:bar>
      <rdf:Seq rdf:about="http://bar">
        <rdf:li>1</rdf:li>
        <rdf:li>2</rdf:li>
      </rdf:Seq>
    </foo:bar>
    <rdf:li>2</rdf:li>
  </rdf:Bag>

will generate:

  [http://foo, rdf:_1, "1"]
  [http://foo, rdf:_2, "2"]
  [http://foo, foo:bar, http://bar]
  [http://bar, rdf:_1, "1"]
  [http://bar, rdf:_2, "2"]
  [http://foo, rdf:type, rdf:Bag]
  [http://bar, rdf:type, rdf:Seq]

6. Unresolved Issues
   =================

Issue #1

rdf:Seq, rdf:Bag and rdf:Alt optionally take rdf:ID attributes, e.g.
sequence (6.25):

  sequence ::= '<rdf:Seq' idAttr? '>' member* '</rdf:Seq>'
             | '<rdf:Seq' idAttr? memberAttr* '/>'

whereas typedNode (6.13) elements can take further attributes:
rdf:about, rdf:aboutEach, rdf:aboutEachPrefix and rdf:bagID.

Original rule:

  typedNode ::= '<' typeName idAboutAttr? bagIdAttr? propAttr* '/>'
              | '<' typeName idAboutAttr? bagIdAttr? propAttr* '>'
                    propertyElt* '</' typeName '>'

So we need to think about this, or at least describe it further. Is
this one of those cases where we don't know why this was originally
restricted?

7. References
   ==========

[1] http://www.w3.org/TR/REC-rdf-syntax/
[2] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
[3] http://www.w3.org/2000/03/rdf-tracking/#rdf-containers-syntax-vs-schema
[4] http://www.w3.org/2000/03/rdf-tracking/#rdf-containers-otherapproaches
[5] http://www.w3.org/TR/REC-rdf-syntax/#containers
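
As an illustration only, the rdf:li rules of section 4 item 3 might be
sketched in Python along the following lines. This is not part of the
proposal. The names node_triples, short and prop_name are hypothetical,
the use of xml.etree.ElementTree is an assumption, and the sketch
handles only literal property values and nested nodes identified with
rdf:about.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ORDINAL = re.compile(r"^rdf:_([1-9][0-9]*)$")


def short(clark):
    """Rewrite ElementTree's '{namespace}local' as 'rdf:local' for RDF names."""
    prefix = "{" + RDF_NS + "}"
    if clark.startswith(prefix):
        return "rdf:" + clark[len(prefix):]
    return clark


def node_triples(node):
    """Return (subject, predicate, object) tuples for one node element."""
    subject = node.get("{" + RDF_NS + "}about")
    triples = []
    if short(node.tag) != "rdf:Description":
        # Typed nodes such as rdf:Bag or rdf:Seq contribute an rdf:type triple.
        triples.append((subject, "rdf:type", short(node.tag)))

    last = 0  # last ordinal seen within THIS element only

    def prop_name(clark):
        nonlocal last
        name = short(clark)
        if name == "rdf:li":
            last += 1  # successor of the last ordinal, or rdf:_1 if none
            return "rdf:_%d" % last
        match = ORDINAL.match(name)
        if match:
            last = int(match.group(1))  # an explicit ordinal resets the count
        return name

    # Attributes are processed before sub-elements.
    for attr, value in node.attrib.items():
        if short(attr) != "rdf:about":
            triples.append((subject, prop_name(attr), value))

    # Sub-elements are processed in document order.
    for prop in node:
        name = prop_name(prop.tag)
        nested = list(prop)
        if nested:
            # A nested node gets its own, independent ordinal counter.
            about = nested[0].get("{" + RDF_NS + "}about")
            triples.append((subject, name, about))
            triples.extend(node_triples(nested[0]))
        else:
            triples.append((subject, name, (prop.text or "").strip()))
    return triples


# Example 2 from section 5, with the namespace declaration added.
EXAMPLE_2 = """
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 rdf:about="http://foo">
  <rdf:li>1</rdf:li>
  <rdf:_10>10</rdf:_10>
  <rdf:li>11</rdf:li>
</rdf:Description>
"""

for triple in node_triples(ET.fromstring(EXAMPLE_2)):
    print(triple)
```

Because the counter is local to each call of node_triples, ordinals in
enclosing or nested elements do not affect one another, as in Example
5. For input like Example 4 the sketch simply follows whatever
attribute order the XML parser reports, which matches the proposal
only in the sense that the result is left undefined.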
Received on Wednesday, 13 December 2000 07:03:48 UTC