RDF/XML Syntax - Triple Production

This document gives a formal description of which graph is produced from a given XML serialization of RDF, as described in RDF M&S after amendments and clarifications (some from the RDF Core WG; some yet to be discussed).

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document.

This is the author's personal viewpoint and is intended for review and discussion by the RDF Core Working Group, particularly the syntax subgroup.

Other interested parties are discouraged from giving feedback and comments at this time.

1 Triple Production Objectives

Much of Model & Syntax consists of descriptions in English of which triples in an RDF graph correspond to the various syntactic constructs.

This text has been found to be problematic.

It is difficult to tell whether or not it is self-contradictory.

It is difficult to understand precisely which triples correspond to some of the more obscure RDF syntactic constructions.

Here, we propose a more formal treatment of triple production.

A set of rules is proposed that formally specifies which triples correspond to a particular legal RDF/XML document.

The intent is not to provide a template for implementation, although the rules can be used that way, but more to provide clarity about the triples that any RDF/XML parser should be producing.

The framework used is intended to be open to rigorous scrutiny. In particular, it is possible to show:

Whether the rules are self-contradictory, giving rise to alternative sets of triples depending on details not given in the rule set.
Whether the rules are sufficient to construct terminating programs.
Whether the rules cover all syntactically legal RDF/XML documents.
Whether the output of the rules is a set of triples.

The intended output format is N-Triples. In this draft, the final trivial mapping from XML to N-Triples is omitted.

This description does not attempt to give a further account of the RDF/XML syntax: it assumes well-formed input that conforms with the grammar given in Refactoring RDF.

We start with an informal account of where the triples come from. This informal account is tied into the specific rules proposed at the beginning of the rules section.

The current document is incomplete and does not deal with:

rdf:parseType="Literal"
the final resolution of distributed subjects
attributes from the XML namespace
many outstanding syntactic issues on the RDF Core WG's issue list
the final mapping from an XML form to N-Triples

Moreover, the rules have not been tested or debugged, and hence should be expected to be simply wrong.

2 RDF/XML Syntax and Triples

This section is intended as an informal introduction to a way of viewing triple processing.

The model of triple processing is intended to be easy to follow, referencing back to a much simplified striped RDF/XML syntax which is a subset of the full RDF/XML syntax. Each of the more advanced pieces of RDF/XML syntax is seen as an extension of the simplified striped syntax.

The rules given later reduce all RDF/XML documents to a form that is even simpler than the simplified striped RDF/XML.

2.1 RDF/XML Striped Syntax

We consider the subset of RDF/XML documents with only rdf:Description nodes, optionally with an rdf:about attribute, and simple property elements containing only a string or an rdf:Description value. This is given by the following RelaxNG schema.

namespace local = ""
namespace rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

start = RDF
RDF = element rdf:RDF { description* }
description = element rdf:Description { aboutAttr?, propertyElt* }
propertyElt = element * - local:* {
                             description
                            | string?
                          }
aboutAttr = attribute rdf:about { URI-reference }
URI-reference = string

In such documents the graph syntax is found by mapping:

each description into a vertex in the graph possibly labelled with a URI
each propertyElt into an edge in the graph.

We describe this mapping by describing the edges as triples, in which the vertices in the graph which are not labelled with a URI are referenced using a locally scoped name (bNodeLabel).
descriptions which have an aboutAttr map to vertices labelled with the URI-reference value of the aboutAttr. Other descriptions map to unlabelled vertices, and are assigned a locally unique bNodeLabel for the purpose of describing the triples.
Each propertyElt with a description content maps to a triple given by:

the mapping of its parent element (either a URI or bNodeLabel)
the URI corresponding to the tag of the propertyElt itself
the mapping of its child description element (either a URI or bNodeLabel)

Each propertyElt with string or empty content maps to a triple given by:

the mapping of its parent element (either a URI or bNodeLabel)
the URI corresponding to the tag of the propertyElt itself
the string-value (or the empty string)
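
The mapping just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the rules: the function name striped_triples, the "_:bN" label scheme, the ("literal", text) tuple used for string values, and the example.org names are all hypothetical. It assumes input that conforms to the schema above.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def striped_triples(xml_text):
    """Return (subject, predicate, object) tuples for a simplified striped document."""
    counter = itertools.count(1)
    triples = []

    def node_label(description):
        # A description with rdf:about maps to its URI; any other description
        # gets a fresh, locally unique bNodeLabel (hypothetical "_:bN" scheme).
        about = description.get("{%s}about" % RDF_NS)
        return about if about is not None else "_:b%d" % next(counter)

    def walk(description, subject):
        for prop in description:
            # ElementTree writes qualified tags as "{namespace}local";
            # joining the two parts gives the predicate URI.
            predicate = prop.tag.replace("{", "", 1).replace("}", "", 1)
            children = list(prop)
            if children:
                # propertyElt with a description as content: the object is a node.
                obj = node_label(children[0])
                triples.append((subject, predicate, obj))
                walk(children[0], obj)
            else:
                # propertyElt with string or empty content: the object is a string.
                triples.append((subject, predicate, ("literal", prop.text or "")))

    for description in ET.fromstring(xml_text):
        walk(description, node_label(description))
    return triples

# Hypothetical example document.
EXAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/doc">
    <ex:title>Triples</ex:title>
    <ex:author><rdf:Description><ex:name>A. N. Other</ex:name></rdf:Description></ex:author>
  </rdf:Description>
</rdf:RDF>"""

for triple in striped_triples(EXAMPLE):
    print(triple)
```

The sketch walks the document depth first, so a nested description receives its label before its own property elements are visited.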

2.2 RDF/XML Advanced Syntax

There are the following aspects to the rest of RDF/XML syntax:

abbreviations

typed nodes, property attributes, omitted description elements,
rdf:ID as an alternative to rdf:about
collection membership counting

reification
bagID

distributed subjects (rdf:aboutEach)

The abbreviations and collection membership counting can be seen as occurring prior to reification and bagID processing. aboutEach resolution can be seen as coming after all other processing.
The abbreviations have no significance other than compactness.
The collection membership provides an incremental counter.
Reification uses four triples to model a triple added by a propertyElt or an abbreviation.
A bagID constructs a collection of reifications of all the triples added by propertyElts or abbreviations that are child nodes of a description.
Distributed subjects can be used to avoid certain repetitious parts in the RDF/XML. Distributed subjects are processed by separately collecting all the triples generated with subject corresponding to an element with an rdf:aboutEach attribute. Only after all other processing is completed are these distributed subject triples joined with the other triples in the graph.

2.2.1 Striping

In the advanced syntax it is necessary to distinguish the different stripes (typedNodes from propertyElts) in the RDF/XML document. This is made harder by the rdf:parseType="Resource" propertyElt production.
It is easier to detect mistaken use of rdf:li as a typedNode if striping is resolved before collection membership counting.

2.2.2 Abbreviations

FIXME: blow-by-blow account of each abbreviation:

typedNode: This corresponds to an rdf:Description node with an additional property element child, with tag rdf:type.
rdf:ID: This corresponds to an rdf:about attribute with a relative URI given by a fragment identifier.
propAttr: This corresponds to an additional property element child with a string content.
rdf:type propAttr: Exceptionally, the property attribute rdf:type corresponds to a property element child with a further rdf:Description node as its content.
rdf:parseType="Resource": This is shorthand for omitting an unadorned rdf:Description node, whose property element children are explicitly listed.
propertyElt with propAttrs, rdf:resource or rdf:bagID: This is shorthand for omitting an empty rdf:Description node with propAttrs and rdf:bagID as given on the propertyElt, and an rdf:about with value given by the rdf:resource attribute of the propertyElt.

In this case, the primary triple of the property element is the one that corresponds to the unadorned property element, and not any of those corresponding to the propAttrs or the bagID.

These abbreviations can be done in any order, in particular triples arising from typedNodes and propAttrs are not ordered.
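
As an illustration of one abbreviation, the typedNode expansion can be sketched in Python as follows. The function name expand_typed_node is hypothetical, and the sketch assumes that the element tag is namespace-qualified and that the type URI is formed by concatenating namespace and local name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def expand_typed_node(node):
    """Return an rdf:Description equivalent to the typedNode `node`."""
    if node.tag == "{%s}Description" % RDF_NS:
        return node  # already a plain description; nothing to expand
    # "{namespace}local" -> "namespacelocal", used as the rdf:type value.
    type_uri = node.tag.replace("{", "", 1).replace("}", "", 1)
    description = ET.Element("{%s}Description" % RDF_NS, dict(node.attrib))
    rdf_type = ET.SubElement(description, "{%s}type" % RDF_NS)
    rdf_type.set("{%s}resource" % RDF_NS, type_uri)
    # The original property elements follow the added rdf:type element.
    description.extend(list(node))
    return description
```

Since the abbreviations are unordered, placing rdf:type first is a choice of this sketch only; it carries no meaning.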

2.2.3 Collection Membership

rdf:li may be used as the tag on a propertyElt.
Each such propertyElt is equivalent to one with a tag of rdf:_{1+count(preceding-sibling::rdf:li)}; that is, rdf:_N, where N is one more than the number of preceding sibling rdf:li elements.
This step must be done before any other that needs to:

make use of the name of the predicate of a triple (e.g. reification)
or, treat this propertyElt independently from its preceding-siblings.
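
The counting step can be sketched as below. The helper name number_li is hypothetical. The sketch counts only rdf:li siblings, as in the XPath expression above, so any explicit rdf:_N siblings do not affect the numbering.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def number_li(description):
    """Rename each rdf:li child of `description` to rdf:_1, rdf:_2, ... in place."""
    count = 0
    for prop in description:
        if prop.tag == "{%s}li" % RDF_NS:
            # N is one more than the number of rdf:li siblings already seen.
            count += 1
            prop.tag = "{%s}_%d" % (RDF_NS, count)
    return description
```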

2.2.4 Reification and bagID

The analysis concerning reification is very difficult to separate from that for bagID.

BagID

A bagID attribute on a typedNode or a description element signals the reification of all triples arising from:

the typedNode construction
property attributes
and the primary triple from each property element child.

The primary triples of the propertyElt children may already be explicitly reified with an rdf:ID attribute. In such cases, the bagID does not cause a second reification, but refers to the labelled reification.
The rdf:bagID='bID' attribute signals the creation of the following triple:

<#bID> <rdf:type> <rdf:Bag> .

and a triple

<#bID> <rdf:_NNN> <#reifyID> .

for each of the primary triples from property element children that have an explicit reification (with rdf:ID='reifyID'); and a triple

<#bID> <rdf:_NNN> _:bNodeLabel .

for each of the other triples identified above, where bNodeLabel is the local label for the node of the graph being the reification of the triple. The rdf:_NNN are the properties rdf:_1, rdf:_2 etc, sequentially starting from 1. No correspondence is specified between the order rdf:_1, rdf:_2 etc and any other order. In particular, it is not the case that the two generated triples

<#bID> rdf:_1 _:Statement1 .
<#bID> rdf:_2 _:Statement2 .

imply that the statement reified as Statement1 occurs earlier in the XML document than that reified as Statement2.
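
A minimal sketch of the triples signalled by an rdf:bagID, written in the informal notation used above. The function name bag_triples is hypothetical, and the reification nodes are assumed to be passed in already formatted, either as <#reifyID> or as _:bNodeLabel.

```python
def bag_triples(bag_id, reification_nodes):
    """Return the Bag declaration and one membership triple per reification."""
    subject = "<#%s>" % bag_id
    lines = ["%s <rdf:type> <rdf:Bag> ." % subject]
    # Members are numbered sequentially from 1. The numbering reflects only
    # the order of this list, not document order.
    for n, node in enumerate(reification_nodes, start=1):
        lines.append("%s <rdf:_%d> %s ." % (subject, n, node))
    return lines
```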

Reification

Reification results in four triples as given in rule reification string or rule reification resource.

This applies equally whether reification is explicitly triggered through an rdf:ID attribute or implicitly triggered through an rdf:bagID attribute.
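
The four triples can be sketched as follows, reading them off the rdf:Statement element built by the reification rules of section 4. The function name reification_quad and the use of plain tuples with abbreviated names are assumptions of this sketch.

```python
def reification_quad(statement, subject, predicate, obj):
    """Return the four triples that reify the triple (subject, predicate, obj)."""
    return [
        (statement, "rdf:type", "rdf:Statement"),
        (statement, "rdf:subject", subject),
        (statement, "rdf:predicate", predicate),
        (statement, "rdf:object", obj),
    ]
```

The statement argument is a relative URI when reification is triggered by rdf:ID and a bNodeLabel when it is triggered by rdf:bagID; the quad has the same shape in both cases.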

2.2.5 Distributed Subjects

TBD

3 Triple Production Overview

The definition of triple production is done in terms of an abstract processing model. It is not the intention that this processing model is mandatory, or even desirable, for RDF/XML parsers. It is nonetheless possible to construct an RDF/XML parser using this processing model.

Each step in the processing model involves an input document and an output document. The output document is formed from the input document using some document transformation. The minimal steps are restricted to small transformations, which are either conceptually simple, or which are given by the rules in section 4. The major steps in the processing model are:

Turning all implicit attributes from the XML namespace into explicit attributes on all the elements within scope.
Repeatedly applying any of the simplification rules of section 4, each of which matches some aspect of the RDF/XML abbreviated syntax and expands that aspect into a more explicit XML representation of the same triples.
When no more simplification rules apply, the distributed subjects are resolved, with file scope. At this stage the XML file consists of a series of top-level elements each corresponding to one triple, possibly with a distributed subject. The input to this transformation is a simplified RDF/XML document with distributed subjects, the output is a simplified RDF/XML document without distributed subjects.
The simplified RDF/XML document has precisely one triple per top-level element. The transformation of this into N-Triples is the final step.

The simplification rules are at the heart of this process. Each rule has a left hand side, which is a template for an XML element that may match a top-level element in the input document. If it does, an application of the rule replaces that top-level element with the right hand side of the rule and leaves the rest of the document unchanged. Typically, the left hand side matches some aspect of the abbreviated syntax, and the right hand side explains what the abbreviation means. Variables are used to copy parts of the matched element into its replacement. As in XSLT or Prolog, the variables are assign-once. Each rule hence specifies a tree transformation, and the ruleset specifies an XML tree transforming grammar.

No ordering is specified between the rules. An implementation is free to apply the rules in any order.
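
The rule application loop can be sketched as below. This is an illustration of the processing model only, under stated assumptions: a rule is modelled as a Python function that returns a list of replacement elements when it matches and None otherwise, and the two toy rules (separate and tidy, loosely modelled on the separation and tidying up rules) operate on hypothetical (subject, properties) pairs rather than on XML.

```python
def simplify(top_level, rules):
    """Apply rules to top-level elements until no rule matches any element."""
    elements = list(top_level)
    changed = True
    while changed:
        changed = False
        for index, element in enumerate(elements):
            for rule in rules:
                replacement = rule(element)
                if replacement is not None:
                    # Replace only the matched element; the rest is unchanged.
                    elements[index:index + 1] = replacement
                    changed = True
                    break
            if changed:
                break  # restart the scan over the modified list
    return elements

def separate(element):
    """Toy rule: split off the first property of a multi-property description."""
    subject, props = element
    if len(props) > 1:
        return [(subject, props[:1]), (subject, props[1:])]
    return None

def tidy(element):
    """Toy rule: drop a description with no properties left."""
    subject, props = element
    return [] if not props else None

print(simplify([("s", ["p1", "p2", "p3"]), ("t", [])], [separate, tidy]))
```

For these toy rules the result does not depend on the order in which the rules are listed. Whether this holds for the full rule set is one of the properties the framework is meant to let us examine.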

The attributes used within the rules include various extension attributes that are not legal within an RDF/XML document, and MUST NOT be used in one. These attributes make it possible to express partially simplified RDF/XML documents, such as one with a half-expanded container. They also allow explicit reference to bNodes within an intermediate RDF/XML document. All these attributes come from the rdfx namespace (see section 3.4 for its binding in this version).

3.1 Document Model

The triple production rules view the XML document using the XPath nodeset as the document model.

Of the seven node types in the XPath data model, only three are used by these rules: the element nodes, the text nodes and the attribute nodes. Processing instruction nodes and comment nodes are stripped prior to RDF/XML processing. Namespace nodes are relevant inasmuch as they participate in the qualification of the names of the element nodes and attribute nodes. Within an XML Literal value, namespace nodes are significant.

An element occurring as a child of an rdf:RDF element is called a top-level element.

The rules apply only to top-level elements.

3.2 Whitespace Processing

TBD

3.3 Rule Format

Each rule has the following sections:

number, name, comment
left hand side
right hand side
explicit constraints
implicit constraints
XSLT template production

Within the rules variables are shown using a '$' convention.

The left hand side and right hand side share the same syntax: that of XML fragments with variables which match XML nodes and values from the XPath nodeset document model.

Variables can match many parts of an XML element.

An element tag e.g. <$tag ...
An attribute value e.g. <rdf:Description rdf:about=$about ...
An optional attribute value e.g. <rdf:Description rdf:about=?$about ...
An attribute qname e.g. <rdf:Description $prop=$value ...
A set of attributes e.g. <rdf:Description $attrs… >
A document fragment, $body matches all the (remaining) content, (all node types):
```
<rdf:Description >
  $body…
</rdf:Description>
```
A text node, $text matches only a text node: <rdf:value>$text</rdf:value>

The text node also matches empty elements, giving the empty string. Note that document fragment variables and text node variables are distinguished by the use of ellipsis.

Variables that are used to match element tags or attribute qnames can be subject to both explicit and implicit constraints. These have the form of either requiring or prohibiting a specific namespace or qname. The constraints on a variable that represents a set of attributes apply to each member of the set.

The implicit constraints are generated on attribute qnames by the XML rule that no two attributes of an element may have the same name. Any match that would break this rule, on either the left hand side or the right hand side, is excluded. Moreover, an optional match against an attribute takes precedence over a non-explicit match. Thus in rule 10 below, the attribute set variable $propEltAttrs is implicitly excluded from including an rdf:ID attribute, which is explicitly matched by an optional attribute value.

Variables used on the right hand side must appear on the left hand side.

Normally, variables are of the same type on the right hand side as on the left hand side. Each case where a variable changes type is explicitly noted in the notes. (Not yet done: briefly taking the element tag in the typed node construction to be the value of rdf:resource requires an expansion of the namespace, taking an integer valued attribute value variable in rdf:li and rdf:bagID processing requires the construction of an rdf:_$N qname.)

The XSLT rule shows the XML tree transformation as an XSLT template production. It provides a different formalization of the rule semantics. It should be noted that:

The uses of the XSLT function generate-id() are intended to generate unique bNode names. An implementation of this set of rules making multiple invocations of XSLT will need to replace this function in order to guarantee that the bNode names are unique within the document.
The XSLT templates cannot be used as is, because XSLT will report a conflict. Any conflict resolution technique may be used to resolve this, including a random one.
An XSLT implementation will also minimally need to define additional rules for the root element (rdf:RDF) and for copying fully simplified top-level elements.
In some parts, the generated XSLT code arbitrarily picks out the first of a set of attributes and distinguishes it. Since the XPath nodeset gives an arbitrary order to sets of attributes this can be any member of the selected set. All such instances are shown in the rules by putting the digit '1' in bold e.g. "[1]"

3.4 Rule Internal Attributes

Some attributes are used from an additional namespace, rdfx (in this version bound to http://jcarroll.hpl.hp.com/rdf/rdfx/). These are as follows.

NOTE: Early feedback from Aaron suggests that the number of these attributes is too large. An alternative is to find a different solution to distinguishing bNodeLabels from URIs (e.g. bNodeLabels start "_:", URIs start with a scheme name) and to overload rdf:about and rdf:resource with bNodeLabels as well. This would allow dropping of rdfx:subject, rdfx:object, rdfx:scope, and rdfx:reifyScope.

I am unclear at this stage what the impact of such a change would be on the number of rules.

rdfx:subject: Generalizes rdf:about, rdf:aboutEach, rdf:ID and bNodes.
rdfx:scope: This takes values from "URI", "bNode" and "Collection" and occurs with a related rdfx:subject or rdfx:object. The value "Collection" corresponds to an rdf:aboutEach attribute.
rdfx:object: Generalizes rdf:resource; and rdf:about, rdf:ID and bNodes as objects.
rdfx:liCounter: An integer used for rdf:li processing.
rdfx:bagLiCounter: An integer used for rdf:bagID processing.
rdfx:reify: Generalizes rdf:ID and bNodes on a property element production.
rdfx:reifyScope: This takes values "URI" or "bNode" and occurs with a related rdfx:reify.

4 Simplification Rules

The rules in this section largely correspond to the overview in section 2.

Striping: Striping is processed by only ever processing top-level elements. A flattening rule is used to promote third level elements to the top-level.
description separation: In the striped syntax multiple property elements are permitted. In the output syntax, only one triple per top-level element is permitted, and hence only one property element. The basic separation rule and the rdf:bagID separation rule split up top-level descriptions with multiple property elements.
typedNode: See the typed node rule.
rdf:ID: On top-level elements see the ID rule, the about rule, and the bNode rule.

It is also necessary to process the third level elements with the equivalent pre-flattening ID rule, pre-flattening about rule, and pre-flattening bNode rule.
propAttr: See the general property attribute and rdf property attribute rules.
rdf:type propAttr: See the special rdf:type property attribute rule.
rdf:parseType="Resource": See the rdf:parseType="Resource" rule.
propertyElt with propAttrs, rdf:resource or rdf:bagID: See the property element with rdf:resource and the property element with property attribute rules.
rdf:li processing: The liStart rule initialises a counter; the liCount rule uses it; and the liEnd rule tidies the counter away afterwards.
rdf:aboutEach: treated very similarly to rdf:about with the aboutEach rule.
Reification: The reification ID resolution rule converts a reification ID into a relative URI. This allows uniform processing of this explicit reification with implicit reification caused by rdf:bagID.

The reification quad is created by one of the reification string or reification resource rules.
rdf:bagID: The bagLiStart rule creates a new Bag for collecting reified statements and initializes a membership counter for the Bag. The bagLiEnd rule discards an empty top-level element which is left at the end of bagID processing. The bagID without explicit reification rule triggers implicit reification. The bagID separation rule splits off the next element in a description with an rdf:bagID, adding its reification resource into the Bag.

There is also a tidying up rule that matches spurious empty top-level elements.

4.1 Namespaces Used

TBD

4.2 The Table of Rules


1	PropAttr-1
Match non-rdf attribute, and convert into propertyElt production.
<$tag $prop=$val $attrs…> $body… </$tag>		<$tag $attrs…> <$prop >$val</$prop> $body… </$tag>
When:	$prop!~"rdfx:"; $prop!~"rdf:"; $prop!~"xml:*";
2	PropAttr-2

<$tag rdf:type=$val $attrs…> $body… </$tag>		<$tag $attrs…> <rdf:type rdf:resource=$val/> $body… </$tag>
Implicitly	each $attrs≠"rdf:type";
3	PropAttr-3
Match rdf attribute for which PropAttr-1 rule applies.
<$tag $prop=$val $attrs…> $body… </$tag>		<$tag $attrs…> <$prop >$val</$prop> $body… </$tag>
When:	$prop~"rdf:"; $prop≠"rdf:bagID"; $prop≠"rdf:ID"; $prop≠"rdf:about"; $prop≠"rdf:aboutEach"; $prop*≠"rdf:type";
4	TypedNode
Match typedNode rule and convert into description with rdf:type propertyElt.
<$tag $attrs…> $body… </$tag>		<rdf:Description $attrs…> <rdf:type rdf:resource=$tag/> $body… </rdf:Description>
When:	$tag≠"rdf:Description";
5	about
Map an rdf:about into rdfx:subject.
<$tag rdf:about=$about $attrs…> $body… </$tag>		<$tag rdfx:subject=$about rdfx:scope='URI' $attrs…> $body… </$tag>
Implicitly	each $attrs≠"rdfx:subject"; each $attrs≠"rdf:about"; each $attrs≠"rdfx:scope";
6	aboutEach
Map an rdf:aboutEach into rdfx:subject.
<$tag rdf:aboutEach=$about $attrs…> $body… </$tag>		<$tag rdfx:scope='Collection' rdfx:subject=$about $attrs…> $body… </$tag>
Implicitly	each $attrs≠"rdfx:subject"; each $attrs≠"rdfx:scope"; each $attrs≠"rdf:aboutEach";
7	bNode
On a description with no identifier, add an rdfx:subject
<$tag $attrs…> $body… </$tag>		<$tag rdfx:subject='_:{generate-id()}' rdfx:scope='bNode' $attrs…> $body… </$tag>
When:	each $attrs≠"rdf:ID"; each $attrs≠"rdf:about"; each $attrs≠"rdfx:subject"; each $attrs≠"rdf:aboutEach";
Implicitly	each $attrs≠"rdfx:scope";
8	ID
Map an rdf:ID into rdfx:subject.
<$tag rdf:ID=$id $attrs…> $body… </$tag>		<$tag rdfx:scope='URI' rdfx:subject='#{$id}' $attrs…> $body… </$tag>
Implicitly	each $attrs≠"rdf:ID"; each $attrs≠"rdfx:subject"; each $attrs≠"rdfx:scope";
9	parseTypeResource
Match propertyElt with parseType=Resource and simplify.
<$tag $attrs…> <$propElt rdf:ID=?$ID rdf:parseType='Resource' rdfx:reify=?$reify rdfx:reifyScope=?$reifyScope> $propEltBody… </$propElt> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdfx:reify=?$reify rdf:ID=?$ID> <rdf:Description> $propEltBody… </rdf:Description> </$propElt> $body… </$tag>
10	propertyEltWithPropAttrs
Match propertyElt with propAttrs or bagID .
<$tag $attrs…> <$propElt rdf:ID=?$ID rdfx:reify=?$reify rdf:resource=?$about rdfx:reifyScope=?$reifyScope $attr=$value $propEltAttrs…/> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdf:ID=?$ID rdfx:reify=?$reify> <rdf:Description rdf:about=?$about $attr=$value $propEltAttrs…/> </$propElt> $body… </$tag>
When:	$attr!~"rdfx:"; each $propEltAttrs!~"rdfx:";
Implicitly	$attr≠"rdf:ID"; $attr≠"rdf:about"; $attr≠"rdfx:reifyScope"; $attr≠"rdf:resource"; $attr≠"rdfx:reify"; each $propEltAttrs≠"rdf:ID"; each $propEltAttrs≠"rdf:about"; each $propEltAttrs≠"rdfx:reifyScope"; each $propEltAttrs≠"rdf:resource"; each $propEltAttrs≠"rdfx:reify";
11	propertyEltWithResource
Match propertyElt with rdf:resource .
<$tag $attrs…> <$propElt rdf:resource=$about rdf:ID=?$ID rdfx:reifyScope=?$reifyScope rdfx:reify=?$reify $propEltAttrs…/> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdfx:reify=?$reify rdf:ID=?$ID> <rdf:Description rdf:about=$about $propEltAttrs…/> </$propElt> $body… </$tag>
Implicitly	each $propEltAttrs≠"rdf:ID"; each $propEltAttrs≠"rdf:about"; each $propEltAttrs≠"rdfx:reifyScope"; each $propEltAttrs≠"rdf:resource"; each $propEltAttrs≠"rdfx:reify";
12	Flattening
Remove nested description or typedNode from first propertyElt.
<$tag $attrs…> <$propElt rdfx:reify=?$reify rdfx:reifyScope=?$reifyScope rdf:ID=?$ID> <$object rdfx:subject=$about rdfx:scope=$scope $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>		<$object rdfx:subject=$about rdfx:scope=$scope $objAttrs…> $objBody… </$object> <$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdfx:reify=?$reify rdf:ID=?$ID rdfx:object=$about rdfx:scope=$scope/> $body… </$tag>
Implicitly	each $objAttrs≠"rdfx:subject"; each $objAttrs≠"rdfx:scope";
13	PreFlatteningBNode
Add a bNode name to nested description or typedNode of first propertyElt.
<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdf:ID=?$ID rdfx:reify=?$reify> <$object $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdfx:reify=?$reify rdf:ID=?$ID> <$object rdfx:scope='bNode' rdfx:subject='_:{generate-id()}' $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>
When:	each $objAttrs≠"rdf:ID"; each $objAttrs≠"rdf:about"; each $objAttrs≠"rdfx:subject"; each $objAttrs≠"rdf:aboutEach";
Implicitly	each $objAttrs≠"rdfx:scope";
14	PreFlatteningID
Change an rdf:ID to an rdfx:subject in a nested description or typedNode of first propertyElt.
<$tag $attrs…> <$propElt rdfx:reify=?$reify rdf:ID=?$ID rdfx:reifyScope=?$reifyScope> <$object rdf:ID=$nestedID $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdf:ID=?$ID rdfx:reify=?$reify> <$object rdfx:scope='URI' rdfx:subject='#{$nestedID}' $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>
Implicitly	each $objAttrs≠"rdf:ID"; each $objAttrs≠"rdfx:subject"; each $objAttrs≠"rdfx:scope";
15	PreFlatteningAbout
Change an rdf:about to an rdfx:subject in a nested description or typedNode of first propertyElt.
<$tag $attrs…> <$propElt rdf:ID=?$ID rdfx:reify=?$reify rdfx:reifyScope=?$reifyScope> <$object rdf:about=$about $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope=?$reifyScope rdf:ID=?$ID rdfx:reify=?$reify> <$object rdfx:scope='URI' rdfx:subject=$about $objAttrs…> $objBody… </$object> </$propElt> $body… </$tag>
Implicitly	each $objAttrs≠"rdfx:subject"; each $objAttrs≠"rdf:about"; each $objAttrs≠"rdfx:scope";
16	BasicSeparation
Extract first propertyElt from a description, no bagID, not parseType='Literal'.
<$tag rdfx:subject=$subject rdfx:scope=$subjScope $attrs…> <$propElt rdfx:object=?$object rdf:ID=?$ID rdfx:reify=?$reify rdfx:scope=?$objScope rdfx:reifyScope=?$reifyScope >$objectString</$propElt> <$more $moreAttrs…> $moreBody… </$more> $body… </$tag>		<rdf:Description rdfx:scope=$subjScope rdfx:subject=$subject> <$propElt rdfx:reifyScope=?$reifyScope rdfx:object=?$object rdfx:scope=?$objScope rdf:ID=?$ID rdfx:reify=?$reify >$objectString</$propElt> </rdf:Description> <$tag rdfx:subject=$subject rdfx:scope=$subjScope $attrs…> <$more $moreAttrs…> $moreBody… </$more> $body… </$tag>
When:	$propElt≠"rdf:li"; each $attrs≠"rdf:bagID";
Implicitly	each $attrs≠"rdfx:subject"; each $attrs≠"rdfx:scope";
17	ReificationID2rdfxReify
Convert an rdf:ID to rdfx:reify.
<$tag $attrs…> <$propElt rdf:ID=$ID $moreAttrs…> $moreBody… </$propElt> $body… </$tag>		<$tag $attrs…> <$propElt rdfx:reifyScope='URI' rdfx:reify='#{$ID}' $moreAttrs…> $moreBody… </$propElt> $body… </$tag>
When:	each $attrs≠"rdf:bagID"; $propElt≠"rdf:li";
Implicitly	each $moreAttrs≠"rdf:ID"; each $moreAttrs≠"rdfx:reifyScope"; each $moreAttrs≠"rdfx:reify";
18	ReificationResource
Construct quad when reifying with resource object.
<$tag rdfx:scope=$subjScope rdfx:subject=$subject $attrs…> <$propElt rdfx:scope=$objScope rdfx:reifyScope=$reifyScope rdfx:object=$object rdfx:reify=$reify/> $body… </$tag>		<rdf:Statement rdfx:subject=$reify rdfx:scope=$reifyScope> <rdf:subject rdfx:object=$subject rdfx:scope=$subjScope/> <rdf:predicate rdf:resource=$propElt/> <rdf:object rdfx:object=$object rdfx:scope=$objScope/> </rdf:Statement> <$tag rdfx:subject=$subject rdfx:scope=$subjScope $attrs…> <$propElt rdfx:object=$object rdfx:scope=$objScope/> $body… </$tag>
When:	$propElt≠"rdf:li"; each $attrs≠"rdf:bagID";
Implicitly	each $attrs≠"rdfx:subject"; each $attrs≠"rdfx:scope";
19	ReificationString
Construct quad when reifying with string object.
<$tag rdfx:subject=$subject rdfx:scope=$subjScope $attrs…> <$propElt rdfx:reify=$reify rdfx:reifyScope=$reifyScope >$object</$propElt> $body… </$tag>		<rdf:Statement rdfx:subject=$reify rdfx:scope=$reifyScope> <rdf:subject rdfx:object=$subject rdfx:scope=$subjScope/> <rdf:predicate rdf:resource=$propElt/> <rdf:object >$object</rdf:object> </rdf:Statement> <$tag rdfx:scope=$subjScope rdfx:subject=$subject $attrs…> <$propElt >$object</$propElt> $body… </$tag>
When:	$propElt≠"rdf:li"; each $attrs≠"rdf:bagID";
Implicitly	each $attrs≠"rdfx:subject"; each $attrs≠"rdfx:scope";
20	liStart
Initialize liCounter
<$tag $attrs…> <rdf:li $moreAttrs…> $moreBody… </rdf:li> $body… </$tag>		<$tag rdfx:liCounter='1' $attrs…> <rdf:li $moreAttrs…> $moreBody… </rdf:li> $body… </$tag>
Implicitly	each $attrs≠"rdfx:liCounter";
21	liCount
Use liCounter
<$tag rdfx:liCounter=$N $attrs…> <rdf:li $moreAttrs…> $moreBody… </rdf:li> $body… </$tag>		<$tag rdfx:liCounter='{$N+1}' $attrs…> <$N $moreAttrs…> $moreBody… </$N> $body… </$tag>
Implicitly	each $attrs≠"rdfx:liCounter";
22	liEnd
Remove liCounter
<$tag rdfx:liCounter=$N $attrs…> <$prop $moreAttrs…> $moreBody… </$prop> </$tag>		<$tag $attrs…> <$prop $moreAttrs…> $moreBody… </$prop> </$tag>
When:	$prop≠"rdf:li";
Implicitly	each $attrs≠"rdfx:liCounter";
23	bagLiStart
Initialize counter for bagId, declare bag.
<$tag rdf:bagID=$bagID $attrs…> $body… </$tag>		<rdf:Bag rdf:ID=$bagID/> <$tag rdf:bagID=$bagID rdfx:bagLiCounter='1' $attrs…> $body… </$tag>
When:	each $attrs≠"rdfx:bagLiCounter";
Implicitly	each $attrs≠"rdf:bagID";
24	bagLiEnd
Drop completed bagId.
<rdf:Description rdfx:scope=$subjScope rdfx:subject=$subject rdf:bagID=$bagID rdfx:liCounter=?$M rdfx:bagLiCounter=$N/>
25	bagIDwithoutReification
In a description with bagID, the first propElt is implicitly reified, if necessary.
<$tag rdf:bagID=$bagID $attrs…> <$more $moreAttrs…> $moreBody… </$more> $body… </$tag>		<$tag rdf:bagID=$bagID $attrs…> <$more rdfx:reifyScope='bNode' rdfx:reify='_:{generate-id()}' $moreAttrs…> $moreBody… </$more> $body… </$tag>
When:	each $moreAttrs≠"rdf:ID"; each $moreAttrs≠"rdfx:reify";
Implicitly	each $attrs≠"rdf:bagID"; each $moreAttrs≠"rdfx:reifyScope";
26	bagIDSeparation
Separate out the first element of a description node with bagID.
<$tag rdfx:scope=$subjScope rdfx:bagLiCounter=$N rdf:bagID=$bagID rdfx:subject=$subject $attrs…> <$propElt rdfx:reify=$reify rdfx:reifyScope=$reifyScope $moreAttrs…> $moreBody… </$propElt> $body… </$tag>		<rdf:Description rdfx:scope=$subjScope rdfx:subject=$subject> <$propElt rdfx:reify=$reify rdfx:reifyScope=$reifyScope $moreAttrs…> $moreBody… </$propElt> </rdf:Description> <rdf:Description rdfx:scope='URI' rdfx:subject='#{$bagID}'> <$N rdfx:object=$reify rdfx:scope=$reifyScope/> </rdf:Description> <$tag rdfx:bagLiCounter='{$N+1}' rdfx:scope=$subjScope rdf:bagID=$bagID rdfx:subject=$subject $attrs…> $body… </$tag>
When:	$propElt≠"rdf:li";
Implicitly	each $moreAttrs≠"rdfx:reifyScope"; each $moreAttrs≠"rdfx:reify"; each $attrs≠"rdf:bagID"; each $attrs≠"rdfx:subject"; each $attrs≠"rdfx:scope"; each $attrs≠"rdfx:bagLiCounter";
27	tidyingUp
Remove superfluous rdf:Description's.
<rdf:Description rdfx:scope=$subjScope rdfx:subject=$subj/>

4.2.1 Notes

5 Distributed References

After all simplifications we process the simplified XML document by looking for all triples with rdfx:scope="Collection" and all triples with predicate rdf:_NNN, with an rdfx:object and a matching rdfx:subject.

I will produce some XSLT for saying this ....

TBD

6 Triple Production Examples

TBD

7 Clarifications of Model&Syntax

TBD

A References

N-Triples: World Wide Web Consortium. In RDF Test Cases, W3C Working Draft, 12 September 2001. See http://www.w3.org/TR/rdf-testcases/#ntriples
RDF M&S: World Wide Web Consortium. Resource Description Framework (RDF) Model and Syntax Specification W3C Recommendation. See http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/, 22 February 1999.
Refactoring RDF: World Wide Web Consortium. Refactoring RDF/XML Syntax W3C Working Draft. See http://www.w3.org/TR/2001/WD-rdf-syntax-grammar-20010906/
XML: World Wide Web Consortium. Extensible Markup Language (XML) 1.0. W3C Recommendation. See http://www.w3.org/TR/1998/REC-xml-19980210
XML Names: World Wide Web Consortium. Namespaces in XML. W3C Recommendation. See http://www.w3.org/TR/REC-xml-names
XPath: World Wide Web Consortium. XML Path Language. W3C Recommendation. See http://www.w3.org/TR/xpath
XSLT: World Wide Web Consortium. XSL Transformations (XSLT) W3C Recommendation. See http://www.w3.org/TR/xslt

Draft 19 October 2001
