W3C

RDF/XML Syntax Revised Specification

W3C Working Draft (Editors Draft) XX Month 2001

This version:
http://ilrt.org/discovery/2001/07/rdf-syntax-grammar/
$Revision: 1.133 $
Latest Published W3C Working Draft version:
http://www.w3.org/TR/rdf-syntax-grammar/
Previous versions:
CVS history
Editor:
Dave Beckett (University of Bristol)

Abstract

This is an Editors Draft of the RDF Core WG Working Draft and describes the XML syntax of the RDF model as described in RDF Model & Syntax [RDFMS] after amendments and clarifications from the RDF Core WG.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.

This is an Editors Draft of the latest version of the W3C RDF Core WG Working Draft for the RDF Core Working Group produced as part of the W3C Semantic Web Activity. It incorporates decisions made by the Working Group updating the XML syntax for RDF from the original RDF Model & Syntax ([RDFMS]) document and includes a re-representing of the syntax in terms of the XML Information Set with rules for generation of RDF models.

This document is being released for review by W3C members and other interested parties to encourage feedback and comments, especially with regard to how the changes affect existing implementations. This is the current state of an ongoing work on the syntax and mapping process and may not yet record all of the work in the grammar section of the original document.

This is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use it as reference material or to cite as other than "work in progress". A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Comments on this document are invited and should be sent to the public mailing list www-rdf-comments@w3.org. An archive of comments is available at http://lists.w3.org/Archives/Public/www-rdf-comments/.

Table of contents

1 Introduction
2 An XML syntax for the RDF graph
3 Data Model
  3.1 Root Node
  3.2 Element Node
  3.3 End Element Node
  3.4 Attribute Node
  3.5 Text Node
  3.6 Identifier Node
  3.7 Information Set Mapping
4 Notation
  4.1 Terminology
  4.2 Grammar Notation
  4.3 Notation Forms
  4.4 The RDF Namespace
5 RDF/XML Grammar
6 Serialising an RDF Graph to RDF/XML
7 Acknowledgements
8 References

Appendices

A Issues affecting RDF/XML Syntax (Non-Normative)
  A.1 Document Issues / Tasks (Non-Normative)
  A.2 RDF Core WG Open Issues affecting RDF/XML Syntax (Non-Normative)
  A.3 RDF Core WG Decided Issues affecting RDF/XML Syntax (Non-Normative)
  A.4 RDF Core WG Postponed Issues affecting RDF/XML Syntax (Non-Normative)
B Syntax Schemas (Non-Normative)
  B.1 RELAX NG Syntax Schema (Non-Normative)
  B.2 Other Syntax Schemas (Non-Normative)
C Original Grammar (Non-Normative)
D Updated Grammar after RDF Core decisions (Non-Normative)


1 Introduction

This document describes the XML ([XML]) syntax for RDF as originally defined in the RDF Model & Syntax ([RDFMS]) W3C Recommendation. Subsequent implementations of this syntax and comparison of the resulting RDF models have shown that there was ambiguity: implementations generated different models and certain syntax forms were not widely implemented. These issues were generally raised either as feedback to www-rdf-comments@w3.org (archive) or in discussions on the RDF Interest Group list www-rdf-interest@w3.org (archive).

The RDF Core Working Group is chartered to respond to the need for a number of fixes, clarifications and improvements to the specification of RDF's abstract model and XML syntax. The working group invites feedback from the developer community on the effects of its proposals on existing implementations and documents.

Several decisions, including amendments and deletions to the grammar, are referred to below. The definitive record of the decisions is the RDF Core WG issues list.

This document re-represents the original EBNF grammar in terms of XML Information Set ([INFOSET]) items, which moves away from the rather low-level details, such as particular forms of empty elements. This allows the grammar to be more precisely recorded and the mapping from the XML syntax to the RDF model more clearly shown. The mapping to the RDF model (a graph) is done by emitting statements in the form defined in the N-Triples section of the RDF Test Cases ([RDF-TESTS]) Working Draft, which creates an RDF model that has semantics defined by the RDF Model Theory ([RDF-MODEL]) Working Draft.

2 An XML syntax for the RDF graph

The RDF model described in RDF Model Theory ([RDF-MODEL]) is a graph consisting of nodes describing resources, which can be labelled with URIs or string literals or be blank, and arcs connecting the nodes, which are all labelled with URIs. This graph is more precisely called a directed edge-labelled graph; each edge is an arc with a direction (an arrow) connecting two nodes. These edges can be described as triples of a subject node at the blunt end of the arrow/arc, a property arc, and an object node at the sharp end of the arrow/arc. The property arc is also interpreted as an attribute, relationship or predicate of the resource with a value given by the object node content.

In order to encode the graph in XML, the nodes and arcs are turned into XML elements, attributes, element content and attribute values. The URI labels for properties and object nodes are written in XML via XML Namespaces ([XML-NS]), which gives a namespace URI for a short prefix along with namespace-qualified element and attribute names called local names. The (namespace URI, local name) pair is chosen such that concatenating them forms the original node URI. The URIs labelling subject nodes are stored in XML attribute values. The nodes labelled by string literals (which are always object nodes) become element text content or attribute values.

This transformation turns sequences of Node, Arc, Node, Arc, Node, Arc, ... into sequences of elements inside elements. This results in a striping when the elements are written down; alternating between node elements and property elements. The Node at the start of the sequence is always a subject node and turns into a containing element called an rdf:Description that is written at the top level of RDF/XML, below the XML document element (in this case rdf:RDF). So the chains of stripes start at the top of an RDF/XML document and always begin with nodes.
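The striping described above can be illustrated with a small sketch. The RDF/XML fragment, the ex namespace and the example.org URIs are invented for illustration and do not come from this specification; the helper name stripes is likewise hypothetical. The sketch only labels nesting depth and is not a parser for the grammar in section 5.

```python
import xml.etree.ElementTree as ET

# Hypothetical RDF/XML input; the ex: vocabulary and URIs are assumptions.
RDFXML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/doc">
    <ex:creator>
      <rdf:Description rdf:about="http://example.org/person">
        <ex:name>Example Name</ex:name>
      </rdf:Description>
    </ex:creator>
  </rdf:Description>
</rdf:RDF>"""


def stripes(elem, depth=0):
    """Yield (depth, kind, tag) for elem and its descendants.

    Below rdf:RDF, even depths are node elements and odd depths are
    property elements, which is the alternation called striping.
    """
    kind = "node" if depth % 2 == 0 else "property"
    yield (depth, kind, elem.tag)
    for child in elem:
        yield from stripes(child, depth + 1)


root = ET.fromstring(RDFXML)
# Start below the document element rdf:RDF, where the stripes begin.
rows = [row for top in root for row in stripes(top)]
for depth, kind, tag in rows:
    print("  " * depth + kind + ": " + tag)
```

The output alternates node, property, node, property as the nesting deepens, matching the Node, Arc, Node, Arc sequence in the graph.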

There are several abbreviations that can be used to make very common uses easier to write down. It is typical for the same resource to be described with multiple properties and values at the same time, so multiple child elements can be put inside rdf:Description, all of which are properties of that node.

When the property value is a string it can be encoded more simply as an XML attribute and value, as an attribute of the node element. This is known as a property attribute.

Another very common use is when a node is an instance of a class with rdf:type relationship, usually called a typed node. This shorthand is done by replacing the rdf:Description element with the namespaced-element corresponding to the URI of the value of the type relationship.

The above forms the basis of the RDF/XML syntax, although there are some other abbreviated forms, such as one for generating the RDF list properties and a form for skipping having to write down a blank element node, which breaks the striping but is useful for, amongst other uses, encoding properties with multiple values.

For a longer introduction to the RDF/XML striped syntax with a historical perspective, see RDF: Understanding the Striped RDF/XML Syntax ([STRIPEDRDF]).

3 Data Model

This syntax operates on an XML document as a sequence of nodes in document order in the style of the [XPATH] Information Set Mapping, serialised into document order. The resulting nodes are intended to be similar to the events that are produced by the [SAX2] XML API. This model is conceptual only and does not mandate any implementation method; in particular [XPATH] is not required.

The syntax does not support non-well-formed XML documents, nor documents that otherwise don't have an XML Information Set; for example, that don't conform to XML Namespaces W3C Recommendation ([XML-NS]).

This specification requires an information set as defined in [INFOSET] which supports at least the following information items and properties:

Document Information Item
[document element], [children], [base URI]
Element Information Item
[local name], [namespace name], [children], [attributes], [parent]
Attribute Information Item
[local name], [namespace name], [normalized value], [owner element]
Character Information Item
[character code]

This specification does not require any destructive alterations to the input information set; no items are added, removed or modified.

This section is intended to satisfy the requirements for Conformance in the [INFOSET] specification.

There are six types of node defined in the following subsections. Most nodes are constructed from an Infoset information item (except for Identifier). The effect of a node constructor is to create a new node with a unique identity, distinct from all other nodes. Nodes have properties, and all have the string-value property that may be part of the node or computed from the string-value of contained nodes.

3.1 Root Node

Created from a Document Information Item and takes the following properties and their values from the document information item: document-element, children and base-uri.

3.2 Element Node

Created from an Element Information Item and takes the following properties and their values from the element information item: local-name, namespace-name, children, attributes and parent. When this node is created from such values, the URI property is defined with a string value of the concatenation of the value of the namespace-name property and the value of the local-name property. On creation the li-counter property is added with initial integer value 1.

The subject property may be added and takes the value of an Identifier node. This is used on elements that deal with one node in the RDF model, this generally being the subject of a statement.
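As a minimal sketch of the element node described in this section, the class below models the properties relevant to the grammar. The class name ElementNode and its Python attribute names are hypothetical; the children, attributes and parent properties are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


@dataclass
class ElementNode:
    """Hypothetical stand-in for the element node of section 3.2."""
    namespace_name: str
    local_name: str
    li_counter: int = 1               # added on creation with initial value 1
    subject: Optional[object] = None  # may later hold an identifier node

    @property
    def uri(self) -> str:
        # URI = namespace-name concatenated with local-name
        return self.namespace_name + self.local_name


node = ElementNode(RDF_NS, "Description")
print(node.uri)
```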

3.3 End Element Node

Takes no properties but marks the end of the containing element in the sequence.

3.4 Attribute Node

Created from an Attribute Information Item and takes the properties local-name, namespace-name and owner element and their values from the respective attribute information item properties. When this node is created from such values, two properties and values are defined. Firstly the string-value property is defined with the normalized value as specified by [XML]. An attribute whose normalized value is a zero-length string is not treated specially: it results in an attribute node whose string-value is a zero-length string. Secondly the URI property is defined with a string value of the concatenation of the value of the namespace-name property and the value of the local-name property.

3.5 Text Node

Created from a sequence of one or more consecutive Character Information Items. Has the single property string-value which has the value of the string made from concatenating the character code property of each of the character information items. [NOTE: Identical to XPath.]

3.6 Identifier Node

A node for a typed identifier which can have the following three properties: identifier, identifier-type and string-value. These nodes are created by giving two values for the identifier and identifier-type properties. The identifier property takes a string value and the identifier-type property can take values "URI" or "bnodeID".

The string-value property is defined from the other properties as follows: If identifier-type is "URI" then the value is the concatenation of "<", the value of the identifier property and ">". If identifier-type is "bnodeID" then the value is the concatenation of "_:" and the value of the identifier property.
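The string-value rule above can be sketched as follows. The class name Identifier and the Python attribute names are illustrative only; the two identifier-type values are the ones given in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Hypothetical model of the identifier node of section 3.6."""
    identifier: str
    identifier_type: str  # "URI" or "bnodeID"

    def __post_init__(self):
        if self.identifier_type not in ("URI", "bnodeID"):
            raise ValueError("identifier-type must be 'URI' or 'bnodeID'")

    @property
    def string_value(self) -> str:
        if self.identifier_type == "URI":
            # concatenation of "<", the identifier and ">"
            return "<" + self.identifier + ">"
        # concatenation of "_:" and the identifier
        return "_:" + self.identifier


uri_node = Identifier("http://example.org/a", "URI")
blank_node = Identifier("b1", "bnodeID")
print(uri_node.string_value, blank_node.string_value)
```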

3.7 Information Set Mapping

To transform the Infoset into the sequence of nodes, each information item is transformed as described above to generate a tree of nodes with properties and values. Each element node is then replaced as described below to turn the tree of nodes into a sequence in document order.

  1. The original element node
  2. The value of the children property, a possibly empty ordered list of nodes.
  3. An end element node
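The replacement of each element node by the three-part sequence above can be sketched with Python's xml.etree.ElementTree. This is an approximation under stated assumptions: ElementTree is not an Infoset implementation, attributes are not shown, and the tuple shapes and the function name flatten are invented for illustration.

```python
import xml.etree.ElementTree as ET


def flatten(elem):
    """Yield the document-order node sequence for elem."""
    # 1. The original element node
    yield ("start_element", elem.tag)
    # 2. The children, a possibly empty ordered list of nodes
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from flatten(child)
        if child.tail:
            # text following the child inside the same parent
            yield ("text", child.tail)
    # 3. An end element node
    yield ("end_element",)


doc = ET.fromstring("<a><b>hi</b><c/></a>")
events = list(flatten(doc))
for event in events:
    print(event)
```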

4 Notation

4.1 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 ([KEYWORDS]).

4.2 Grammar Notation

The following notation is used for describing the nodes and grammar EBNF.

Notation for nodes and grammar EBNF.

Notation                            Meaning
property=value                      A node property with a given value
root(prop1=value1,                  A root node with properties
    prop2=value2, ...)
start_element(prop1=value1,         A sequence of an element node with
    prop2=value2, ...)              properties, a possibly empty list of nodes
children                            as element content and an end element node
end_element()
attribute(prop1=value1,             An attribute node with properties
    prop2=value2, ...)
identifier(prop1=value1,            An identifier node with properties
    prop2=value2, ...)
text()                              A text node
base-uri                            The value of the base-uri property of the root node
list(item1, item2, ...); list()     An ordered list of items in document order; an empty list
set(item1, item2, ...); set()       An unordered set of items; an empty set
*                                   Zero or more of preceding term
?                                   Zero or one of preceding term
+                                   One or more of preceding term
A | B | ...                         The A, B, ... terms are alternatives.
A - B                               The term A but not the term B
"ABC"                               A string of characters A, B, C in order.
concat(A, B, ..)                    A string created by concatenating the terms in order.
anyURI                              Any legal URI.
anyString                           Any string.
rdf:X                               The URI formed by concatenating the RDF Namespace URI with "X"

4.3 Notation Forms

The following notation forms are used to indicate:

A grammar production over a sequence of nodes derived from the Infoset in the notation described in section 4.2.

A sequence of lines of N-Triples output from a grammar production adding to an RDF model.

4.4 The RDF Namespace

The RDF Namespace URI is http://www.w3.org/1999/02/22-rdf-syntax-ns#

The namespace contains the following names only:

RDF Description
ID about bagID parseType resource
type li _n

where n is a non-negative integer.

Implementors Note: The names aboutEach and aboutEachPrefix were removed from the language by the RDF Core WG - see the issues rdfms-abouteach and rdfms-abouteachprefix for further information.

5 RDF/XML Grammar

5.1 Grammar start

If the RDF/XML is standalone XML content, then the grammar starts with Root Node doc.

If the content is known to be RDF/XML by context, such as when RDF/XML is embedded inside other XML content, then the grammar can either start at Element Node RDF (only when an element is legal at that point in the XML) or at production nodeElementList (only when element content is legal, since this is a list of elements). For such embedded RDF/XML, the base-uri value must be initialised from the containing XML since no Root Node will be available. Note that if such embedding occurs, the grammar may be entered several times but no state is expected to be preserved.

5.2 Production doc

root(document-element=RDF,
    children=list(RDF))

5.3 Production RDF

start_element(URI = rdf:RDF,
    attributes=set())
nodeElementList
end_element()

5.4 Production nodeElementList

ws* (nodeElement ws* )*

5.5 Production nodeElement

start_element(URI=anyURI,
    attributes=set((idAttr | aboutAttr )?, bagIdAttr?, propertyAttr*))
propertyEltList
end_element()

The processing of some of the attributes has to be done before other work such as dealing with child nodes or other attributes. These can be processed in any order:

Next, if there is a propertyAttr attribute a on element e with a.URI = rdf:li then apply the list expansion rules on element e in section 5.27 to generate a new URI u, and set the value of a.URI to be u.

The following can then be performed in any order:

If an attribute a with a.URI = rdf:bagID is present, create a new node n = identifier(identifier=concat(base-uri, "#", a.string-value), identifier-type="URI") and add the following statement to the model:

n.string-value <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> .

Then all statements generated above (except the previous statement) are reified with node n using the reification rules in section 5.26.
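A minimal sketch of the rdf:bagID step: it builds the string-value of node n from the base-uri and the attribute value, then formats the rdf:type statement as an N-Triples line. The function name and the example base URI are assumptions, and the subsequent reification of the other statements is not shown.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def bag_type_statement(base_uri: str, bag_id: str) -> str:
    """Return the N-Triples line typing the bag node as rdf:Bag."""
    # n = identifier(identifier=concat(base-uri, "#", a.string-value),
    #                identifier-type="URI"), so n.string-value is "<...>"
    n_string_value = "<" + base_uri + "#" + bag_id + ">"
    return f"{n_string_value} <{RDF_NS}type> <{RDF_NS}Bag> ."


# Hypothetical base URI and bagID value
line = bag_type_statement("http://example.org/doc", "b1")
print(line)
```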

5.6 Production ws

White space as defined by [XML] definition White Space Rule [3] S in section Common Syntactic Constructs

5.7 Production propertyEltList

ws* (propertyElt ws* ) *

5.8 Production propertyElt

resourcePropertyElt | literalPropertyElt | parseTypeLiteralPropertyElt | parseTypeResourcePropertyElt | parseTypeOtherPropertyElt | emptyPropertyElt

If element e has e.URI = rdf:li then apply the list expansion rules on element e.parent in section 5.27 to give a new URI u and set the value of e.URI to be u.

Note: It is expected that in future the number of values of the rdf:parseType attribute will increase, and those values will probably be XML QNames.

5.9 Production resourcePropertyElt

start_element(URI=anyURI,
    attributes=set(idAttr?))
nodeElement
end_element()

For element e, and the single contained nodeElement n the following statement is added to the model:

e.parent.subject.string-value <e.URI> n.subject.string-value .

If the rdf:ID attribute a is given, the above statement is reified with identifier(identifier=concat(base-uri, "#", a.string-value), identifier-type="URI") using the reification rules in section 5.26.

5.10 Production literalPropertyElt

start_element(URI=anyURI,
    attributes=set(idAttr?))
text()
end_element()

Note: The empty literal case is defined in production emptyPropertyElt

For element e, and the text node t the following statement is added to the model:

e.parent.subject.string-value <e.URI> "t.string-value" .

If the rdf:ID attribute a is given, the above statement is reified with identifier(identifier=concat(base-uri, "#", a.string-value), identifier-type="URI") using the reification rules in section 5.26.
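The statement generated by this production can be sketched as a string-formatting step. The function names and example values are hypothetical, and the escaping shown covers only backslash, double quote and common control characters; it is a simplification and not a full implementation of N-Triples string escaping.

```python
def escape_literal(text: str) -> str:
    """Simplified escaping for an N-Triples string (not exhaustive)."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t"))


def literal_statement(parent_subject: str, property_uri: str, text: str) -> str:
    # e.parent.subject.string-value <e.URI> "t.string-value" .
    return f'{parent_subject} <{property_uri}> "{escape_literal(text)}" .'


# Hypothetical subject, property URI and text content
stmt = literal_statement("<http://example.org/doc>",
                         "http://example.org/terms#title",
                         'A "test"')
print(stmt)
```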

5.11 Production parseTypeLiteralPropertyElt

start_element(URI=anyURI,
    attributes=set(idAttr?, parseLiteral))
literal
end_element()

For element e and the literal l, if l is empty then the statement object value is "" and the following statement is added to the model:

e.parent.subject.string-value <e.URI> "" .

Test: Indicated by test009.rdf and test009.nt

Otherwise, the following statement is added to the model:

e.parent.subject.string-value <e.URI> l.string-value .

If the rdf:ID attribute a is given, the above statement is reified with identifier(identifier=concat(base-uri, "#", a.string-value), identifier-type="URI") using the reification rules in section 5.26.

Open Issue: The result of a literal from rdf:parseType="Literal" content has not yet been decided by the RDF Core WG; it is dependent on the resolution of several open issues. One possible method would be to serialise it into a string but that has several problems including use of namespaces. Another could be to use the XML Canonicalisation W3C Recommendation but since this document is a revision of an earlier syntax, it may be difficult to require the use of this newer standard.

5.12 Production parseTypeResourcePropertyElt

start_element(URI=anyURI,
    attributes=set(idAttr?, parseResource))
propertyEltList
end_element()

Generate a local identifier i and use it to create a new node n with the value of identifier(identifier=i, identifier-type="bnodeID").

Add the following statement to the model:

e.parent.subject.string-value <e.URI> n.string-value .

Test: Indicated by test004.rdf and test004.nt

If the rdf:ID attribute a is given, the statement above is reified with identifier(identifier=concat(base-uri, "#", a.string-value), identifier-type="URI") using the reification rules in section 5.26.

If the element content c is not empty, then use node n to create a new sequence of nodes as follows:

start_element(URI=rdf:Description,
    subject=n,
    attributes=set(bagIdAttr=a))
c
end_element()

(bagIdAttr is only set if a is given). Then process the resulting sequence using production nodeElement.

5.13 Production parseTypeOtherPropertyElt

start_element(URI=anyURI,
    attributes=set(idAttr?, parseOther))
propertyEltList
end_element()

The processing of rdf:parseType string values other than "Resource" or "Literal" is currently to treat the content as if it were "Literal". Processing MUST then continue at production parseTypeLiteralPropertyElt.

Note: It is RECOMMENDED, but not REQUIRED, that the rdf:parseType value is made available to user applications, possibly as part of the literal value. This note depends on the resolution of some open RDF Core WG issues so may be clarified further in future drafts.
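The current handling of rdf:parseType values can be summarised in a short sketch. The function name is hypothetical and the production names are returned as plain strings for illustration only.

```python
def production_for_parse_type(value: str) -> str:
    """Return the name of the production that handles this rdf:parseType."""
    if value == "Resource":
        return "parseTypeResourcePropertyElt"
    # "Literal", and currently any other string value, is treated as Literal
    return "parseTypeLiteralPropertyElt"


for value in ("Resource", "Literal", "SomethingElse"):
    print(value, "->", production_for_parse_type(value))
```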

5.14 Production emptyPropertyElt

start_element(URI=anyURI,
    attributes=set((idAttr | resourceAttr)?, bagIdAttr?, propertyAttr*))
end_element()

Choose one of the following combinations of allowed attributes. Note in particular that rdf:ID and rdf:resource are alternatives, or both can be omitted and furthermore that bagID cannot be used when there are no propertyAttr given.

5.15 Production idAttr

attribute(URI = rdf:ID,
    string-value=rdf-id)

5.16 Production aboutAttr

attribute(URI = rdf:about,
    string-value=URI-reference)

5.17 Production bagIdAttr

attribute(URI = rdf:bagID,
    string-value=rdf-id)

5.18 Production propertyAttr

attribute(URI=anyURI - ( rdf:RDF | rdf:Description | rdf:ID | rdf:about | rdf:bagID | rdf:parseType | rdf:resource ),
    string-value=anyString)
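The URI exclusion in production propertyAttr amounts to a set-membership test. The sketch below uses the seven excluded names listed in the production; the function and constant names are hypothetical.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# URIs that production propertyAttr excludes from anyURI
EXCLUDED_URIS = frozenset(
    RDF_NS + name
    for name in ("RDF", "Description", "ID", "about",
                 "bagID", "parseType", "resource")
)


def is_property_attr_uri(uri: str) -> bool:
    """True if an attribute with this URI can match propertyAttr."""
    return uri not in EXCLUDED_URIS


print(is_property_attr_uri(RDF_NS + "type"))
print(is_property_attr_uri(RDF_NS + "about"))
```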

5.19 Production resourceAttr

attribute(URI = rdf:resource,
    string-value=URI-reference)

5.20 Production parseLiteral

attribute(URI = rdf:parseType,
    string-value="Literal")

5.21 Production parseResource

attribute(URI = rdf:parseType,
    string-value="Resource")

5.22 Production parseOther

attribute(URI = rdf:parseType,
    string-value=anyString - ("Resource" | "Literal") )

5.23 Production URI-reference

CDATA interpreted as a URI reference defined in Uniform Resource Identifiers (URI) ([URIS]) BNF production URI-reference.

5.24 Production literal

Any XML element content that is allowed according to [XML] definition Content of Elements Rule [43] content in section 3.1 Start-Tags, End-Tags, and Empty-Element Tags.

5.25 Production rdf-id

CDATA matching any legal [XML] token Nmtoken

ISSUE: Should this be changed from any legal XML Nmtoken to be the same as that for XML IDs? In [XML] XML IDs must match Validity constraint: ID which requires the identifiers to match the Name production - a more restricted identifier than Nmtoken.

5.26 Reification Rules

For a statement with terms s, p and o corresponding to the N-Triples:

s p o .

add the following statements to the model using the given Identifier Node r:

r.string-value <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> s .
r.string-value <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> p .
r.string-value <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> o .
r.string-value <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
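The four reification statements can be sketched as a function over already-formatted N-Triples terms. The function name reify and the example terms are assumptions; r stands for the string-value of the given Identifier Node.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(r: str, s: str, p: str, o: str) -> list:
    """Return the four N-Triples lines reifying the statement s p o."""
    return [
        f"{r} <{RDF_NS}subject> {s} .",
        f"{r} <{RDF_NS}predicate> {p} .",
        f"{r} <{RDF_NS}object> {o} .",
        f"{r} <{RDF_NS}type> <{RDF_NS}Statement> .",
    ]


# Hypothetical terms, each already in N-Triples form
lines = reify("<http://example.org/doc#s1>",
              "<http://example.org/doc>",
              "<http://example.org/terms#title>",
              '"Example"')
print("\n".join(lines))
```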

5.27 List Expansion Rules

For the given element e, generate a new URI u with value concat("http://www.w3.org/1999/02/22-rdf-syntax-ns#_", e.li-counter), increment the value of the e.li-counter property by 1 and return u.
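A minimal sketch of the list expansion rule follows. The Element class is a hypothetical stand-in that carries only the li-counter property with its initial value of 1 from section 3.2.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class Element:
    """Hypothetical element node holding only the li-counter property."""
    def __init__(self):
        self.li_counter = 1


def expand_li(e: Element) -> str:
    """Generate the next rdf:_n URI for element e and advance its counter."""
    u = RDF_NS + "_" + str(e.li_counter)
    e.li_counter += 1
    return u


e = Element()
first = expand_li(e)
second = expand_li(e)
print(first)
print(second)
```

Because the counter lives on the element, each container element numbers its rdf:li children independently.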

6 Serialising an RDF Graph to RDF/XML

It is not possible for all graphs that can be expressed in the RDF Model Theory ([RDF-MODEL]) to be encoded in this syntax. If you do a round trip from RDF/XML to RDF graph and then back to RDF/XML the meaning will be the same but don't expect the RDF/XML that comes out to be exactly the same.

There are two different approaches to serializing RDF.

The basic approach uses the basic RDF syntax from [RDFMS]. In this:

The basic serialization is recommended for applications in which the output RDF/XML is to be used only in further RDF processing. Where the intent is for the output RDF/XML file to be read by people, the basic serialization proves unsatisfactory. The basic serialization does not conform to more restricted sub-dialects of RDF, such as RSS [RSS] or CC/PP [CC/PP]. Hence, it is not appropriate for such applications, for which dialect-specific serializers are needed.

If more human readable output is needed the following factors should be considered:

It is not possible to use the RDF/XML serialization for serializing an RDF graph in which any triple has a property label which cannot be expressed as a qname.
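One way a serializer might test whether a property URI can be written as a qname is to look for the longest trailing part that is a legal local name. The sketch below is an assumption-laden approximation: it accepts only ASCII letters, digits, underscore, hyphen and full stop, which is narrower than the name characters [XML-NS] actually allows, and the function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# ASCII-only approximation of an XML local name (NCName); this is a
# simplifying assumption, the real definition permits many more characters.
_LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*$")


def split_property_uri(uri: str) -> Optional[Tuple[str, str]]:
    """Split uri into (namespace URI, local name), or None if impossible."""
    match = _LOCAL_NAME.search(uri)  # leftmost match gives the longest suffix
    if match is None or match.start() == 0:
        return None                  # no local name, or empty namespace
    return uri[:match.start()], uri[match.start():]


print(split_property_uri("http://example.org/terms#title"))
print(split_property_uri("http://example.org/123"))
```

A URI such as the second one, whose tail starts with a digit, has no usable local name under this approximation, so a triple using it as a property could not be serialized.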

An approach to serializing RDF/XML using the full grammar in a top-down recursive descent fashion is discussed in [UNPARSING].

7 Acknowledgements (Informative)

The following people provided valuable contributions to the document:

8 References

Normative References

[RDFMS]
Resource Description Framework (RDF) Model and Syntax Specification, O. Lassila and R. Swick, Editors. World Wide Web Consortium. 22 February 1999. This version is http://www.w3.org/TR/1999/REC-rdf-syntax-19990222. The latest version of RDF M&S is available at http://www.w3.org/TR/REC-rdf-syntax.
[XML]
Extensible Markup Language (XML) 1.0, Second Edition, T. Bray, J. Paoli, C.M. Sperberg-McQueen and E. Maler, Editors. World Wide Web Consortium. 6 October 2000. This version is http://www.w3.org/TR/2000/REC-xml-20001006. The latest version of XML is available at http://www.w3.org/TR/REC-xml.
[XML-NS]
Namespaces in XML, T. Bray, D. Hollander and A. Layman, Editors. World Wide Web Consortium. 14 January 1999. This version is http://www.w3.org/TR/1999/REC-xml-names-19990114. The latest version of Namespaces in XML is available at http://www.w3.org/TR/REC-xml-names.
[INFOSET]
XML Information Set, J. Cowan and R. Tobin, Editors. World Wide Web Consortium. 24 October 2001. This version is http://www.w3.org/TR/2001/REC-xml-infoset-20011024. The latest version of XML Information set is available at http://www.w3.org/TR/xml-infoset.
[URIS]
RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax, T. Berners-Lee, R. Fielding and L. Masinter, IETF, August 1998. This document is http://www.isi.edu/in-notes/rfc2396.txt.
[RDF-TESTS]
RDF Test Cases, A. Barstow and D. Beckett, Editors. Work in progress. World Wide Web Consortium, 15 November 2001. This version of the RDF Test Cases is http://www.w3.org/TR/2001/WD-rdf-testcases-20011115/. The latest version of the RDF Test Cases is at http://www.w3.org/TR/rdf-testcases.
[RDF-MODEL]
RDF Model Theory, P. Hayes, Editor. Work in progress. World Wide Web Consortium, 25 September 2001. This version of the RDF Model Theory is http://www.w3.org/TR/2001/WD-rdf-mt-20010925. The latest version of the RDF Model Theory is at http://www.w3.org/TR/2001/WD-rdf-mt.
[KEYWORDS]
RFC 2119 - Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, IETF. March 1997. This document is http://www.ietf.org/rfc/rfc2119.txt.

Informational References

[STRIPEDRDF]
RDF: Understanding the Striped RDF/XML Syntax, D. Brickley, W3C, 2001. This document is http://www.w3.org/2001/10/stripes/.
[XPATH]
XML Path Language (XPath) Version 1.0, J. Clark and S. DeRose, Editors. World Wide Web Consortium, 16 November 1999. This version of XPath is http://www.w3.org/TR/1999/REC-xpath-19991116. The latest version of XPath is at http://www.w3.org/TR/xpath.
[SAX2]
SAX Simple API for XML, version 2, D. Megginson, SourceForge, 5 May 2000. This document is http://sax.sourceforge.net/.
[RSS]
RDF Site Summary (RSS) 1.0, G. Beged-Dov, D. Brickley, R. Dornfest, I. Davis, L. Dodds, J. Eisenzopf, D. Galbraith, R.V. Guha, K. MacLeod, E. Miller, A. Swartz, E. van der Vlist, 2000. This document is http://purl.org/rss/1.0/spec.
[CC/PP]
Composite Capability/Preference Profiles (CC/PP): Structure and Vocabularies, G. Klyne, F. Reynolds, C. Woodrow, H. Ohto, World Wide Web Consortium Working Draft, 15 March 2001. This version is http://www.w3.org/TR/2001/WD-CCPP-struct-vocab-20010315/. The latest version of CC/PP structure and Vocabularies is available at http://www.w3.org/TR/CCPP-struct-vocab.
[UNPARSING]
Unparsing RDF/XML, J. J. Carroll, HP Labs Technical Report, HPL-2001-294, 2001.
[RELAXNG]
RELAX NG Specification, James Clark and MURATA Makoto, editors, OASIS, 3 December 2001. This version of RELAX NG is http://www.oasis-open.org/committees/relax-ng/spec-20011203.html. The latest is at http://relaxng.org/.
[RELAXNG-NX]
RELAX NG Non-XML Syntax, James Clark, 3 December 2001. This document is http://www.thaiopensource.com/relaxng/nonxml/.
XML Schema Part 0: Primer
XML Schema Part 0: Primer - W3C Recommendation, World Wide Web Consortium, 2 May 2001.
XML Schema Part 1: Structures
XML Schema Part 1: Structures - W3C Recommendation, World Wide Web Consortium, 2 May 2001.
XML Schema Part 2: Datatypes
XML Schema Part 2: Datatypes - W3C Recommendation, World Wide Web Consortium, 2 May 2001.
Schematron
Schematron, Rick Jelliffe, Academia Sinica Computing Centre, Taibei.

Appendix A: Issues affecting RDF/XML Syntax (Non-Normative)

This section records local issues to be resolved and issues that were reported to the RDF Core WG related to the XML syntax and their disposition. This section is not the definitive list or description of the latter - see the RDF Core WG issues list. Decided issues may also have associated test cases which can be found in the RDF Test Cases W3C Working Draft.

A.1: Document Issues / Tasks (Non-Normative)

task-striping

Add an introductory section on how the syntax works. A more descriptive version of Dan Brickley's RDF: Understanding the Striped RDF/XML Syntax

A.2: RDF Core WG Open Issues affecting RDF/XML Syntax (Non-Normative)

rdfms-nested-bagIDs

What triples are generated for nested description elements with bagIDs?

Action: Nested description elements with bagIDs generate the same triples as top-level description elements with bagIDs. Specifically triples generated as a result of the parent propertyElt element do not get reified and included in the bag.

rdfms-replace-value

Suggestion that the rdf:value property be replaced by rdf:toString.

Action: This will not be changed in the syntax but the usage of the rdf:value property will be described in the RDF primer and/or Model Theory.

rdfms-xml-literal-namespaces

How should a parser process namespaces in a literal which is XML markup?

Action: ?

rdfms-qname-uri-mapping

The mapping of QNames to URI's generates incorrect URI's.

Action: The algorithm to generate URIs for RDF concepts cannot be changed in the current syntax without breaking existing applications. To address the specific example in the issue of XML Schemas, RDF applications can use the namespace URI http://www.w3.org/2000/10/XMLSchema# in RDF/XML to generate the correct XML Schema concept URIs for properties, classes, etc. This approach has been used successfully with DAML+OIL, using RDF, RDFS and DAML terms along with XML Schema data types.

rdfms-validating-embedded-rdf

RDF embedded in XHTML and other XML documents is hard to validate

Action: Will not be addressed in the current RDF/XML syntax since it is likely to require changes that would not be backwards compatible. Some help with validation can be found with the schemas for XML validation in Appendix B.

rdfms-xml-base

How does xml-base affect RDF

Action: ?

mime-types-for-rdf-docs

What mime type should RDF Schema and other RDF documents have?

Action: This document [will] defines the syntax for Internet Media Type (or MIME Type) for application/rdf+xml and the registration of this type will be done when this document is stable. NOTE: This is an unregistered type at this time and should not be used in applications. See also the Draft for RDF Media Type registration by Aaron Swartz.

rdfms-reification-required

MUST a parser create bags of reified statements for all Description elements?

Action: No, only those which are explicitly reified using an rdf:ID on a propertyElt or by an rdf:bagID on the description.
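As a reminder of what explicit reification produces, the following Python sketch builds the four triples of the RDF Model & Syntax reification vocabulary (rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object) for one statement. The function name reify, the tuple representation of triples and the example.org URIs are assumptions made for illustration; this is not the mapping defined by this document.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(statement_uri, subject, predicate, obj):
    """Return the four triples describing one reified statement."""
    return [
        (statement_uri, RDF_NS + "type", RDF_NS + "Statement"),
        (statement_uri, RDF_NS + "subject", subject),
        (statement_uri, RDF_NS + "predicate", predicate),
        (statement_uri, RDF_NS + "object", obj),
    ]


# Hypothetical statement identified by rdf:ID="s1" on a property element.
triples = reify(
    "http://example.org/doc#s1",
    "http://example.org/page",
    "http://example.org/terms#creator",
    "A. N. Other",
)
for triple in triples:
    print(triple)
```

A parser following the action above would emit such triples only for statements that carry an rdf:ID on the property element or fall under an rdf:bagID.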

rdfms-not-id-and-resource-attr

The propertyElt production 6.12 of the grammar does not allow both an ID attribute and a resource attribute to be specified.

Action: The grammar has been [will be] modified to forbid the use of an rdf:ID attribute on an empty property element. This is consistent with using rdf:ID="attr" as an abbreviation for rdf:about="#attr" and removes the suggestion that it reifies a statement, which it never did in the original grammar.

rdfms-difference-between-ID-and-about

What is the difference between using an ID attribute to 'create' a new resource and an about attribute to refer to it?

Action: rdf:ID="attr" is an abbreviation for rdf:about="#attr" and the handling of rdf:ID has been [will be] updated to show this.
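The abbreviation can be sketched in Python as follows. The sketch assumes that the resulting relative URI reference is resolved against the URI of the containing document; the base URI and the function names id_to_about and resolve are hypothetical.

```python
from urllib.parse import urljoin


def id_to_about(id_value: str) -> str:
    """Rewrite an rdf:ID value as the equivalent rdf:about value."""
    return "#" + id_value


def resolve(base_uri: str, reference: str) -> str:
    """Resolve a (possibly relative) URI reference against a base URI."""
    return urljoin(base_uri, reference)


base = "http://example.org/doc.rdf"  # hypothetical document URI

# rdf:ID="attr" and rdf:about="#attr" identify the same resource.
print(resolve(base, id_to_about("attr")))
print(resolve(base, "#attr"))
```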

A.3: RDF Core WG Decided Issues affecting RDF/XML Syntax (Non-Normative)

rdf-ns-prefix-confusion

On 25th May 2001, the WG decided that ALL attributes must be namespace qualified. There is a description of the decision, including detail on the grammar productions affected and a collection of test cases.

Action: Removal of original grammar productions 6.6, 6.7, 6.8, 6.9, 6.11, 6.18, 6.32, 6.33
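A check for unqualified attributes might look like the following Python sketch. It relies on the standard xml.etree.ElementTree module, which reports namespace-qualified attribute names as "{namespace}local" and leaves unqualified names bare; the function name and the sample document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def unqualified_attributes(xml_text):
    """Return (element tag, attribute name) pairs for unqualified attributes."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        for name in elem.attrib:
            # Qualified names start with "{"; xmlns declarations are not
            # reported as attributes by ElementTree.
            if not name.startswith("{"):
                found.append((elem.tag, name))
    return found


SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description about="http://example.org/a"/>
  <rdf:Description rdf:about="http://example.org/b"/>
</rdf:RDF>"""

# Only the first rdf:Description uses an unqualified attribute.
print(unqualified_attributes(SAMPLE))
```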

rdfms-abouteachprefix

On 1st June 2001, the WG decided that aboutEachPrefix would be removed from the RDF Model and Syntax Recommendation on the grounds that there is a lack of implementation experience, and it therefore should not be in the recommendation. A future version of RDF may consider support for this feature.

Action: Removal of original grammar production 6.8

rdf-containers-syntax-ambiguity
rdf-containers-syntax-vs-schema

On 29th June 2001, the WG decided that containers will match the typed node production in the grammar (production 6.13) and that the container specific productions (productions 6.25 to 6.31) and any references to them be removed from the grammar. rdf:li elements will be translated to rdf:_nnn elements when they are found matching either a propertyElt (production 6.12) or a typedNode (production 6.13). The decision includes a set of test cases.

Action: Removal of original grammar productions 6.25, 6.26, 6.27, 6.28, 6.29, 6.30, 6.31
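The translation of rdf:li to rdf:_nnn can be sketched in Python. The sketch assumes one counter per parent element that starts at 1 and counts only rdf:li children in document order; how explicitly written rdf:_nnn elements interact with the counter is not covered. The function name number_li is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
LI_TAG = "{%s}li" % RDF_NS


def number_li(parent):
    """Return the child element names of parent with rdf:li renumbered."""
    counter = 0
    names = []
    for child in parent:
        if child.tag == LI_TAG:
            counter += 1
            names.append("{%s}_%d" % (RDF_NS, counter))
        else:
            names.append(child.tag)
    return names


doc = ET.fromstring(
    """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Bag>
    <rdf:li>first</rdf:li>
    <rdf:li>second</rdf:li>
  </rdf:Bag>
</rdf:RDF>"""
)
bag = doc[0]
print(number_li(bag))
```

Because rdf:Bag is now matched as an ordinary typed node, the same function applies to any node element that contains rdf:li children.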

rdfms-empty-property-elements

On 8th June 2001 the WG decided how empty property elements should be interpreted. The decision is fully represented by the test cases.

Action: Inserted pointers to the test cases into the grammar at the places where empty property elements are recognised.

rdfms-aboutEach-on-object

On 29th June 2001, the WG decided that rdf:aboutEach attributes are not allowed on an rdf:Description (or typed node) element which is the object of a statement.

Action: None needed - rdf:aboutEach removed from the language on 7th December 2001.

rdfms-syntax-desc-clarity

The language describing the syntax is unclear [in section 6]

On 26th October 2001, the WG decided that this issue is closed by the new approach to defining the syntax in this document.

Action: A main goal of this document is to make the syntax clearer and more precise. In particular the grammar section and the pointers to schemas for XML validation help address this.

rdfms-formal-grammar

A formal grammar for RDF.

On 26th October 2001, the WG decided that this issue is closed by the new approach to defining the syntax in this document.

Action: A main goal of this document is to make the syntax clearer and more precise. In particular the grammar section and the pointers to schemas for XML validation help address this.

rdfms-rdf-names-use

On 30th November 2001, the WG decided that this issue was closed by the following resolution.

Action: The use of rdf:RDF, rdf:ID, rdf:about, rdf:resource, rdf:bagID, rdf:parseType, rdf:aboutEach and rdf:li except as reserved names as specified in the grammar is an error. [Later rdf:aboutEach was removed from the language on 7th December 2001]

rdfms-abouteach

Processing rdf:aboutEach requires processing of sub-property relations.

On 7th December 2001, the WG decided to remove rdf:aboutEach from the language on the grounds that it is not widely used, it is not widely implemented correctly, it has confusing interactions with bagID as recorded in rdfms-abouteach, it does not scale since parsers have to save state, and this is the wrong layer in which to implement such functionality. (FIXME: This is an unofficial record of the resolution until the issue list is updated.)

Action: Removed from the grammar.

rdfms-propElt-id-with-dr

On 7th December 2001, the WG decided to remove rdf:aboutEach from the language and consequently this issue was closed.

Action: None needed.

A.4: RDF Core WG Postponed Issues affecting RDF/XML Syntax (Non-Normative)

rdfms-quoting

The syntax needs a more convenient way to express the reification of a statement.

On 26th October 2001, the WG decided that this issue was postponed for consideration by a future working group.

Action: None required.

rdfms-qnames-cant-represent-all-uris

The RDF/XML syntax cannot represent all possible Property URIs.

On 26th October 2001, the WG decided that this issue was postponed for consideration by a future working group.

Action: None required.
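The limitation arises because a property is written as an element or attribute name, so its URI has to split into a namespace URI followed by a legal XML local name. The Python sketch below searches for such a split. It approximates the XML name rules with an ASCII-only pattern, which is a simplifying assumption, and the function name and example URIs are hypothetical.

```python
import re

# ASCII-only approximation of an XML local name (assumption): a letter or
# underscore followed by letters, digits, '_', '.' or '-'.
LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*\Z")


def split_property_uri(uri):
    """Return (namespace, local name) using the longest legal local name.

    Returns None when no suffix of the URI is a legal local name.
    """
    for i in range(1, len(uri)):
        if LOCAL_NAME.match(uri[i:]):
            return uri[:i], uri[i:]
    return None


# A URI ending in a name can be written as a QName.
print(split_property_uri("http://example.org/terms/title"))

# A URI ending in digits only cannot: no suffix starts with a name character.
print(split_property_uri("http://example.org/terms#123"))
```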

rdfms-qnames-as-attrib-values

Suggestion that QNames should be allowed as values for attributes such as rdf:about.

On 26th October 2001, the WG decided that this issue was postponed for consideration by a future working group.

Action: None required.

rdfms-syntax-incomplete

The RDF/XML syntax can't represent an arbitrary graph structure.

On 26th October 2001, the WG decided that this issue was postponed for consideration by a future working group.

Action: None required.

B Syntax Schemas (Non-Normative)

Two schema language authors submitted schemas for RDF/XML based on the revised grammar in the previous version of this draft. We include pointers to these schemas for information purposes and an example schema; they are not part of this specification.

B.1 RELAX NG Schema - Non XML (Non-Normative)

This is an example schema in RELAX NG's non-XML format (for ease of reading) but applications should use the standard XML version. These formats are described in RELAX NG ([RELAXNG]) and RELAX NG Non-XML Syntax ([RELAXNG-NX]).

RELAX NG Schema (Non-XML) for RDF/XML
#
# RELAX NG Schema (non-XML) for RDF/XML Syntax
#
# This schema is for information only and NON-NORMATIVE
#
# It is based on one originally written by James Clark in
# http://lists.w3.org/Archives/Public/www-rdf-comments/2001JulSep/0248.html
# and updated with later changes.
#

namespace local = ""
namespace rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
datatypes xsd = "http://www.w3.org/2001/XMLSchema-datatypes"

start = doc
doc = 
  RDF

RDF =
  element rdf:RDF { nodeElementList }

nodeElementList = 
  nodeElement*

  # Should be something like:
  #  ws* , (  nodeElement , ws* )*
  # but RELAXNG does this by default, ignoring whitespace separating tags.

nodeElement =
  element * - (local:*
               |rdf:RDF
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource
               |rdf:li ) {
      (idAttr | aboutAttr )?, bagIdAttr?, propertyAttr*, propertyEltList
  }

  # FIXME: Not sure if it is possible to say "and not things
  # beginning with _ in the rdf: namespace".

ws = 
  " "

  # Not used in this RELAX NG schema; but should be any legal XML
  # whitespace defined by http://www.w3.org/TR/2000/REC-xml-20001006#NT-S


propertyEltList = 
  propertyElt*

  # Should be something like:
  #  ws* , ( propertyElt , ws* )*
  # but RELAXNG does this by default, ignoring whitespace separating tags.

propertyElt = 
  resourcePropertyElt | 
  literalPropertyElt | 
  parseTypeLiteralPropertyElt |
  parseTypeResourcePropertyElt |
  parseTypeOtherPropertyElt |
  emptyPropertyElt

resourcePropertyElt = 
  element * - (local:*
               |rdf:RDF|rdf:Description
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource) {
      idAttr?, nodeElement
  }

literalPropertyElt =
  element * - (local:*
               |rdf:RDF|rdf:Description
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource) {
      idAttr?, text 
  }

parseTypeLiteralPropertyElt = 
  element * - (local:*
               |rdf:RDF|rdf:Description
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource) {
      idAttr?, parseLiteral, literal 
  }

parseTypeResourcePropertyElt = 
  element * - (local:*
               |rdf:RDF|rdf:Description
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource) {
      idAttr?, parseResource, propertyEltList
  }

parseTypeOtherPropertyElt = 
  element * - (local:*
               |rdf:RDF|rdf:Description
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource) {
      idAttr?, parseOther, any
  }

emptyPropertyElt =
  element * - (local:*
               |rdf:RDF|rdf:Description
               |rdf:ID|rdf:about
               |rdf:bagID|rdf:parseType|rdf:resource) {
      (idAttr | resourceAttr)?, bagIdAttr?, propertyAttr*
  }

idAttr = 
  attribute rdf:ID { 
      IDsymbol 
  }

aboutAttr = 
  attribute rdf:about { 
      URI-reference 
  }

bagIdAttr = 
  attribute rdf:bagID {
      IDsymbol
  }

propertyAttr = 
  attribute * - (local:* 
                 |rdf:RDF|rdf:Description
                 |rdf:ID|rdf:about
                 |rdf:bagID|rdf:parseType|rdf:resource) {
      string
  }

resourceAttr = 
  attribute rdf:resource {
      URI-reference 
  }

parseLiteral = 
  attribute rdf:parseType {
      "Literal" 
  }

parseResource = 
  attribute rdf:parseType {
      "Resource" 
  }

parseOther = 
  attribute rdf:parseType {
      text
  }

URI-reference = 
  string

literal =
  any

IDsymbol = 
  xsd:NMTOKEN

any =
  mixed { element * { attribute * { text }*, any }* }

B.2 Other Syntax Schemas (Non-Normative)

Two schema language authors submitted schemas for RDF/XML based on the new grammar in the previous version of this draft. We include pointers to these schemas for information purposes; they are not part of this specification.

C Original Grammar

This section contains the EBNF grammar of the RDF/XML syntax from the Formal Grammar for RDF section of RDF Model & Syntax ([RDFMS]). The only changes made here were to make it legal XHTML via tidy and to change the links to the productions to point to those in the original document.

  [6.1] RDF            ::= ['<rdf:RDF>'] obj* ['</rdf:RDF>']
  [6.2] obj            ::= description | container
  [6.3] description    ::= '<rdf:Description' idAboutAttr? bagIdAttr? propAttr* '/>'
                         | '<rdf:Description' idAboutAttr? bagIdAttr? propAttr* '>'
                                propertyElt* '</rdf:Description>'
                         | typedNode
  [6.4] container      ::= sequence | bag | alternative
  [6.5] idAboutAttr    ::= idAttr | aboutAttr | aboutEachAttr
  [6.6] idAttr         ::= ' ID="' IDsymbol '"'
  [6.7] aboutAttr      ::= ' about="' URI-reference '"'
  [6.8] aboutEachAttr  ::= ' aboutEach="' URI-reference '"'
                         | ' aboutEachPrefix="' string '"'
  [6.9] bagIdAttr      ::= ' bagID="' IDsymbol '"'
 [6.10] propAttr       ::= typeAttr
                         | propName '="' string '"' (with embedded quotes escaped)
 [6.11] typeAttr       ::= ' type="' URI-reference '"'
 [6.12] propertyElt    ::= '<' propName idAttr? '>' value '</' propName '>'
                         | '<' propName idAttr? parseLiteral '>'
                               literal '</' propName '>'
                         | '<' propName idAttr? parseResource '>'
                               propertyElt* '</' propName '>'
                         | '<' propName idRefAttr? bagIdAttr? propAttr* '/>'
 [6.13] typedNode      ::= '<' typeName idAboutAttr? bagIdAttr? propAttr* '/>'
                         | '<' typeName idAboutAttr? bagIdAttr? propAttr* '>'
                               propertyElt* '</' typeName '>'
 [6.14] propName       ::= Qname
 [6.15] typeName       ::= Qname
 [6.16] idRefAttr      ::= idAttr | resourceAttr
 [6.17] value          ::= obj | string
 [6.18] resourceAttr   ::= ' resource="' URI-reference '"'
 [6.19] Qname          ::= [ NSprefix ':' ] name
 [6.20] URI-reference  ::= string, interpreted per [URI]
 [6.21] IDsymbol       ::= (any legal XML name symbol)
 [6.22] name           ::= (any legal XML name symbol)
 [6.23] NSprefix       ::= (any legal XML namespace prefix)
 [6.24] string         ::= (any XML text, with "<", ">", and "&" escaped)
 [6.25] sequence       ::= '<rdf:Seq' idAttr? '>' member* '</rdf:Seq>'
                         | '<rdf:Seq' idAttr? memberAttr* '/>'
 [6.26] bag            ::= '<rdf:Bag' idAttr? '>' member* '</rdf:Bag>'
                         | '<rdf:Bag' idAttr? memberAttr* '/>'
 [6.27] alternative    ::= '<rdf:Alt' idAttr? '>' member+ '</rdf:Alt>'
                         | '<rdf:Alt' idAttr? memberAttr? '/>'
 [6.28] member         ::= referencedItem | inlineItem
 [6.29] referencedItem ::= '<rdf:li' resourceAttr '/>'
 [6.30] inlineItem     ::= '<rdf:li' '>' value </rdf:li>'
                         | '<rdf:li' parseLiteral '>' literal </rdf:li>'
                         | '<rdf:li' parseResource '>' propertyElt* </rdf:li>'
 [6.31] memberAttr     ::= ' rdf:_n="' string '"' (where n is an integer)
 [6.32] parseLiteral   ::= ' parseType="Literal"'
 [6.33] parseResource  ::= ' parseType="Resource"'
 [6.34] literal        ::= (any well-formed XML)

(Note: there are EBNF bugs in the 6.30 production where the </rdf:li> tags are not fully enclosed in quotes as '</rdf:li>')

D Updated Grammar after RDF Core decisions

This section updates the original grammar in Appendix C by amending and deleting various productions according to the recorded RDF Core WG decisions. Some productions are also removed since they are no longer needed once the above changes are made.

Key:
This text should be added. If it is not, your browser will not display this section properly.
This text should be deleted. If it is not, your browser will not display this section properly.

Updated RDF/XML grammar productions
Production  Production
Number      Name            Definition
6.1         RDF             "<rdf:RDF>" obj description* "</rdf:RDF>"
                            | description
6.2         obj             description | container
6.3         description     "<rdf:Description" idAboutAttr? bagIdAttr? propAttr* "/>"
                            | "<rdf:Description" idAboutAttr? bagIdAttr? propAttr* ">"
                                  propertyElt* "</rdf:Description>"
                            | typedNode
6.4         container       sequence | bag | alternative
6.5         idAboutAttr     idAttr | aboutAttr | aboutEachAttr
6.6         idAttr          " rdf:ID=\"" IDsymbol "\""
6.7         aboutAttr       " rdf:about=\"" URI-reference "\""
6.8         aboutEachAttr   " rdf:aboutEach=\"" URI-reference "\""
                            | " aboutEachPrefix=\"" string "\""
6.9         bagIdAttr       " rdf:bagID=\"" IDsymbol "\""
6.10        propAttr        typeAttr
                            | propName "=\"" string "\"" (with embedded quotes escaped)
6.11        typeAttr        " rdf:type=\"" URI-reference "\""
6.12        propertyElt     "<" propName idAttr? ">" value "</" propName ">"
                            | "<" propName idAttr? parseLiteral ">"
                                  literal "</" propName ">"
                            | "<" propName idAttr? parseResource ">"
                                  propertyElt* "</" propName ">"
                            | "<" propName idRefAttr? bagIdAttr? propAttr* "/>"
6.13        typedNode       "<" typeName idAboutAttr? bagIdAttr? propAttr* "/>"
                            | "<" typeName idAboutAttr? bagIdAttr? propAttr* ">"
                                  propertyElt* "</" typeName ">"
6.14        propName        Qname
6.15        typeName        Qname
6.16        idRefAttr       idAttr | resourceAttr
6.17        value           obj description | string
6.18        resourceAttr    " rdf:resource=\"" URI-reference "\""
6.19        Qname           [ NSprefix ":" ] name
6.20        URI-reference   string, interpreted per [URI]
6.21        IDsymbol        any legal XML name symbol
6.22        name            any legal XML name symbol
6.23        NSprefix        any legal XML namespace prefix
6.24        string          any XML text, with "<", ">", and "&" escaped
6.25        sequence        "<rdf:Seq" idAttr? ">" member* "</rdf:Seq>"
                            | "<rdf:Seq" idAttr? memberAttr* "/>"
6.26        bag             "<rdf:Bag" idAttr? ">" member* "</rdf:Bag>"
                            | "<rdf:Bag" idAttr? memberAttr* "/>"
6.27        alternative     "<rdf:Alt" idAttr? ">" member+ "</rdf:Alt>"
                            | "<rdf:Alt" idAttr? memberAttr? "/>"
6.28        member          referencedItem | inlineItem
6.29        referencedItem  "<rdf:li" resourceAttr "/>"
6.30        inlineItem      "<rdf:li" ">" value </rdf:li>"
                            | "<rdf:li" parseLiteral ">" literal </rdf:li>"
                            | "<rdf:li" parseResource ">" propertyElt* </rdf:li>"
6.31        memberAttr      " rdf:_n=\"" string "\"" (where n is an integer)
6.32        parseLiteral    " rdf:parseType=\"Literal\""
6.33        parseResource   " rdf:parseType=\"Resource\""
6.34        literal         any well-formed XML