Alternative RDF/A Rules

Motivation

The reviewed RDF/A document allowed many different ways of determining the resource object appropriate for an element. It could be specified using an @href, @about, @id, or @nodeID on the element itself, or using an @about, @id, or @nodeID on a child element, or by a generated id from a child or from the element itself.

In total the RDF/A document had the following number of rules:

Rule Type	Count
Subject	8
Predicate with resource object	5
Predicate with literal object	3
Resource object	9
Literal object	3

This makes a total of 8*(5*9+3*3)=432 different ways of matching a triple (a few are impossible because, for example, the subject and object patterns conflict).

To propose a variant rule system with the same main features, I have come up with a different set of rules, in which the resource object and subject identification rules are inspired by Adobe's XMP subset of RDF/XML, with its emphasis on rdf:parseType="Resource". The numbers for these rules are as follows:

Rule Type	Count
Subject	8
Predicate with resource object	3
Predicate with literal object	2
Resource object	3
Literal object	5

This makes a total of 8*(3*3+2*5)=152 triple patterns. Also, the most complex paragraph, 4.4.3, has been dropped, so that these patterns are, in principle, easier to implement.
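The two totals can be checked with a few lines of Python. This is only a sketch of the counting used above; the function name pattern_count and its parameter names are made up for illustration.

```python
def pattern_count(subject, pred_res, pred_lit, res_obj, lit_obj):
    """Subject rules times the (predicate, object) pairings for each object kind."""
    return subject * (pred_res * res_obj + pred_lit * lit_obj)

# Rule counts taken from the two tables above.
reviewed = pattern_count(subject=8, pred_res=5, pred_lit=3, res_obj=9, lit_obj=3)
proposed = pattern_count(subject=8, pred_res=3, pred_lit=2, res_obj=3, lit_obj=5)

print(reviewed, proposed)  # 432 152
```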

The numbers above assume an implementation strategy that tries to match subjects, predicates and objects as in the RDF/A description. Since the implementations compared use relatively simple XPath expressions, some of the cases that are combined in the English text are split apart before counting. Whether that is appropriate is a judgement call.

Basic Idea

RDF/A faces the same problem as RDF/XML: it has to be reasonably easy to author, in a number of different use cases, for people who may not want to think deeply about the RDF graph, and yet it must still produce a reasonable graph.

RDF/XML suffers from too much complexity.

A good effort to reduce this complexity is found in Adobe's XMP. This is particularly interesting, because like RDF/A it is not a striped syntax. XMP uses rdf:parseType="Resource" on every element, so that all non-top-level elements are property elements, describing the object of their parent element (if it has one), or the single identified resource in their parent element (if it is a top-level element).

Since the complexity of RDF/A seems to stem partly from the objects, partly from the subjects, and partly from treating link and meta elements differently from others, we make the following core changes:

If no resource object is specified, then a blank object is generated, associated with the element.
The @nodeID attribute denotes the object of a triple, not the subject.
If no subject is specified, then the element is taken as describing the object (if any) of its parent element.
If no subject is specified, and the parent element does not describe a triple, but does describe a single resource (e.g. with an @about attribute), then that resource is the resource being described.

Other changes include:

Allowing in-line literal content, both for typed literals, and for plain literals (using an @plain='true' attribute).
Dropping predicate inheritance.
Using two different default predicates (one for literal objects, one for resource objects) when an object is specified but not a predicate.
@href alone is not sufficient to trigger the default predicate; an @href together with an @about or @id is needed.
Adding span elements to hold xml:lang.
link and meta are not treated specially.

Rules

This is loosely modelled on section 4 of RDF/A, but the order is changed.

All negative conditions are expressed explicitly.

Identifying Literal Object

Plain literal with @content

If @content attribute and no @datatype, then the @content value is a (potential) plain literal object, with in-scope xml:lang.

Typed literal with @content

If @content attribute and @datatype, then the @content value is lexical form and @datatype is datatype of typed literal object.

Plain in-line literal

If @plain attribute with value 'true' then concatenation of text descendant nodes is lexical form of plain literal object, with in-scope xml:lang.

Typed in-line literal

If @datatype attribute and no @content, then concatenation of text descendant nodes is lexical form of typed literal object, @datatype is type.

In-line rdf:XMLLiteral

If no @datatype and no @content and no @plain attributes then Exclusive Canonicalization of element content is lexical form of typed literal object, of type rdf:XMLLiteral.
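The five literal rules can be read as one decision procedure. The following is a minimal sketch using Python's xml.etree.ElementTree; the function name literal_object and the inherited_lang parameter are hypothetical. It makes several assumptions that the text does not settle: @plain is checked before @datatype when both are present without @content, the @datatype value is returned as written (no QName expansion), and a plain serialization of the element content stands in for Exclusive Canonicalization, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_XMLLITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


def literal_object(elem, inherited_lang=None):
    """Return (lexical_form, datatype, lang) for the element's potential literal object.

    inherited_lang is the xml:lang in scope from ancestors; ElementTree keeps no
    parent pointers, so the caller has to pass it down.
    """
    lang = elem.get(XML_LANG, inherited_lang)
    content = elem.get("content")
    datatype = elem.get("datatype")
    text = "".join(elem.itertext())  # concatenation of descendant text nodes

    if content is not None and datatype is None:
        return (content, None, lang)          # plain literal with @content
    if content is not None:
        return (content, datatype, None)      # typed literal with @content
    if elem.get("plain") == "true":
        return (text, None, lang)             # plain in-line literal (assumed to win over @datatype)
    if datatype is not None:
        return (text, datatype, None)         # typed in-line literal
    # In-line rdf:XMLLiteral. ASSUMPTION: simple serialization, NOT Exclusive Canonicalization.
    inner = (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )
    return (inner, RDF_XMLLITERAL, None)
```

For example, literal_object(ET.fromstring('<span plain="true">a<b>b</b>c</span>'), "fr") yields the plain literal "abc" with language "fr".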

Resource Objects

@href

@href gives URI for resource object

@nodeID

@nodeID gives blank node identifier for resource object

gensym

If no @href and no @nodeID then a blank node identifier is generated (associated with the element)
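A sketch of the three resource object rules, again with xml.etree.ElementTree. The names BlankNodes and resource_object are hypothetical. Two assumptions go beyond the text: @href takes precedence if an element carries both @href and @nodeID, and "associated with the element" is read as "the same element always yields the same generated identifier".

```python
import itertools
import xml.etree.ElementTree as ET


class BlankNodes:
    """Hands out one generated blank node identifier per element."""

    def __init__(self):
        self._ids = {}
        self._counter = itertools.count(1)

    def for_element(self, elem):
        # Reuse the identifier if this element has been seen before.
        if elem not in self._ids:
            self._ids[elem] = f"_:gen{next(self._counter)}"
        return self._ids[elem]


def resource_object(elem, bnodes):
    """Return ("uri", value) or ("bnode", identifier) for the element's resource object."""
    href = elem.get("href")
    if href is not None:
        return ("uri", href)                    # @href rule (assumed to win over @nodeID)
    node_id = elem.get("nodeID")
    if node_id is not None:
        return ("bnode", "_:" + node_id)        # @nodeID rule
    return ("bnode", bnodes.for_element(elem))  # gensym rule
```

A real implementation would also need to keep generated identifiers from clashing with author-supplied @nodeID values; this sketch does not attempt that.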

Predicate for Literal Objects

@property attribute

Gives explicit predicate for literal objects

Implicit predicate

If there is an @content, @datatype or @plain attribute, and no @property attribute then there is an implicit property for literal objects of xhtml2:refersToLiteral

Predicate for Resource Objects

@rel attribute

Gives explicit predicate for resource objects

@rev attribute

Gives explicit predicate for resource objects, but reversed

Implicit predicate

If there is an @href or @nodeID and an @about or @id attribute, and no @rel or @rev attribute then there is an implicit property for resource objects of xhtml2:refersToResource
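The predicate rules for literal and resource objects can be sketched together over a plain dictionary of attributes. The function names are hypothetical. Two points are assumptions: an element carrying both @rel and @rev is taken to yield one predicate for each, and the implicit literal predicate is triggered by the mere presence of @plain, following the wording of that rule.

```python
def literal_predicate(attrs):
    """Return the predicate for a literal object, or None if no literal triple applies."""
    if "property" in attrs:
        return attrs["property"]                     # explicit predicate
    if any(name in attrs for name in ("content", "datatype", "plain")):
        return "xhtml2:refersToLiteral"              # implicit predicate
    return None


def resource_predicates(attrs):
    """Return a list of (predicate, is_reversed) pairs for a resource object."""
    found = []
    if "rel" in attrs:
        found.append((attrs["rel"], False))          # explicit, forward
    if "rev" in attrs:
        found.append((attrs["rev"], True))           # explicit, reversed
    if found:
        return found
    has_object = "href" in attrs or "nodeID" in attrs
    has_subject = "about" in attrs or "id" in attrs
    if has_object and has_subject:
        return [("xhtml2:refersToResource", False)]  # implicit predicate
    return []                                        # e.g. @href alone: no predicate


print(resource_predicates({"href": "http://example.org/"}))  # []
```

The last line shows the case called out in the list of changes: an @href with no @about or @id does not trigger the default predicate.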

Subjects

@about

Gives URI for subject

@id

If no @about attribute, gives fragID for subject

Subject defined by Parent element

If child has no @about or @id, then subject comes from parent.

Subject from Parent element's explicit object

If parent has @href or @nodeID then that gives subject

Parent element's generated object

If parent has actually used a generated object, because it lacks an explicit object but does have an @rel or @rev, then the bnode with the generated id from the parent is used as the subject for the child element

Parent element's subject

If parent doesn't match previous cases, i.e. no @href or @nodeID or @rev or @rel, but does have an @about or an @id (but no @about) then the @about or @id is used as the subject of the child triple. This allows idioms like:

   <head about="">
      <link property="dc:creator" content="Jeremy Carroll"/>
   </head>

Gensym from parent

If parent does not match any of the above cases, i.e. no @href, @nodeID, @rev, @rel, @about or @id, then a gensym is used.
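The subject rules form an ordered cascade, which the following sketch tries to make explicit. The names make_gensym and subject_for are hypothetical. Assumptions not fixed by the text: an @id is rendered as the relative reference "#" plus the id, @href is tested before @nodeID on the parent, the final gensym is the one associated with the parent element (so sibling elements share a subject), and an element with no parent and no @about or @id is rejected because the rules above do not cover it.

```python
import xml.etree.ElementTree as ET


def make_gensym():
    """Return a function giving each element one stable generated blank node id."""
    ids = {}

    def gensym(elem):
        return ids.setdefault(elem, f"_:gen{len(ids) + 1}")

    return gensym


def subject_for(elem, parent, gensym):
    """Return ("uri", value) or ("bnode", identifier) for the subject of elem's triples."""
    if elem.get("about") is not None:
        return ("uri", elem.get("about"))            # @about
    if elem.get("id") is not None:
        return ("uri", "#" + elem.get("id"))         # @id, only when no @about
    if parent is None:
        raise ValueError("root element without @about or @id is not covered by the rules")
    # Subject defined by the parent element, in the order given in the text.
    if parent.get("href") is not None:
        return ("uri", parent.get("href"))           # parent's explicit object
    if parent.get("nodeID") is not None:
        return ("bnode", "_:" + parent.get("nodeID"))
    if parent.get("rel") is not None or parent.get("rev") is not None:
        return ("bnode", gensym(parent))             # parent's generated object
    if parent.get("about") is not None:
        return ("uri", parent.get("about"))          # parent's subject
    if parent.get("id") is not None:
        return ("uri", "#" + parent.get("id"))
    return ("bnode", gensym(parent))                 # gensym from parent (assumed shared)


head = ET.fromstring(
    '<head about=""><link property="dc:creator" content="Jeremy Carroll"/></head>'
)
print(subject_for(head[0], head, make_gensym()))  # ('uri', '')
```

Applied to the head and link idiom shown earlier, the link element takes the parent's @about value, the empty URI reference, as its subject.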