The core processing step (was: Ben's rules:-)

Dear all,

It seems that, at this point, we still have a problem with the basic
processing step in RDFa, the one I have always called "Ben's rules"[1]. The
issue discussed before was the exact role of @about in chaining
the subjects. Indeed, if one takes the rule as it stands today, then:

<p>This photo was taken by
<span about="http://www.ex.org" rel="dc:resource">Mark Birbeck</span>.</p>

yields

<http://www.ex.org> dc:resource <http://www.ex.org>
 
which is certainly not what we want!

An easy way out would be to take @about out of the processing steps
described in [1] (I think I proposed that at some point, too). That
would indeed generate something like:

<http://www.ex.org> dc:resource []

for the code above, which is of course a bit meaningless for this
example, but correct as far as the rules are concerned.

However, do not rejoice too quickly. If we go down this simple road,
then we do get a strange effect again with:

<div id="me" about="#me" instanceof="foaf:Person">
 <h1 property="foaf:name">Ivan Herman</h1>
 ...
</div>

indeed, this yields (again with the rules of [1])

[ foaf:name "Ivan Herman" ]

while, I presume, what we would like to get is:

<test.xhtml#me>
  a foaf:Person;
  foaf:name "Ivan Herman".

My conclusion is that [1] has to be rewritten in a somewhat more
complex manner to ensure the proper handling of instanceof. I played a
bit with pyRdfa, slightly modifying the core processing step in [2], and
I think I have something working properly now. The new version works
as expected for the examples in this mail; I have also run it
against all the accepted test cases and it makes no difference so far.

I attach a separate text that tries to describe the algorithm in
detail. It is, essentially, a paraphrase of [2]. To make it clearer,
I also drew a picture of the processing:

http://www.w3.org/2007/08/RdfaAlgo.svg
http://www.w3.org/2007/08/RdfaAlgo.png

As I said, it worked for me... But I may have missed something obvious!

Cheers

Ivan


[1]
http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Jul/0209.html
[2] http://dev.w3.org/2004/PythonLib-IH/pyRdfa/Parse.py


-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
The algorithm describes the processing step on a specific DOM node. For the sake of simplicity, the issues around proper namespace, xml:base and language handling are omitted in this description. The preprocessing step for handling <li> elements is also omitted.

When traversing the DOM tree, each node inherits an i_subject (subject in the RDF triple sense). While processing a node, an intermediate subject may also be generated (I will denote that new_subject). Finally, each child element is processed recursively, passing on a subject value that will be the inherited subject for that child.

The initial 'inherited' subject for the <html> element is the base URI.

The traversal happens as follows.

1. if the node does not have any of the attributes @about, @resource, @instanceof, @property, @rel, or @rev, then the children are processed passing on i_subject. That is it.

2. in case @about is present, the value of the i_subject is _overwritten_ with the value of @about

3. the @property attribute is handled (possibly with @content) to generate RDF triples with Literals, and i_subject as triple subject

4. if any of the @rel, @rev, or @resource attributes is present, then a new_subject value is calculated by taking the first value present, in priority order: @resource, @href, @src, and (for <object> elements) @data. If none of these is present, then a bnode is generated and used as the value of new_subject. Then:

	4a: if @rel is present, then the (i_subject,@rel,new_subject) triples are added to the graph (@rel may yield several property values)
	4b: if @rev is present, then the (new_subject,@rev,i_subject) triples are added to the graph (@rev may yield several property values)
	4c: the value of i_subject is set to new_subject
	
5. if @instanceof is present and its value is not equal to "", then the (i_subject,rdf:type,@instanceof) triples are added to the graph (again, @instanceof may yield several type values)

6. Unless @property is present _without_ @content, the child nodes are processed with i_subject as the inherited subject for each child. Otherwise processing returns.

7. Everybody is happy:-)
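
To make steps 1-6 easier to check against the examples in the mail, here is a minimal Python sketch of them. It is not the pyRdfa code in [2], only an illustration under several assumptions of mine: the input is parsed with the standard xml.etree.ElementTree module and carries no XML namespace; attribute values and CURIEs are kept as plain strings (no base URI or prefix resolution, so the subject comes out as "#me" rather than <test.xhtml#me>); multi-valued attributes are split on whitespace; and the names process, triples_from and new_bnode are hypothetical helpers. Namespace, xml:base, language and <li> handling are left out, as in the description above.

```python
import itertools
import xml.etree.ElementTree as ET

# Attributes that make a node "interesting" (step 1).
TRIGGERS = ("about", "resource", "instanceof", "property", "rel", "rev")
_bnode_ids = itertools.count()


def new_bnode():
    """Return a fresh blank node label (labelling scheme is an assumption)."""
    return "_:b%d" % next(_bnode_ids)


def process(node, i_subject, graph):
    """Apply steps 1-6 to one element; graph is a set of (s, p, o) tuples."""
    attrs = node.attrib

    # Step 1: nothing RDFa-related here, just pass i_subject down.
    if not any(name in attrs for name in TRIGGERS):
        for child in node:
            process(child, i_subject, graph)
        return

    # Step 2: @about overwrites the inherited subject.
    if "about" in attrs:
        i_subject = attrs["about"]

    # Step 3: @property generates literal triples; the literal is @content
    # if present, otherwise the text content of the element.
    if "property" in attrs:
        literal = attrs.get("content", "".join(node.itertext()))
        for prop in attrs["property"].split():
            graph.add((i_subject, prop, literal))

    # Step 4: compute new_subject when @rel, @rev or @resource is present.
    if any(name in attrs for name in ("rel", "rev", "resource")):
        candidates = ["resource", "href", "src"]
        if node.tag == "object":
            candidates.append("data")
        new_subject = next((attrs[c] for c in candidates if c in attrs), None)
        if new_subject is None:
            new_subject = new_bnode()
        for prop in attrs.get("rel", "").split():   # step 4a
            graph.add((i_subject, prop, new_subject))
        for prop in attrs.get("rev", "").split():   # step 4b
            graph.add((new_subject, prop, i_subject))
        i_subject = new_subject                     # step 4c

    # Step 5: a non-empty @instanceof generates rdf:type triples.
    for rdf_type in attrs.get("instanceof", "").split():
        graph.add((i_subject, "rdf:type", rdf_type))

    # Step 6: @property without @content consumes the children as a literal.
    if "property" in attrs and "content" not in attrs:
        return
    for child in node:
        process(child, i_subject, graph)


def triples_from(xml_text, base_uri):
    """Parse a fragment and return its triples; base_uri seeds i_subject."""
    graph = set()
    process(ET.fromstring(xml_text), base_uri, graph)
    return graph


if __name__ == "__main__":
    fragment = (
        '<div id="me" about="#me" instanceof="foaf:Person">'
        '<h1 property="foaf:name">Ivan Herman</h1></div>'
    )
    for triple in sorted(triples_from(fragment, "test.xhtml")):
        print(triple)
```

With this sketch the <div about="#me"> example from the mail gives both the rdf:type and the foaf:name triple on the same subject, and the "Mark Birbeck" example gives a single triple whose object is a fresh bnode, i.e. the <http://www.ex.org> dc:resource [] case.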

Received on Tuesday, 7 August 2007 14:45:04 UTC