RE: SYNTAX: RDF Syntax Telecon Friday

[ XSLT translation RDF/XML to ntriples is the real subject here ]

On Tue, 2 Oct 2001, Jeremy Carroll wrote:

>
> More substantive discussion.
>
> Art said:
> > I believe an automatic [golden] triple generator mechanism/program
> > could be the WG's most significant deliverable wrt RDF interop and
> > ultimately its long-term success.
>
> It is unclear whether DaveB's message sees the golden triple or a spec as
> the objective.
>
> I think that the spec is much more important than the golden triple.
> Any golden triple may be buggy; and is at some point, just another
> implementation.

I'd agree.

And now for something completely the same: an XSLT approach to ntriples
generation [aimed at the syntax mob, really].


The difference between XSLT validation and spitting out ntriples is
that the latter is harder, principally (and I've just begun sketching
this) because we're required to output appropriately unique
identifiers for anonymous nodes when producing RDF. The problem with a
single-step RDF/XML -> ntriples transformation is that it needs to
produce genids (effectively) for anonymous nodes as it goes; and since
XSLT is completely applicative, there seems to be no way to pass around
the small amount of state needed to do that, apart from the usual
obfuscating tricks that Prolog, etc. would use to chain state-modifying
operations together:

	% State_mid carries the state from the first sub-operation to the second.
	some_operation(Params, State_in, State_out) :-
		sub_operation(ParamSubset1, State_in, State_mid),
		sub_op_2(ParamSubset2, State_mid, State_out).
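
For readers who don't speak Prolog, the same state-threading trick can be
sketched in Python. This is only an illustration, not part of the proposed
XSLT: the names fresh_id and label_two_nodes, and the "_:aN" label format,
are assumptions made up for the example. Each function takes the current
counter and hands back the updated one; nothing is mutated.

```python
def fresh_id(counter_in):
    """Return a new anonymous-node label together with the updated counter."""
    counter_out = counter_in + 1
    return "_:a%d" % counter_out, counter_out


def label_two_nodes(counter_in):
    """Chain two id-generating steps by passing the counter through each."""
    first, counter_mid = fresh_id(counter_in)
    second, counter_out = fresh_id(counter_mid)
    return [first, second], counter_out


labels, final_counter = label_two_nodes(0)
print(labels, final_counter)
```

Every caller has to accept and return the counter, which is exactly the
plumbing that makes this style obfuscating in a larger transformation.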

I've just begun hacking around with an intermediate form that relies on
the fact that in RDF/XML, the only anonymous nodes that can be involved
in a statement are the most recent two encountered, modulo the pushing
and popping of subelements (i.e., it follows the nesting of elements).

More strongly, the most recent anon node encountered (in a top-down
reading of the RDF/XML) is normally the only anon node referred to, and
only as a subject; the only exception seems to be when it is the object
of a statement involving the previous anon node. That sounds odd, but
see below...

So we can use an (XMLish) intermediate form that resembles a stream of
ntriples interspersed with anonymous-node-identifier stack operations.
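
As a rough sketch of how such a stream might be emitted, here is a
recursive walk over parsed RDF/XML, written in Python rather than XSLT
purely to show the shape of the operations. Several things here are
assumptions: the function names, the tuple encoding of a "sentence", and
the http://example.org/foo# namespace (no namespace is given for foo:).
It treats every node element as anonymous and ignores rdf:about, rdf:ID,
rdf:resource and the rdf:type implied by a typed node.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the foo: prefix; the message does not give one.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:foo="http://example.org/foo#">
  <rdf:Description>
    <foo:name>First anon node</foo:name>
    <foo:bar>
      <foo:Baz>
        <foo:name>Innermost anon node</foo:name>
      </foo:Baz>
    </foo:bar>
  </rdf:Description>
</rdf:RDF>"""


def uri(tag):
    """Turn ElementTree's '{namespace}local' tag into '<namespacelocal>'."""
    return "<" + tag[1:].replace("}", "", 1) + ">"


def emit_node(node, ops, linking_property=None):
    """Append the operations for one node element, assumed anonymous."""
    ops.append("generate_and_push_anon_id")
    if linking_property is not None:
        # The enclosing node is now second on the stack.
        ops.append(("sentence", "top_but_one", linking_property, "top_anon_id"))
    for prop in node:
        nested = list(prop)
        if nested:
            # Property whose value is a nested anonymous node.
            emit_node(nested[0], ops, uri(prop.tag))
        else:
            # Property with a literal value.
            ops.append(("sentence", "top_anon_id", uri(prop.tag),
                        '"%s"' % prop.text))
    ops.append("pop_top_anon_id")


def rdfxml_to_ops(text):
    """Emit the operation stream for every top-level node element."""
    ops = []
    for node in ET.fromstring(text):
        emit_node(node, ops)
    return ops


for op in rdfxml_to_ops(DOC):
    print(op)
```

The walk never needs to know which identifier it is talking about; it only
ever refers to the top of the stack or the entry just below it.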

To illustrate:

<rdf:Description>
  <foo:name>First anon node</foo:name>
  <foo:bar>
     <foo:Baz>
        <foo:name>Innermost anon node</foo:name>
     </foo:Baz>
  </foo:bar>
  <foo:name2>Still first anon node</foo:name2>
</rdf:Description>
<rdf:Description>
  <foo:name>Third anon node</foo:name>
</rdf:Description>

could produce an intermediate form like this (suitably XMLised):

[operation list begins]

generate_and_push_anon_id	# stack: TOP-> _:a1

sentence ( top_anon_id, <foo:name>, "First anon node" )

generate_and_push_anon_id	# stack: TOP-> _:a2 _:a1

sentence ( top_but_one, <foo:bar>, top_anon_id )
sentence ( top_anon_id, <foo:name>, "Innermost anon node" )

pop_top_anon_id			# stack: TOP-> _:a1

sentence ( top_anon_id, <foo:name2>, "Still first anon node" )

pop_top_anon_id			# stack: TOP->
generate_and_push_anon_id	# stack: TOP-> _:a3

sentence ( top_anon_id, <foo:name>, "Third anon node" )

pop_top_anon_id			# stack: TOP->

[operation list ends]

which can (I hope) be passed to an XSLT ntriples-production
transformation; XSLT has enough functions to "fake", reasonably
efficiently, the state needed to handle this as a small language for
producing ntriples in a top-down manner.
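
To make the second step concrete, here is a minimal interpreter for the
operation list, sketched in Python and not a stand-in for the XSLT itself.
The tuple encoding of a "sentence", the name run_ops and the "_:aN" label
format are assumptions for the example, and the predicates are left as
<foo:name> etc. as written in the list, so the output is only ntriples-like.
Python keeps the stack and counter in mutable variables; an XSLT version
would presumably have to carry them as parameters of a recursive template.

```python
def run_ops(ops):
    """Interpret the operation list and return ntriples-like lines."""
    stack = []      # anonymous-node identifiers currently in scope
    counter = 0     # number of identifiers generated so far
    lines = []

    def resolve(term):
        # Stack references become the identifier they currently denote.
        if term == "top_anon_id":
            return stack[-1]
        if term == "top_but_one":
            return stack[-2]
        return term

    for op in ops:
        if op == "generate_and_push_anon_id":
            counter += 1
            stack.append("_:a%d" % counter)
        elif op == "pop_top_anon_id":
            stack.pop()
        else:
            _, subject, predicate, obj = op
            lines.append("%s %s %s ." % (resolve(subject), predicate, resolve(obj)))
    return lines


OPS = [
    "generate_and_push_anon_id",
    ("sentence", "top_anon_id", "<foo:name>", '"First anon node"'),
    "generate_and_push_anon_id",
    ("sentence", "top_but_one", "<foo:bar>", "top_anon_id"),
    ("sentence", "top_anon_id", "<foo:name>", '"Innermost anon node"'),
    "pop_top_anon_id",
    ("sentence", "top_anon_id", "<foo:name2>", '"Still first anon node"'),
    "pop_top_anon_id",
    "generate_and_push_anon_id",
    ("sentence", "top_anon_id", "<foo:name>", '"Third anon node"'),
    "pop_top_anon_id",
]

for line in run_ops(OPS):
    print(line)
```

Only the top two stack entries are ever read, which is what keeps the
amount of state small enough to fake.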

The other problem I've got is that XSLT is strongly write-only. It _is_
a W3C technology, but it's a royal PITA to come back to it after, I
dunno, 5 minutes and try to figure out what you were up to :-(

jan

-- 
jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
Tel +44(0)117 9287088 Fax +44 (0)117 9287112 RFC822 jan.grant@bris.ac.uk
Generalisation is never appropriate.

Received on Tuesday, 2 October 2001 10:16:57 UTC