Re: RDF, XML, XSLT: Grit from Niklas Lindström on 2010-01-19 (semantic-web@w3.org from January 2010)

From: Niklas Lindström <lindstream@gmail.com>
Date: Tue, 19 Jan 2010 01:23:15 +0100
To: Reto Bachmann-Gmuer <reto.bachmann@trialox.org>
Cc: Semantic Web <semantic-web@w3.org>, Linked Data community <public-lod@w3.org>
Message-ID: <cf8107641001181623w2f325429n70881c5aa584bc8b@mail.gmail.com>

Hi Reto!

I have looked at some RDF/XML normalizers before, but in all honesty I
haven't tested R3X-transform very extensively. I have taken some care
to ensure that the Grit XSLT handles very "raw" RDF/XML -- in my
case using the "non-pretty" serializer of Sesame (on a graph
consisting of multiple RDFS and OWL vocabularies). I have also tested
it on the output of RDFLib. What remains is to try it out extensively
on all kinds of RDF in the wild, such as SPARQL CONSTRUCT results and
feeds of RDF/XML (see also about Atom below). The relative merits have
yet to be assessed.

In terms of features, it certainly groups by subject, normalizes type
statements etc. Predictability is paramount. The current XSLT
takes some rudimentary care of e.g. @xml:base, but not fully.
Basically, if the input consistently uses relative references, the
result will be predictable. This could certainly be improved further,
though (see the source [1] for details).


= Some Details =

Grit is a different format. This makes it clear that any XSLT (or
other code) which uses the output isn't intended to transform RDF/XML.
AFAIK, there is no way -- apart from inventing some hint -- to
determine whether an RDF/XML document has been "cleaned up".

I've strived to make transforming Grit as easy as (reasonably)
possible. This was part of the reason for it -- I needed a way to
make a normalized "linked tree" from a graph, and it seemed just as
well to take it beyond RDF/XML entirely. (As mentioned, even if it
never becomes an established serialization, it could always at least
be GRDDL:ed "back".)

Even if "normalizing" (for some definition of "normalized") RDF/XML
makes it somewhat easier to transform, I opted to make some
expressions more compact or easier to handle. Since XPath uses the
namespace context, using elements for *all* types of a resource
(within an <a> element) makes it simpler to gather all things of a
certain type (by matching on the qname). Grit is definitely all for
namespaces for properties and types (in comparison to e.g. TRiX, RXR,
etc.).
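
As a rough illustration of the type matching described above, here is
a minimal Python sketch. The sample document, the ex: namespace and
the helper name resources_of_type are assumptions made up for this
example; the element layout is only inferred from the description in
this mail and is not a verified sample of real Grit output.

```python
import xml.etree.ElementTree as ET

# Hypothetical Grit document: layout inferred from the prose, not
# copied from actual Grit output.
GRIT = """\
<graph xmlns:ex="http://example.org/ns#">
  <resource uri="http://example.org/a">
    <a><ex:Thing/></a>
    <ex:label xml:lang="en">First</ex:label>
  </resource>
  <resource uri="http://example.org/b">
    <a><ex:Other/><ex:Thing/></a>
  </resource>
  <resource uri="http://example.org/c">
    <a><ex:Other/></a>
  </resource>
</graph>
"""

NS = {"ex": "http://example.org/ns#"}


def resources_of_type(root, qname):
    """Return the @uri of each top-level resource typed with qname.

    Every type is an element inside <a>, so a plain path match on the
    qualified name is enough to select resources by type.
    """
    return [
        res.get("uri")
        for res in root.findall("resource")
        if res.find("a/" + qname, NS) is not None
    ]


root = ET.fromstring(GRIT)
thing_uris = resources_of_type(root, "ex:Thing")
print(thing_uris)
```

In XSLT the same selection would presumably be a single match pattern
on the type element; the Python version only shows that no rdf:type
string comparison is needed.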

Grit does a similar thing for datatypes -- that is, it wraps the value
in an element corresponding to the datatype. I'm not entirely set on
the current @fmt mechanism, which marks literals that have a datatype
or are XML literals. I just felt it made sense to "tag" such literals
with an attribute, meaning the contents should be processed as a value
(in part since @xml:lang is used for language literals). An
alternative would be to use e.g. <dt> and <xml> element wrappers, but
I'm staying with @fmt until otherwise convinced. ;)
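
To make the datatype wrapping concrete, here is a minimal Python
sketch of reading such literals. The attribute value fmt="datatype",
the ex: properties and the function read_literal are assumptions for
illustration only; the actual @fmt values used by Grit are not
confirmed by this mail.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical Grit fragment; fmt="datatype" is an assumed value.
GRIT = """\
<resource xmlns:ex="http://example.org/ns#"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          uri="http://example.org/a">
  <ex:count fmt="datatype"><xsd:integer>42</xsd:integer></ex:count>
  <ex:label xml:lang="en">Hello</ex:label>
  <ex:name>plain</ex:name>
</resource>
"""


def read_literal(prop):
    """Return (lexical value, datatype URI or None, language or None)."""
    if prop.get("fmt") == "datatype":
        wrapper = prop[0]  # the element named after the datatype
        # ElementTree tags look like "{namespace}local"; joining the two
        # parts gives the datatype URI.
        namespace, local = wrapper.tag[1:].split("}")
        return (wrapper.text, namespace + local, None)
    return (prop.text, None, prop.get(XML_LANG))


resource = ET.fromstring(GRIT)
literals = [read_literal(prop) for prop in resource]
print(literals)
```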

.. I did, at one time, consider gratuitously just "overloading"
RDF/XML with e.g.:

    <rdf:type rdf:parseType="Reference"><ex:Thing/></rdf:type>
    <ex:value rdf:parseType="Data"><xsd:string>datatyped 1</xsd:string></ex:value>

, but I felt that that would be, well, gratuitous.. (and gritty in a
rather bad way..)

Lists in Grit use null-namespaced <li> elements, corresponding to the
content of regular <resource> elements (either carrying @ref
attributes or representing nested bnodes). Again, my gut feeling is
that this is intuitive enough. (This, of course, is the
@parseType="Collection" alternative.)
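
A minimal Python sketch of walking such a list follows. The fragment,
the ex:members property and the function list_items are hypothetical;
only the null-namespaced <li> elements and the @ref attribute come
from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical Grit fragment with a list-valued property.
GRIT = """\
<resource xmlns:ex="http://example.org/ns#" uri="http://example.org/a">
  <ex:members>
    <li ref="http://example.org/b"/>
    <li><ex:name>anonymous</ex:name></li>
    <li ref="http://example.org/c"/>
  </ex:members>
</resource>
"""

NS = {"ex": "http://example.org/ns#"}


def list_items(prop):
    """Return the @ref of each <li> in document order.

    A nested bnode item has no @ref and is reported as None.
    """
    return [li.get("ref") for li in prop.findall("li")]


resource = ET.fromstring(GRIT)
items = list_items(resource.find("ex:members", NS))
print(items)
```

Document order is the list order here, so there is no rdf:first and
rdf:rest chain to follow.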

Grit *disallows* using bnodes as *objects* of multiple statements
(mainly because I find bnodes used in this manner quite annoying..).
All properties of object ("singly linked") bnodes are just put as
elements within the element corresponding to the property of that
statement (in the @parseType="Resource"-style). Non-linked bnodes are
put as top-level <resource> elements sans @uri.

(Well, the current XSLT actually *copies* the descriptions of bnodes
used more than once. :P This is something I need to hammer out
edge cases for. Regarding a bnode used as multiple objects, I'm not so
sure, but of course I'll rework this somehow if a real need arises
(e.g. by actually adding @id and @idref..).)
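
To check whether a given graph runs into this restriction, one could
look for bnodes that occur as the object of more than one statement.
The sketch below is not part of the Grit XSLT; it assumes triples are
plain string tuples with bnodes written as "_:label" (the N-Triples
convention), and the function name is made up for this example.

```python
from collections import Counter


def shared_bnode_objects(triples):
    """Return the bnode labels used as object in more than one triple.

    Simplification: any object string starting with "_:" is treated as
    a bnode, so literals are assumed never to start with that prefix.
    """
    counts = Counter(o for _s, _p, o in triples if o.startswith("_:"))
    return sorted(label for label, n in counts.items() if n > 1)


TRIPLES = [
    ("http://example.org/a", "http://example.org/ns#knows", "_:b1"),
    ("http://example.org/c", "http://example.org/ns#knows", "_:b1"),
    ("http://example.org/a", "http://example.org/ns#address", "_:b2"),
    ("_:b2", "http://example.org/ns#city", "Stockholm"),
]

shared = shared_bnode_objects(TRIPLES)
print(shared)
```

Here _:b2 is "singly linked" and could be nested inside its property
element, while _:b1 is the problematic case.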


= Graphs, Atom =

Grit can certainly be extended to allow for @uri in the root <graph>
element, or even allow nested <graph> elements (which of course would
make transforming it a bit more involved). Naturally, <resource> could
also be allowed as a root element (just as Atom entry documents work).

I do think the format is a bit reminiscent of Atom, albeit there are
important differences (some mentioned at [2]).

(If I may aim high, I'd even say that if such graph elements would
carry, say, <created> and <updated> .., we'd have timestamped named
graph support. Which incidentally is what I think that Atom entries
may "actually" represent.. *Describing* those graphs in detail could
be done with either <resource> elements within them
("self::resource/@uri = parent::graph/@uri"), or as property elements
directly in the <graph>.. I haven't iterated on these ideas much yet
though. Of course, striving to replace Atom *as well* as a data format
may be just a bit too bold; but, well, that's what grit is for.. ;D)
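
The "self::resource/@uri = parent::graph/@uri" condition can be
sketched in Python as follows. This is purely speculative: @uri on
<graph> is only a possible extension, and the sample document, the
ex:updated property and the function graph_self_descriptions are
assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical extended Grit: a named graph that describes itself.
GRIT = """\
<graph uri="http://example.org/graph/1" xmlns:ex="http://example.org/ns#">
  <resource uri="http://example.org/graph/1">
    <ex:updated>2010-01-19T01:23:15+01:00</ex:updated>
  </resource>
  <resource uri="http://example.org/a">
    <ex:name>A</ex:name>
  </resource>
</graph>
"""


def graph_self_descriptions(graph):
    """Return the child resources whose @uri equals the graph's @uri."""
    graph_uri = graph.get("uri")
    if graph_uri is None:  # unnamed graph: nothing can describe it
        return []
    return [r for r in graph.findall("resource") if r.get("uri") == graph_uri]


graph = ET.fromstring(GRIT)
described = graph_self_descriptions(graph)
print([r.get("uri") for r in described])
```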


= Further =

Grit is very much a young work in progress. I'd be happy to
collaborate on it if others see the same potential as I do.

It is really a very simple thing, and certainly should pay homage to
the other existing alternatives. The point is that stepping this short
step away from RDF/XML, to some extent closer to what Atom *is used
for*, makes it IMHO look like that "missing format", being true (to)
RDF while at the same time quite usable as a dead simple "Linked Data
XML", for those caring less for the merits of RDF.. (But still
accepting the XML tax*, of course.)

With all this said, even if no one else finds these higher goals
promising in this form, I hope Grit is still very usable as an
instrument for XSLT:ing RDF.

Best regards,
Niklas

[1]: http://purl.org/oort/impl/xslt/grit/rdfxml-grit.xslt
[2]: http://code.google.com/p/oort/wiki/Grit?ts=1263844538&updated=Grit#Atom

* = For JSON users, see [3] and my take at [4] instead:

[3]: http://code.google.com/p/linked-data-api/
[4]: http://code.google.com/p/oort/wiki/Gluon


2010/1/18 Reto Bachmann-Gmuer <reto.bachmann@trialox.org>:
> I'm wondering what the advantages of grit are when compared with
> simple subsets of RDF/XML that can be used for XSLT transformation,
> e.g. Morten's R3X [1].
>
> Cheers,
> reto
>
> 1. http://www.wasab.dk/morten/blog/archives/2004/05/30/transforming-rdfxml-with-xslt
> (just to allow XSLT the special RSS 1.0 handling can be ignored)
Received on Tuesday, 19 January 2010 00:24:10 UTC