Re: Atom/RDF XSLT and GRDDL from David Powell on 2007-08-23 (public-grddl-comments@w3.org from July to September 2007)

From: David Powell <djpowell@djpowell.net>
Date: Thu, 23 Aug 2007 13:17:49 +0100
To: "Danny Ayers" <danny.ayers@gmail.com>
CC: webmaster@kanzaki.cc, "Story Henry" <henry.story@bblfish.net>, "Dave Beckett" <dave@dajobe.org>, <public-grddl-comments@w3.org>
Message-ID: <208674577.20070823131749@djpowell.net>

Wednesday, August 22, 2007, 3:31:05 PM, you wrote:

> [cc'ing public-grddl-comments@w3.org]

> Hi,

> The GRDDL Working Group [1] is looking for an XSLT transformation
> which will convert Atom format to RDF/XML. It's hoped that if a
> suitable one is found/written, the Atom WG will be willing for it to
> be associated with the Atom namespace document, enabling GRDDL-aware
> agents to automatically interpret Atom data as RDF. Please note that
> this hasn't yet been raised with atompub, it was felt better to have
> the XSLT available first (along with anything else that might improve
> the case, like tests).

Does atompub actually exist anymore? I know the mailing list is still
active, but do IETF WGs effectively dissolve after they've published
their work? Obviously it would be good to discuss things with them,
but can they actually say anything definitive?

> The current intention is for the XSLT file to be hosted on a W3C
> server, available under the W3C license [2].

> The GRDDL WG would very much appreciate your opinions on this matter, e.g.
> * What criteria do you believe the XSLT should fulfil?
> * Are any of these existing versions suitable, or do we need a new one?
> * Do you have any suggestions for tests etc?

> (Dave, I don't know if you've done an XSLT, but your work on Atom &
> RSS 1.0 around Raptor suggests your opinion would be valuable).

I guess before we get carried away with ourselves, we should be asking
what we want from the vocabulary rather than what we want from the
transform.

Personally, I have the following 'wants':

 * Round-trippable.  Obviously minor differences are allowed, but
   nothing semantically significant should be lost.

 * Full support for extension elements.  Simple vs Structured
   Extension Elements were designed to support RDF, but what we ended
   up with doesn't really help anyone, so I wouldn't pay much
   attention to the difference between the two.

 * Triples shouldn't smush into mush when two feed documents polled at
   different times are combined.

   I think this is fairly important: feeds are all about changing
   data, so it is useful to support merging of feed documents (RSS 1.0
   doesn't).

 * No expectation that the software should have to perform any kind of
   general purpose inference.

 * Dual OWL/RDFS schema. Is it just me that prefers RDFS? I'm
   more of a fan of RDF the lightweight data model than RDF the
   heavyweight stack.

   I attempted this with my vocab, but it hasn't been maintained and
   is probably broken now. I've never really seen it done anywhere -
   but I can't see why it shouldn't work.

 * The vocabulary's design decisions should all be justified. I've
   been meaning to do this for my vocabulary, but haven't got around
   to it.

 * It should be possible to represent RSS feeds using the same
   vocabulary.  We don't need to think about the transform right now,
   and they may require supplementary terms, but it should be doable.
   I have added support for major RSS versions to my transform,
   although they may still require a bit of work.

 * I'd prefer an XSLT 1.0 solution.  XSLT 2.0 doesn't seem as widely
   deployed yet.
   
> Henry, I believe your current version [3] requires XSLT 2.0 - for
> GRDDL purposes this limits its applicability. David, I couldn't find
> your latest version...[4] was nearest.

I haven't played with it for a while. I moved to working on
http://djpowell.net/schemas/treetriples/1/ because it seems a more
irritating problem.

I thought that http://djpowell.net/atomrdf/ was the latest, but
perhaps it isn't.  I'll check that out.

My transform has some weirdness in it: a) it lets you plug in rules
for processing extensions, which probably isn't required for GRDDL,
and b) it has RSS stuff in there which isn't required.

> So Masahide, yours [5] currently looks most promising. It would be
> helpful if everyone could confirm their licensing situation.

> An open question is the target vocabularies to which the XSLT should
> translate. (If I remember correctly, Henry's is a new Atom-specific
> vocab, David's mostly RSS 1.0 based, Masahide's RSS 1.0 augmented with
> Atom-specific terms).

Mine is a new Atom-specific vocab, although there is a supplemental
vocab containing RSS compatibility terms.

> Are the differences between Atom & RSS 1.0 such that some/all of the
> latter would be too much of a compromise? Or are some/all of the terms
> near enough that the value of term reuse more than compensates for
> minor differences?

I don't think that we should use the RSS 1.0 vocab.  Too much
information would be lost, and I don't like giving inherently
changeable values, like entries, a fixed URL, because that makes the
data un-smush-able.
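
To make the merging problem concrete, here is a minimal sketch in
Python. It is only an illustration under stated assumptions: the
property names (ex:title, ex:updated, ex:id), the example.org URL and
the snapshot-scoped node labels are all hypothetical, and are not
taken from any of the vocabularies under discussion. It models a graph
as a plain set of triples and compares naming an entry by one fixed
URL against giving each polled state of the entry its own node.

```python
# Hypothetical terms and URL, for illustration only.
TITLE = "ex:title"
UPDATED = "ex:updated"
ENTRY = "http://example.org/entry/1"


def fixed_uri_triples(entry):
    """Describe the entry as a single node named by its fixed URL."""
    s = entry["id"]
    return {(s, TITLE, entry["title"]), (s, UPDATED, entry["updated"])}


def per_version_triples(entry):
    """Describe one polled state of the entry as its own node.

    The label below stands in for a node that is distinct per snapshot
    (an assumption of this sketch, not a feature of any real vocab).
    """
    s = "_:" + entry["id"] + "@" + entry["updated"]
    return {
        (s, "ex:id", entry["id"]),
        (s, TITLE, entry["title"]),
        (s, UPDATED, entry["updated"]),
    }


def values(graph, subject, predicate):
    """Return every object attached to subject via predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


# The same entry as seen in two polls of the feed.
poll1 = {"id": ENTRY, "title": "Draft", "updated": "2007-08-22T15:31:05Z"}
poll2 = {"id": ENTRY, "title": "Final", "updated": "2007-08-23T13:17:49Z"}

# Fixed URL: the union leaves one node carrying two titles and two
# dates, with nothing to say which title goes with which date.
merged_fixed = fixed_uri_triples(poll1) | fixed_uri_triples(poll2)
print(sorted(values(merged_fixed, ENTRY, TITLE)))

# Per-version nodes: the union keeps two separate states, each with
# exactly one title and one date.
merged_versioned = per_version_triples(poll1) | per_version_triples(poll2)
subjects = {s for (s, _, _) in merged_versioned}
for s in sorted(subjects):
    print(s, sorted(values(merged_versioned, s, TITLE)))
```

In the fixed-URL case nothing is lost by the union, but the pairing
between each title and its date is gone, which is the "mush". Keeping
one node per polled state preserves that pairing, at the cost of
needing a property such as the hypothetical ex:id to tie the states
back to the same entry.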


Back to the point about justification for all of the design decisions.
I think that this would be a good way forward.  We have several vocabs
at the moment.  The actual terms and URLs are largely irrelevant.  It
would be easier to compare them if we had a bullet point list of
design decisions for each of them.  Once we've got the vocab sorted,
we can think about the implementation of the XSLT.


-- 
Dave
Received on Thursday, 23 August 2007 12:18:16 UTC