- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Thu, 3 Feb 2005 18:12:10 +0100
- To: www-rdf-interest@w3.org
I sent the mail below to a list set up by folks working on common libs for syndication in Java a few days ago. Following discussion in the "missing bit of RDF for XML people" thread, I reckon this is now close enough to topic to avoid crosspost annoyance ;-)

---------- Forwarded message ----------
From: Danny Ayers <danny.ayers@gmail.com>
Date: Mon, 31 Jan 2005 11:01:26 +0100
Subject: Syndication model convergence and RDF
To: java-syndication@yahoogroups.com
Cc: atom-owl@googlegroups.com

...

It's really good to see a place for dialog between the groups and individuals working on syndication/Java. A common object representation of feeds seems a worthy goal, as do interoperable interfaces.

I've been wondering about the same kind of issues through RDF-tinted spectacles, and it seems to me there is also a lot of potential for unification of the diverse formats through mapping to, and modelling with, Semantic Web technologies. Such a mapping/modelling could be entirely compatible with and complementary to the Java work.

The way I imagine it could work would be to define a top-level (but fairly loose) RDF Schema/OWL model of syndication data giving the relationships between the key constructs: feedlists (channels/blogrolls), entries, content and associated metadata. RDF "profiles" could then be derived for the various formats, relating constructs specific to individual languages (RSS 1.0, 1.1, 0.91, 2.0, Atom) to the top-level view and by inference to each other. The general approach would, I'd suggest, be one of looking at what's already out there and pulling it together in a cohesive way, rather than building anything from scratch.

Before going any further I should say that I wouldn't expect a total 'one rule to bind them' approach. Just a single common base-level model of structures, which could potentially enable transparent interop between syndication and RDF systems.
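To make the "profiles" idea a little more concrete, here is a minimal sketch in plain Python (no RDF library) of how format-specific terms could be tied to a top-level model and then related to each other by inference. All of the example.org URIs, the term names and the choice of which constructs map to which are assumptions for illustration only; they do not come from any published schema. The two rules applied are the standard RDFS entailment rules for rdfs:subPropertyOf (rdfs7) and rdfs:subClassOf (rdfs9).

```python
# Hypothetical namespaces, for illustration only.
SYN = "http://example.org/syndication#"   # assumed top-level model
RSS2 = "http://example.org/rss20#"        # assumed RSS 2.0 profile
ATOM = "http://example.org/atom#"         # assumed Atom profile

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

# The "profiles": each format's constructs are related to the top-level view.
PROFILES = {
    (RSS2 + "item", SUBCLASS, SYN + "Entry"),
    (ATOM + "entry", SUBCLASS, SYN + "Entry"),
    (RSS2 + "description", SUBPROP, SYN + "content"),
    (ATOM + "summary", SUBPROP, SYN + "content"),
}

# Instance data as it might come out of two different feed formats.
DATA = {
    ("http://example.org/a#1", TYPE, RSS2 + "item"),
    ("http://example.org/a#1", RSS2 + "description", "Hello from RSS"),
    ("http://example.org/b#1", TYPE, ATOM + "entry"),
    ("http://example.org/b#1", ATOM + "summary", "Hello from Atom"),
}


def infer(triples):
    """Apply rdfs7 and rdfs9 repeatedly until no new triples appear."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                # rdfs9: (x type C) and (C subClassOf D) gives (x type D)
                if p == TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, TYPE, o2))
                # rdfs7: (x p y) and (p subPropertyOf q) gives (x q y)
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
        if new <= closed:
            return closed
        closed |= new


graph = infer(PROFILES | DATA)

# Query against the top-level model only; both formats answer.
entries = sorted(s for s, p, o in graph if p == TYPE and o == SYN + "Entry")
contents = sorted(o for s, p, o in graph if p == SYN + "content")
print(entries)
print(contents)
```

A consumer written against the top-level terms never needs to know which format a given entry came from, which is the kind of interop the loose model is meant to buy. A real implementation would more likely hand this job to an RDF toolkit's reasoner than to a hand-rolled loop.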
There are aspects which I would expect to remain out of scope in development of a common model - in particular modelling the /engines/ of syndication: HTTP serving/caching and the polling subsystem. Work that has been done on versioning of entries (particularly by Henry, see below) suggests that this may be something best left until after a core model is in place. Provenance more sophisticated than dc:source also brings with it complications that may place it beyond 80/20 requirements for RDF-oriented interop.

There have been recent developments which I reckon bring the idea of a common RDF view of syndication within relatively easy reach, notably the work around the Atom community to develop a normative mapping for Atom feed data to an RDF/OWL model.

So what have we got so far? Pretty near everything required, only it's scattered about the Web. (I believe Kevin has done work in this area with NewsMonster, but I must confess I never looked closely at what he'd done there.) Elsewhere:

One direct practical technique is to use XSLT for syntax-based mapping from syndication format to RDF/XML. Stylesheets have been done to normalise "any" feed format to RSS 1.0 (e.g. Morten Frederiksen's at [1]). A new Atom-specific translation has just been created [2]. A while ago I suggested [3] an approach to using XSLT to disambiguate RSS 2.0 information (in particular extensions). With slightly different implementation details, the GRDDL (Gleaning Resource Descriptions from Dialects of Languages) [4] technique offers a more generally useful approach, and has W3C blessing. But the results of such transformations are only minimally defined (resources, properties, literals) without a schema/ontology.

A similar kind of mapping is implied by the Redland/Raptor toolkit, which can parse Atom/RSS/tag soup into RDF model(s). I think similar input stages may also be available for Jena (I've done it with Jena myself, only pre-processing with XSLT).
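As a rough illustration of what a syntax-based mapping produces, the sketch below does in Python what the stylesheets above do in XSLT: it walks an RSS 2.0 document and emits (subject, property, value) triples. It is a stand-in for the XSLT approach, not a port of any of the stylesheets cited. The sample feed, the function name and the choice of target properties (RSS 1.0 title/link and dc:date) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RSS_NS = "http://purl.org/rss/1.0/"          # RSS 1.0 namespace
DC_NS = "http://purl.org/dc/elements/1.1/"   # Dublin Core elements

# Made-up sample feed.
FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.org/</link>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
      <pubDate>Mon, 31 Jan 2005 11:01:26 +0100</pubDate>
    </item>
  </channel>
</rss>"""

# RSS 2.0 element name -> RDF property URI. This table is an assumption;
# a real mapping would cover more elements and any extension namespaces.
ITEM_MAP = {
    "title": RSS_NS + "title",
    "link": RSS_NS + "link",
    "pubDate": DC_NS + "date",
}


def rss2_to_triples(xml_text):
    """Map each RSS 2.0 <item> to triples, using its <link> as the subject."""
    root = ET.fromstring(xml_text)
    triples = []
    for item in root.iter("item"):
        subject = item.findtext("link")
        if not subject:
            continue  # no URI to name the item; a blank node would be needed
        triples.append((subject, RDF_TYPE, RSS_NS + "item"))
        for child in item:
            prop = ITEM_MAP.get(child.tag)
            if prop and child.text:
                # Values are copied as plain literals; the date is not
                # converted from RFC 822 to W3CDTF here.
                triples.append((subject, prop, child.text.strip()))
    return triples


triples = rss2_to_triples(FEED)
for t in triples:
    print(t)
```

The output is exactly the "minimally defined" kind of result described above: resources, properties and literals, with nothing yet saying how rss:title relates to the equivalent Atom construct. That gap is what a schema/ontology would fill.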
There has been a basic schema for RSS 1.0 all along, but this has been tidied up considerably for RSS 1.1 [5], and I believe the intention there is to provide an OWL DL model.

I started some work on Atom/OWL [6] then got distracted by day-job, but Henry Story picked it up from there and worked through a lot of possible Atom/OWL models. His motivation, at first at least, was a model to use with the Java BlogEd [7] authoring/posting application. There's a version of Henry's OWL ontology for Atom on the Wiki [8]; his implementation work (in progress) can be found on a blog at [9].

I should mention the current feedlist/channels/blogroll representation - OPML seems to be the de facto standard, though it is another essentially incompatible format. The Technorati folk favour XOXO [10] (they also have Attention.xml, which also represents data in this domain). OCS offers one RDF-based model, FOAF blogrolls [11] another (there are XSLTs between some of these, only I've got tired of finding the links ;-)

At the same time it would be useful to try and take a common approach to other bits of modelling that haven't quite yet congealed - the del.icio.us/Flickr/Technorati tags for example.

How might this tie in with the Java work, and what would be the benefits? In the first place it would help with translations between formats, providing a sound formal base. Syndication feeds of all kinds could be arbitrarily rich and still be comprehensible by off-the-shelf APIs. Module/vocab authors would be able to get cross-format compatibility from day one. RDF systems (with a little translation) could compatibly consume and produce feed data.

What's needed? Somewhere to talk about this stuff - Henry's set up an Atom/OWL list at [12] and material has gone onto the Atom and ESW Wikis. What's also needed is some form of coordination point - this stuff is pretty well spread about the place at present - I'd be happy to host a Wiki or whatever if needed.
Deliverables I'd say include the top-level model, individual format mappings (especially XSLT), tests (especially a validator).

Cheers,
Danny.

[0] http://dannyayers.com/archives/2005/01/31/syndication-model-convergence-and-rdf/
[1] http://purl.org/net/syndication/subscribe/feed-rss1.0.xsl
[2] http://www.imc.org/atom-syntax/mail-archive/msg12615.html
[3] http://www.xml.com/pub/a/2003/07/23/extendingrss.html
[4] http://www.w3.org/2004/01/rdxh/spec
[5] http://inamidst.com/rss1.1/
[6] http://semtext.org/atom/
[7] https://bloged.dev.java.net/
[8] http://www.intertwingly.net/wiki/pie/AtomOWL
[9] http://bblfish.net/work/atom-owl/2004-08-12/blogexample.html
[10] http://developers.technorati.com/wiki/xhtmloutlines
[11] http://www-106.ibm.com/developerworks/xml/library/x-pblog/
[12] http://groups-beta.google.com/group/atom-owl

--
http://dannyayers.com
Received on Thursday, 3 February 2005 17:12:11 UTC