
Re: RDF Update Feeds

From: Yves Raimond <yves.raimond@gmail.com>
Date: Fri, 20 Nov 2009 14:46:20 +0000
Message-ID: <82593ac00911200646r4ba39797m5246f630cb4eac5@mail.gmail.com>
To: Niklas Lindström <lindstream@gmail.com>
Cc: nathan@webr3.org, Georgi Kobilarov <georgi.kobilarov@gmx.de>, public-lod@w3.org

Back in April, we had a similar discussion:


Concretely, we are facing exactly the same problem when syncing up
aggregations of BBC RDF data (Talis's and OpenLink's), as our data
changes *a lot*.

Right now, we're thinking about a really simple feed, detailing a)
whether a change event is a delete, an update or a create, and b) what
thing has changed. That's a start, but should be enough to sync up with our

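A minimal sketch of such a change feed, in Python; the event kinds are the three named above, but the field names and resource URIs are purely illustrative, not a proposed format:

```python
import json
from datetime import datetime, timezone

def make_event(kind, resource_uri, when=None):
    """Build one change-feed event: what changed, and how.

    `kind` is one of the three event types described above; the
    dict layout is an illustrative assumption, not a standard.
    """
    assert kind in ("create", "update", "delete")
    when = when or datetime.now(timezone.utc)
    return {
        "type": kind,
        "resource": resource_uri,
        "updated": when.isoformat(),
    }

# A hypothetical two-event feed page, serialized as JSON.
feed = [
    make_event("create", "http://example.org/things/a"),
    make_event("delete", "http://example.org/things/b"),
]
print(json.dumps(feed, indent=2))
```

A consumer would poll (or subscribe to) this feed and re-fetch only the resources listed in it, rather than re-crawling the whole dataset.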

2009/11/18 Niklas Lindström <lindstream@gmail.com>:
> Hi Nathan!
> 2009/11/17 Nathan <nathan@webr3.org>:
>> very short non-detailed reply from me!
> I appreciate it.
>> pub/sub, atom feeds, RDF over XMPP were my initial thoughts on the
>> matter last week - essentially triple (update/publish) streams on a
>> pub/sub basis, decentralized suitably, [snip]
>> then my thoughts switched to the fact that RDF is not XML (or any other
>> serialized format) so to keep it non limited I guess the concept would
>> need to be specified first then implemented in whatever formats/ways
>> people saw fit, as has been the case with RDF.
> I agree that the concept should really be format-independent. But I
> think it has to be pragmatic and operation-oriented, to avoid "never
> getting there".
> Atom (feed paging and archiving) is basically designed with exactly
> this in mind, and it scaled to my use-cases (resources with multiple
> representations, plus opt. "attachments"), while still being simple
> enough to work for "just RDF updates". The missing piece is the
> deleted-entry/tombstone, for which there is thankfully at least an
> I-D.
> Therefore, modelling the approach around these possibilities required a
> minimum of invention (none really, just some wording to describe the
> practice), and it seems suited for a wide range of dataset syndication
> scenarios (not so much real-time, where XMPP may be relevant).
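As a sketch of that combination, here is a small Python reader for one page of an Atom feed carrying both ordinary entries and deleted-entry tombstones. The `at:` namespace URI is the one from the tombstone Internet-Draft mentioned above (later published as RFC 6721); the feed content and graph URIs are made up:

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Namespace from the deleted-entry Internet-Draft (later RFC 6721);
# treat its exact value as an assumption of this sketch.
AT = "http://purl.org/atompub/tombstones/1.0"

# An illustrative feed page: one updated document, one tombstone.
FEED = f"""<feed xmlns="{ATOM}" xmlns:at="{AT}">
  <title>RDF update feed</title>
  <entry>
    <id>http://example.org/graphs/g1</id>
    <updated>2009-11-20T14:00:00Z</updated>
    <title>updated graph</title>
  </entry>
  <at:deleted-entry ref="http://example.org/graphs/g2"
                    when="2009-11-20T14:05:00Z"/>
</feed>"""

def read_changes(xml_text):
    """Return (updated_ids, deleted_ids) from one page of the feed."""
    root = ET.fromstring(xml_text)
    updated = [e.findtext(f"{{{ATOM}}}id")
               for e in root.findall(f"{{{ATOM}}}entry")]
    deleted = [d.get("ref")
               for d in root.findall(f"{{{AT}}}deleted-entry")]
    return updated, deleted

print(read_changes(FEED))
```

Feed paging/archiving (RFC 5005) would then let a consumer walk backwards through archive pages until it reaches the last entry it has already seen.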
> At least this works very well as long as the datasets can be sensibly
> partitioned into documents (contexts/"graphs"). But this is, IMHO,
> the best way to manage RDF anyhow (not least since one can also
> leverage simple REST principles for editing; and since
> quad-stores/SPARQL-endpoints support named contexts etc).
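One hedged way to apply such per-document changes to a quad-store is to map each feed event onto SPARQL 1.1 Update statements over named graphs. This is a sketch only: the one-graph-per-document mapping and the URIs are illustrative, and endpoint/transport details are left out:

```python
def to_sparql_update(kind, graph_uri, source_uri=None):
    """Translate one change event into SPARQL 1.1 Update text.

    Assumes each syndicated document corresponds to exactly one
    named graph, identified by `graph_uri`.
    """
    if kind == "delete":
        return f"DROP SILENT GRAPH <{graph_uri}>"
    # create/update: replace the graph with the current document.
    src = source_uri or graph_uri
    return (f"DROP SILENT GRAPH <{graph_uri}> ;\n"
            f"LOAD <{src}> INTO GRAPH <{graph_uri}>")

print(to_sparql_update("delete", "http://example.org/graphs/g2"))
print(to_sparql_update("update", "http://example.org/graphs/g1"))
```

Replacing whole graphs keeps the consumer idempotent: replaying the same feed page twice leaves the store in the same state.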
> But I'd gladly discuss the benefit/drawback ratio of this approach in
> relation to our and others' scenarios.
> (I do think it would be nice to "lift" the resulting timeline to
> proper RDF -- e.g. AtomOwl (plus a Deletion for tombstones, provenance
> and logging etc). But these rather complex concepts -- datasources
> (dataset vs. collection vs. feed vs. page), timelines (entries are
> *events* for the same resource over time), "flat resource manifest"
> concepts, and so on -- require semantic definitions which will
> probably continue to be debated for quite some time! Atom can be
> leveraged right now. After all, this is a *very* instrumental aspect
> for most domains.)
>> this subject is probably not something that should be left for long
>> though.. my (personal) biggest worry about 'linked data' is that junk
>> data will be at an all time high, if not worse, and not nailing this on
>> the head early on (as in weeks/months at max) could contribute to the
>> mess considerably.
> Couldn't agree with you more. A common, direct (and "simple enough")
> way of syndicating datasets over time would be very beneficial, and
> shared practices for that seem to be lacking today.
> COURT <http://purl.org/net/court> is publicly much of a strawman
> right now, but I would like to flesh it out. Primarily regarding the
> use of Atom I've described, but also with details of our
> implementation (the Swedish legal information system), concerning
> collection and storage, proposed validation and URI-minting/verifying
> strategies, "lifting" the timeline for logging etc.
> (In what form and where the project's actual source code will be
> public remains to be decided (though opensourcing it has always been
> the official plan). Time permitting I will push my own work in the
> same vein there for reuse and reference. Regardless I trust the
> approach to be simple enough to be implementable from reading this
> mail-thread alone. ;) )
> Best regards,
> Niklas Lindström
Received on Friday, 20 November 2009 14:46:55 UTC
