- From: Niklas Lindström <lindstream@gmail.com>
- Date: Wed, 18 Nov 2009 18:03:03 +0100
- To: nathan@webr3.org
- Cc: Georgi Kobilarov <georgi.kobilarov@gmx.de>, public-lod@w3.org
Hi Nathan!

2009/11/17 Nathan <nathan@webr3.org>:
> very short non-detailed reply from me!

I appreciate it.

> pub/sub, atom feeds, RDF over XMPP were my initial thoughts on the
> matter last week - essentially triple (update/publish) streams on a
> pub/sub basis, decentralized suitably, [snip]
>
> then my thoughts switched to the fact that RDF is not XML (or any other
> serialized format) so to keep it non limited I guess the concept would
> need to be specified first then implemented in whatever formats/ways
> people saw fit, as has been the case with RDF.

I agree that the concept should really be format-independent. But I think
it has to be pragmatic and operation-oriented, to avoid "never getting
there".

Atom (feed paging and archiving) is basically designed with exactly this
in mind, and it scaled to my use cases (resources with multiple
representations, plus optional "attachments"), while still being simple
enough to work for "just RDF updates". The missing piece is the
deleted-entry/tombstone, for which there is thankfully at least an I-D.
Modelling the approach around these possibilities therefore required a
minimum of invention (none really, just some wording to describe the
practice), and it seems suited to a wide range of dataset syndication
scenarios (though not so much real-time, where XMPP may be relevant).

At least this works very well as long as the datasets can be sensibly
partitioned into documents (contexts/"graphs"). But that is IMHO the best
way to manage RDF anyhow (not least since one can also leverage simple
REST principles for editing, and since quad stores/SPARQL endpoints
support named contexts etc.). I'd gladly discuss the benefit/drawback
ratio of this approach in relation to our and others' scenarios.

(I do think it would be nice to "lift" the resulting timeline to proper
RDF -- e.g. AtomOwl (plus a Deletion for tombstones), provenance and
logging etc. But these rather complex concepts -- datasources (dataset
vs. collection vs. feed vs.
page), timelines (entries are *events* for the same resource over time),
"flat resource manifest" concepts, and so on -- require semantic
definitions which will probably continue to be debated for quite some
time! Atom can be leveraged right now. After all, this is a *very*
instrumental aspect for most domains.)

> this subject is probably not something that should be left for long
> though.. my (personal) biggest worry about 'linked data' is that junk
> data will be at an all time high, if not worse, and not nailing this on
> the head early on (as in weeks/months at max) could contribute to the
> mess considerably.

Couldn't agree with you more. A common, direct (and "simple enough") way
of syndicating datasets over time would be very beneficial, and shared
practices for that seem to be lacking today.

COURT <http://purl.org/net/court> is publicly much of a strawman right
now, but I would like to flesh it out -- primarily regarding the use of
Atom I've described, but also with details of our implementation (the
Swedish legal information system) concerning collection and storage,
proposed validation and URI-minting/verifying strategies, "lifting" the
timeline for logging, etc.

(In what form and where the project's actual source code will be public
remains to be decided, though open-sourcing it has always been the
official plan. Time permitting, I will push my own work in the same vein
there for reuse and reference. Regardless, I trust the approach to be
simple enough to be implementable from reading this mail thread
alone. ;) )

Best regards,
Niklas Lindström
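[Editor's note: a minimal sketch of the consumer side of the approach discussed in this message -- reading one page of an archived Atom feed (RFC 5005 "prev-archive" paging) for entry updates and deleted-entry tombstones (namespace from the tombstones I-D). The feed content, URLs, and document ids are illustrative assumptions, not taken from COURT or the Swedish legal information system.]

```python
# Sketch: one page of an archived Atom feed, read for (a) entries that
# point at updated RDF representations, (b) deleted-entry tombstones,
# and (c) the rel="prev-archive" link to the previous archive page.
# All URLs and ids below are made up for illustration.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
AT = "http://purl.org/atompub/tombstones/1.0"  # tombstones I-D namespace

FEED_PAGE = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:at="http://purl.org/atompub/tombstones/1.0">
  <id>tag:example.org,2009:dataset</id>
  <updated>2009-11-18T12:00:00Z</updated>
  <link rel="self" href="http://example.org/feed/current"/>
  <link rel="prev-archive" href="http://example.org/feed/2009/page1"/>
  <entry>
    <id>http://example.org/doc/1</id>
    <updated>2009-11-18T11:00:00Z</updated>
    <link rel="alternate" type="application/rdf+xml"
          href="http://example.org/doc/1/rdf"/>
  </entry>
  <at:deleted-entry ref="http://example.org/doc/2"
                    when="2009-11-18T10:00:00Z"/>
</feed>"""

def read_page(xml_text):
    """Return (updates, deletions, prev_archive) for one feed page.

    updates:   list of (entry id, URL of its RDF representation)
    deletions: list of tombstoned entry ids
    """
    root = ET.fromstring(xml_text)
    updates = []
    for entry in root.findall(f"{{{ATOM}}}entry"):
        eid = entry.findtext(f"{{{ATOM}}}id")
        rdf_url = None
        for link in entry.findall(f"{{{ATOM}}}link"):
            if (link.get("rel") == "alternate"
                    and link.get("type") == "application/rdf+xml"):
                rdf_url = link.get("href")
        updates.append((eid, rdf_url))
    deletions = [t.get("ref")
                 for t in root.findall(f"{{{AT}}}deleted-entry")]
    prev = None
    for link in root.findall(f"{{{ATOM}}}link"):
        if link.get("rel") == "prev-archive":
            prev = link.get("href")
    return updates, deletions, prev

updates, deletions, prev = read_page(FEED_PAGE)
print(updates)    # [('http://example.org/doc/1', 'http://example.org/doc/1/rdf')]
print(deletions)  # ['http://example.org/doc/2']
print(prev)       # http://example.org/feed/2009/page1
```

A subscriber would fetch each RDF representation in `updates` into the matching named graph, drop the graphs in `deletions`, then follow `prev` until it reaches a page it has already seen -- which is what makes the timeline replayable from Atom alone, with no extra protocol.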
Received on Wednesday, 18 November 2009 17:03:57 UTC