- From: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
- Date: Tue, 12 Apr 2005 17:04:01 +0100
- To: DAWG public list <public-rdf-dawg@w3.org>
This is based on my experience of implementing the current editors' WD and helping to build an application.

The local student radio station (http://www.surgeradio.co.uk) uses RDF to describe its playlists, handle requests and so on. They use the Musicbrainz RDF (split into files per artist and disk to make applying updates more efficient) to talk about released CDs, and some locally created data (split the same way) to talk about white label singles they have received through the post. If/when the white label stuff gets released they send it to Musicbrainz and remove the local copy. All the data is "trusted" and so exists in the background graph, but it's also kept in named graphs to allow provenance queries (MB vs. local, who wrote it etc.) to be answered.

My first thought about how to handle this case was to flag the graphs as being in the background/named/both graph sets, which allows me to store this efficiently, but it makes queries too expensive, and in my current implementation at least, bNodes get shared between the background and named graphs, which only matters in corner cases, but does change the semantics.

My final implementation was a naive implementation of what's in the spec, as I understand it. I used a distinguished graph identifier to mark things in the background graph. I think assertion performance is bad, but I've not worked on it. However, using this implementation I then couldn't remove subsets of the background graph (e.g. locally created graphs that are now redundant). The named part of the data can be removed easily, by using its graph identifier, but the triples in the background graph can't be distinguished from each other in my implementation. It would be possible to subidentify the triples in the background graph in some way, but that identification can't be discovered from SPARQL, which would make extending it to support INSERT/UPDATE in the future painful, and would complicate the data storage.
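To make the removal problem concrete, here is a minimal sketch in Python. It assumes a quad-style store in which trusted triples are copied into the background graph under one distinguished identifier. The class `NaiveStore`, the `BACKGROUND` identifier and the example URIs are all hypothetical and are not taken from my actual implementation.

```python
# Hypothetical quad store: each statement is (graph, subject, predicate, object).
BACKGROUND = "urn:x-local:background"  # assumed distinguished graph identifier


class NaiveStore:
    def __init__(self):
        self.quads = set()

    def assert_graph(self, graph_id, triples, trusted=True):
        """Store triples in a named graph; copy trusted ones to the background."""
        for s, p, o in triples:
            self.quads.add((graph_id, s, p, o))
            if trusted:
                # The copy carries no record of which named graph it came from.
                self.quads.add((BACKGROUND, s, p, o))

    def remove_named(self, graph_id):
        """Removing the named part is easy: filter on the graph identifier."""
        self.quads = {q for q in self.quads if q[0] != graph_id}

    def background(self):
        return {q[1:] for q in self.quads if q[0] == BACKGROUND}


store = NaiveStore()
store.assert_graph("urn:x-local:whitelabel/1", [("ex:single1", "dc:title", "Demo")])
store.assert_graph("urn:x-mb:artist/1", [("ex:album1", "dc:title", "Album")])

# The white label gets released, so the local graph is dropped...
store.remove_named("urn:x-local:whitelabel/1")

# ...but its triples are still present in the background graph.
leftover = ("ex:single1", "dc:title", "Demo") in store.background()
print(leftover)
```

In this sketch the background copies all share one identifier, so nothing in the store says which of them belonged to the removed graph.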
Another option I considered was to keep a copy of each graph as asserted, and remove it when requested, but it gets a bit complicated as I have to keep a count of the number of times any particular statement has been asserted in the background graph, and I'm concerned about synchronisation issues.

The design I posted earlier (http://lists.w3.org/Archives/Public/public-rdf-dawg/2005JanMar/0440.html) turns out not to have this problem (though that wasn't what motivated the design). As all graphs are named, the application can manage data about individual disks in the background graph.

- Steve
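For illustration, the counting option mentioned above might look roughly like the following Python sketch. It assumes that a copy of each graph is kept as asserted and that a statement leaves the background graph only when its count reaches zero. `CountedBackground` and the example identifiers are hypothetical names, and the sketch ignores the synchronisation concerns entirely.

```python
from collections import Counter


class CountedBackground:
    """Hypothetical background graph with per-statement assertion counts."""

    def __init__(self):
        self.counts = Counter()  # triple -> number of graphs asserting it
        self.asserted = {}       # graph_id -> set of triples, as asserted

    def assert_graph(self, graph_id, triples):
        kept = self.asserted.setdefault(graph_id, set())
        new = set(triples) - kept  # do not count a re-assertion twice
        kept.update(new)
        self.counts.update(new)

    def remove_graph(self, graph_id):
        # Use the stored copy to decrement each statement's count.
        for triple in self.asserted.pop(graph_id, set()):
            self.counts[triple] -= 1
            if self.counts[triple] == 0:
                del self.counts[triple]

    def background(self):
        return set(self.counts)


shared = ("ex:artist1", "foaf:name", "Some Artist")
bg = CountedBackground()
bg.assert_graph("urn:x-local:whitelabel/1", [shared, ("ex:single1", "dc:title", "Demo")])
bg.assert_graph("urn:x-mb:artist/1", [shared])

bg.remove_graph("urn:x-local:whitelabel/1")
still_there = shared in bg.background()  # still asserted by the other graph
print(still_there, len(bg.background()))
```

The extra bookkeeping (a stored copy per graph plus a counter per statement) is where the complication comes from, and the two structures have to be updated together.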
Received on Tuesday, 12 April 2005 16:04:05 UTC