- From: Dan Brickley <danbri@danbri.org>
- Date: Wed, 16 May 2012 11:03:07 +0200
- To: RDF-WG Group <public-rdf-wg@w3.org>
In the interests of grounding discussion in real examples, http://inkdroid.org/journal/2012/05/15/diving-into-viaf/ is interesting.

Ed talks about the VIAF dataset, which integrates library / cultural heritage metadata about people and their names. Behind the scenes they aggregate from many sources, but the publication model simplifies a lot of that. Still, what they publish as RDF is a set of graphs, rather than one big one:

"The RDF Cluster Dataset http://viaf.org/viaf/data/viaf-20120422-clusters.xml.gz is 2.1G gzip compressed RDF data. Rather than it being one complete RDF/XML file, each line has a complete RDF/XML document on it, which represents a single cluster. All in all there are 20,379,541 clusters in the file."

Lots more in Ed's post.

I don't know why they chopped the giant graph into one per person. Or, seen another way, why they composited all the source graphs for each person into a single flat one.

My hope for this WG's graphs work is that it'll make data manipulation and sharing at the graph level more natural and fluid, so that publishers like VIAF could share underlying provenance, and per-person aggregates, and per-source aggregates, as views into a common quad-soup...

Dan
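
To make the quoted file layout concrete, here is a minimal sketch (not from Ed's post) of reading a dump where each line is its own RDF/XML document, treating every line as a separate small graph. It assumes each line is a bare, well-formed XML document with nothing else on the line; the helper names `iter_clusters` and `cluster_subjects` are made up for illustration, and it only uses an XML parser to pull out `rdf:about` values rather than doing real RDF parsing.

```python
import gzip
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def iter_clusters(stream):
    """Yield one parsed XML root element per non-empty line of a binary stream.

    Assumption: every non-empty line is a complete XML document on its own.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue
        yield ET.fromstring(line)


def cluster_subjects(root):
    """Return the rdf:about values found anywhere in one cluster document."""
    about = "{%s}about" % RDF_NS
    return [el.get(about) for el in root.iter() if el.get(about) is not None]


def count_clusters(path):
    """Stream a gzip-compressed dump and count its per-line documents."""
    with gzip.open(path, "rb") as stream:
        return sum(1 for _ in iter_clusters(stream))
```

Because the file is streamed line by line, no single huge graph has to be held in memory; each cluster can be handled, named, or loaded into a store as its own graph.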
Received on Wednesday, 16 May 2012 09:03:40 UTC