- From: Dan Brickley <danbri@danbri.org>
- Date: Mon, 14 Jul 2008 15:48:17 -0400
- To: Tom Heath <Tom.Heath@talis.com>
- Cc: Richard Cyganiak <richard@cyganiak.de>, Mark Birbeck <mark.birbeck@webbackplane.com>, public-lod@w3.org, semantic-web@w3.org
Tom Heath wrote:
> As always it's a case of the right tool for the right job. Regarding
> your other (admittedly unfounded) claim, there may be many more people
> who end up publishing RDF as RDFa, but collectively they may end up
> publishing far fewer triples in total than a small number of publishers
> with very large data sets who choose to use RDF/XML to expose data from
> backend DBs.

Hey, size isn't everything :)

Generating a massive RDF dataset is as easy as piping one's HTTP logs through sed.

There are many measures for data utility. Is the data fresh? accurate? useful? maintained? *used*? Does it exploit well-known vocab? Does it use identifiers that other people use? Or identification strategies that allow cross-reference with other data anyway? Are the associated HTTP servers kept patched and secure? Is it available over SSL? Is there at least 5 years paid up on each associated DNS hostname used? Do we know who owns and takes care of those domain names? Does it link out? Do people link in? Does the data have a clear license? And respect users' privacy wishes where appropriate? Is it I18N-cool?

On the size question: I'm wary of encouraging a 'bigger is better' attitude to triple count. In data as in prose, brevity is valuable. Extra triples add cost at the aggregation and querying level; e.g. sometimes a workplaceHomepage triple is better than having a 'workplace' one and a 'homepage' one.

cheers,

Dan

--
http://danbri.org/
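A minimal sketch of the "pipe your HTTP logs into triples" point above, in Python rather than sed; the log pattern, request URIs, and the example.org vocabulary terms are all invented for illustration:

```python
import re
import sys

# Toy converter: each Apache-style access-log line becomes a few triples
# about a hypothetical "request" resource, printed as N-Triples.
# The example.org vocabulary is made up; literal escaping is ignored.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3})')

EX = "http://example.org/log-vocab#"  # hypothetical namespace

for n, line in enumerate(sys.stdin):
    m = LOG_LINE.match(line)
    if not m:
        continue
    client, when, method, path, status = m.groups()
    req = f"http://example.org/requests/{n}"  # invented request URI
    print(f'<{req}> <{EX}clientAddress> "{client}" .')
    print(f'<{req}> <{EX}requestPath> "{path}" .')
    print(f'<{req}> <{EX}statusCode> "{status}" .')
```

Run as `python log2nt.py < access.log > access.nt`: a few million log lines yield a few million triples, which says nothing about their usefulness.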
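And a sketch of the workplaceHomepage point, using rdflib; the person and organisation URIs and the ex:workplace term are made up, while foaf:workplaceHomepage and foaf:homepage are real FOAF properties:

```python
from rdflib import Graph, Namespace, URIRef

FOAF = Namespace("http://xmlns.com/foaf/0.1/")
EX = Namespace("http://example.org/")  # hypothetical terms for the verbose version

person = URIRef("http://example.org/people#alice")   # invented identifiers
employer = URIRef("http://example.org/org/acme")
employer_home = URIRef("http://acme.example.org/")

# Compact modelling: one triple links the person straight to the homepage.
compact = Graph()
compact.add((person, FOAF.workplaceHomepage, employer_home))

# Verbose modelling: an intermediate 'workplace' node plus its homepage.
verbose = Graph()
verbose.add((person, EX.workplace, employer))
verbose.add((employer, FOAF.homepage, employer_home))

print(len(compact), "triple vs", len(verbose), "triples for the same link")
```

Both graphs record the same link from the person to the employer's homepage, but the compact form does it with one triple and one fewer node to aggregate and query over.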
Received on Monday, 14 July 2008 19:49:02 UTC