- From: Hugh Glaser <hg@ecs.soton.ac.uk>
- Date: Wed, 6 Apr 2011 12:30:28 +0000
- To: Kingsley Idehen <kidehen@openlinksw.com>
- CC: "<nathan@webr3.org>" <nathan@webr3.org>, "public-lod@w3.org" <public-lod@w3.org>, "semantic-web@w3.org" <semantic-web@w3.org>
On 4 Apr 2011, at 15:16, Kingsley Idehen wrote:

> On 4/4/11 10:06 AM, Nathan wrote:
>> Kingsley Idehen wrote:
>>> On 4/3/11 11:41 PM, Nathan wrote:
>>>> Hi Kingsley, All,
>>>>
>>>> Incoming open request, could anybody provide similar statistics for the usage of each datatype in the wild (e.g. the xsd types, xmlliteral and rdf plain literal)?
>>>>
>>>> Ideally Kingsley, could you provide a breakdown from the lod cloud cache? would be very very useful to know.
>>>>
>>>> Best & TIA,
>>>>
>>>> Nathan
>>>>
>>>> Kingsley Idehen wrote:
>>>>> I've knocked up a Google spreadsheet that contains stats about our 21 Billion Triples+ LOD cloud cache.
>>>> ...
>>>>> https://spreadsheets.google.com/ccc?key=0AihbIyhlsQSxdHViMFdIYWZxWE85enNkRHJwZXV4cXc&hl=en -- LOD Cloud Cache SPARQL stats queries and results
>>>>
>>>
>>> Nathan,
>>>
>>> The typed literals used in > 10k triples:
>>>
>>>     count  datatype IRI
>>>     11308  xsd:anyURI
>>>     12553  http://dbpedia.org/datatype/day
>>>     12788  http://dbpedia.org/ontology/day
>>>     15875  http://dbpedia.org/ontology/usDollar
>>>     18228  http://dbpedia.org/datatype/usDollar
>>>     20828  http://europeanaconnect.eu/voc/fondazione/sgti#fondazioneNot
>>>     22934  http://statistics.data.gov.uk/def/administrative-geography/StandardCode
>>>     23368  http://www.w3.org/2001/XMLSchema#date
>>>     30695  http://dbpedia.org/datatype/inhabitantsPerSquareKilometre
>>>     31662  http://dbpedia.org/datatype/second
>>>     35506  http://dbpedia.org/datatype/kilometre
>>>     57409  http://www.w3.org/2001/XMLSchema#int
>>>    160117  http://stitch.cs.vu.nl/vocabularies/rameau/RecordNumber
>>>    632256  http://www.w3.org/2001/XMLSchema#anyURI
>>>   1175435  xsd:string
>>>   1696035  http://data.ordnancesurvey.co.uk/ontology/postcode/Postcode
>>>  70194534  http://www.openlinksw.com/schemas/virtrdf#Geometry
>>> 120147725  http://www.w3.org/2001/XMLSchema#string
>>>
>>> Spreadsheet will be updated too.
>>>
>>
>> Thanks Kingsley, very much appreciated! :)
>>
>> I have to admit I'm surprised by the lack of xsd:double and xsd:decimal in the two stats sets, and also the inclusion of some datatypes I'd never even heard of!
>>
>> Are there any Virtuoso specific nuances which do some conversion, or are all of these as found in the serialized RDF?
>>
>> also is xsd:string automatically set for all plain literals (with / without langs?)
>>
>> Cheers,
>>
>> Nathan
>>
>
> Data comes from internal table in Virtuoso. Note, a threshold has been set so what you are seeing is a picture relative to the total amount of data (21 Billion+ triples).

Hi Kingsley.
Thanks.
So these numbers are absolute numbers of some fraction of the dataset?
It would be good if that could be made clear - I certainly read your first message as being over the whole set, as I think did Dave and Nathan.
Perhaps it would be clearer to present as a percentage?
Also, if that is the case, is it a random sample, or might there be some artefacts in the system that skew towards some graphs or datasets?

Best
Hugh

> --
>
> Regards,
>
> Kingsley Idehen
> President & CEO
> OpenLink Software
> Web: http://www.openlinksw.com
> Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca: kidehen
>

--
Hugh Glaser,
Intelligence, Agents, Multimedia
School of Electronics and Computer Science,
University of Southampton,
Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 78 9422 3822, Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Received on Wednesday, 6 April 2011 12:31:05 UTC