Re: LOD Cloud Cache Stats

On 4/5/11 3:42 PM, William Waites wrote:
> So I don't have answers to your questions, but do have some
> observations about the results, particularly the counts of
> distinct predicates.
>
> The top one is rdf:type which makes sense. Below that we
> have ones used in reification. Who knew there was actually
> that much reified data out there? I wonder where this comes
> from and what about the consensus that this is not a good
> idea and should be deprecated?
>
> SELECT DISTINCT ?graph, COUNT(?s) AS ?count WHERE {
>      GRAPH ?graph { ?s ?p <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> }
> } ORDER BY DESC(?count) LIMIT 50
>
> This query times out, but it would be interesting to know
> the answer, who is the source of all of these reifications?

Yes, that will time out via the public SPARQL endpoint. We'll run it 
internally to get the numbers.

> Next is rdfs:label, ok, fine. After that, a sizeable chunk
> of data has to do with rows and columns in CSV tables that
> comes from data.gov.

No, that's RDF from the conversion of Data.gov datasets by RPI (Jim 
Hendler's team). That conversion contributes about 6.4 billion triples 
in total.

> How is a mechanical transliteration
> from CSV to RDF without any modelling useful?

That's a question for the team at RPI :-)

> It just makes
> the data a couple of orders of magnitude bigger and a few
> more orders of magnitude more cumbersome to deal with.

Yes and no. As with all of these matters, utility lies in the eyes and 
fingers of the data beholder.

>   I
> mean, being able to refer to a specific spreadsheet cell is
> useful but how does actually materialising all of them do
> anything but take up disk space and slow down queries?

See comments above :-)

> Cheers,
> -w


-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

Received on Tuesday, 5 April 2011 20:18:02 UTC