- From: Jean-Claude Moissinac <jean-claude.moissinac@telecom-paristech.fr>
- Date: Wed, 13 Jul 2016 18:05:12 +0200
- To: John Walker <john.walker@semaku.com>
- Cc: Hugh Williams <hwilliams@openlinksw.com>, public-lod <public-lod@w3.org>
- Message-ID: <CAP8HVi1OejdmgQH9j2RV==+Y2LgLhj3sdMUcvpykfiC=3OXAAw@mail.gmail.com>
Many thanks, John, for the elegant solution.

My perception is that

    select count(distinct ?r) where { ?r ?p ?l }

is semantically equivalent to

    select (count(?s) as ?c) where { select distinct ?s where { ?s ?p []} }

Both give the count of distinct nodes in the graph, so the difference is only a result of the internal implementation.
So it seems necessary to know a lot about the implementation to know how to get the result.

Am I wrong?

--
Jean-Claude Moissinac

2016-07-06 15:55 GMT+02:00 John Walker <john.walker@semaku.com>:

> How about reformulating as:
>
> select (count(?s) as ?c) where { select distinct ?s where { ?s ?p []} }
>
> Which gives a result of 10515620 resources [1].
>
> Regards,
> John
>
> [1]
> http://fr.dbpedia.org/sparql?default-graph-uri=&query=select+%28count%28%3Fs%29+as+%3Fc%29+where+%7B+select+distinct+%3Fs+where+%7B+%3Fs+%3Fp+%5B%5D%7D+%7D&format=text%2Fhtml&timeout=0&debug=on
>
>
> -----Original Message-----
> From: Hugh Williams [mailto:hwilliams@openlinksw.com]
> Sent: Wednesday, July 06, 2016 3:15 PM
> To: Jean-Claude Moissinac <jean-claude.moissinac@telecom-paristech.fr>
> Cc: public-lod <public-lod@w3.org>
> Subject: Re: Size a linked open data set
>
> Hi Jean-Claude,
>
> The "select count(distinct ?r) where { ?r ?p ?l }" query is expensive in
> terms of database resources and would result in a huge hash table being
> created to try and service it, which is causing it to time out based on the
> settings on the instance by whoever maintains it.
>
> On http://dbpedia.org/sparql, the original canonical English DBpedia
> endpoint OpenLink Software hosts, we provide preloaded VOID datasets, such
> that they don’t have to be queried each time, see
> http://dbpedia.org/void/Dataset , but the French DBpedia instance does
> not appear to have this, i.e. http://fr.dbpedia.org/void/Dataset
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software, Inc. // http://www.openlinksw.com/
> Weblog -- http://www.openlinksw.com/blogs/
> LinkedIn -- http://www.linkedin.com/company/openlink-software/
> Twitter -- http://twitter.com/OpenLink
> Google+ -- http://plus.google.com/100570109519069333827/
> Facebook -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
>
>
> On 6 Jul 2016, at 12:49, Jean-Claude Moissinac <jean-claude.moissinac@telecom-paristech.fr> wrote:
> >
> > Hello
> >
> > In my work, I need to know the number of distinct resources in a dataset.
> > For example, with dbpedia-fr, I'm trying
> > select count(distinct ?r) where { ?r ?p ?l }
> >
> > And I'm always getting a timeout error message
> > While with
> > select count(?r) where { ?r ?p ?l }
> > I'm getting
> > 185404575
> >
> > Is it a good way to know about such size?
> >
> > --
> > Jean-Claude Moissinac
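
To make the difference between the three queries in this thread concrete, here is a minimal Python sketch that mimics their counting semantics on a tiny in-memory list of triples. The triples and the function names are hypothetical and purely illustrative; nothing here is taken from DBpedia or reflects how any SPARQL engine evaluates the queries internally.

```python
# Hypothetical toy data: (subject, predicate, object) triples.
# These are made-up examples, not DBpedia content.
triples = [
    ("ex:Paris", "rdf:type", "ex:City"),
    ("ex:Paris", "rdfs:label", "Paris"),
    ("ex:Lyon", "rdf:type", "ex:City"),
    ("ex:Lyon", "ex:country", "ex:France"),
    ("ex:France", "rdfs:label", "France"),
]


def count_rows(data):
    """Mimics: select count(?r) where { ?r ?p ?l }
    Counts one row per matching triple, so subjects repeat."""
    return sum(1 for _subject, _predicate, _object in data)


def count_distinct(data):
    """Mimics: select count(distinct ?r) where { ?r ?p ?l }
    Deduplicates the subject while aggregating."""
    return len({subject for subject, _predicate, _object in data})


def count_via_subquery(data):
    """Mimics: select (count(?s) as ?c)
               where { select distinct ?s where { ?s ?p [] } }
    The inner step produces distinct subjects; the outer step counts rows."""
    distinct_subjects = {subject for subject, _predicate, _object in data}
    return sum(1 for _ in distinct_subjects)


print(count_rows(triples))          # 5: one per triple
print(count_distinct(triples))      # 3: ex:Paris, ex:Lyon, ex:France
print(count_via_subquery(triples))  # 3: same answer as count_distinct
```

On this toy data the two distinct formulations agree, which matches the view that they are equivalent in result and differ only in how an engine executes them. Note that all three count resources in the subject position only: `ex:City` appears solely as an object here and is not counted.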
Received on Wednesday, 13 July 2016 16:06:00 UTC