Re: [foaf-dev] [foaf-protocols] FOAF sites offline during cleanup

Lee Feigenbaum wrote:
> Kingsley Idehen wrote:
>> Lee Feigenbaum wrote:
>>> [trimmed to: and cc: list a bit]
>>>
>>> Kingsley Idehen wrote:
>>>> Dan Brickley wrote:
>>>>>
>>>>> What % of "linked data" is truly free of bnodes?
>>>>>   
>>>> Dan,
>>>>
>>>> I would safely say re. LOD Cloud somewhere north of 80% :-) And 
>>>> that's primarily due to the content coming from PingTheSemanticWeb, 
>>>> otherwise I would say 90% and higher. The "Linked Data" meme has 
>>>> always encouraged URIs for everything.
>>>
>>> This discussion is interesting to me. Kingsley's comment made me say 
>>> "huh, does dbpedia really only use URIs?"
>>>
>>> so I ran:
>>>
>>> select count(distinct ?s) where { ?s ?p ?o . filter(isblank(?s)) }
>>>
>>> at http://dbpedia.org/sparql and received a result of 1330.
>>>
>>> (i tried to compare with URIs by querying with isuri or with no 
>>> filter, but those queries timed out)
>>>
>>> so there seem to be a few blank nodes scattered there, but not many. 
>>> i wanted to get an idea of what these blank nodes are used for, so i 
>>> did:
>>>
>>> select distinct ?p where { ?s ?p ?o . filter(isblank(?s)) }
>>>
>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#type
>>> http://www.w3.org/2002/07/owl#unionOf
>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#first
>>> http://www.w3.org/1999/02/22-rdf-syntax-ns#rest
>>>
>>> ...which made it somewhat clear that blank nodes are used in dbpedia 
>>> for RDF lists and (?) anonymous classes.
>>>
>>> Anyway.
>>>
>>> Lee
>>>
>> Lee,
>>
>> Nice analysis, but you should have used: 
>> http://lod.openlinksw.com/sparql (this is the LOD cloud datasets in a 
>> Virtuoso Cluster, and it's much faster).
>>
>> If you want to scope your query to DBpedia then just use the Graph 
>> IRI: <http://dbpedia.org> .
>
> "Should have" in what sense? :-)
>
> I tried my original query that told me about the 1,330 blank nodes on 
> dbpedia at this new endpoint, and it timed out.
>
> Lee
>
Lee,

It shouldn't time out since there is an execution timeout feature which 
should give you partial results in the worst case. Anyway, there is a 
/sparql endpoint bug which your test revealed.

Check back later today, or early tomorrow, and the endpoint should give 
partial results in the worst case.

So the LOD instance should never give you an empty solution, just a 
partial solution based on your response time threshold.
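
For reference, a graph-scoped version of your blank-node count might look 
roughly like the sketch below. It assumes the DBpedia data sits in the named 
graph <http://dbpedia.org> on the LOD instance and that the endpoint accepts 
SPARQL 1.1 style aggregates; your original query used the shorter 
"select count(distinct ?s)" form, and the ?bnodes variable name is just 
illustrative.

```sparql
# Sketch only: count distinct blank-node subjects inside the DBpedia graph.
# Assumes <http://dbpedia.org> is the graph IRI holding the DBpedia data.
SELECT (COUNT(DISTINCT ?s) AS ?bnodes)
WHERE {
  GRAPH <http://dbpedia.org> {
    ?s ?p ?o .
    FILTER(isBlank(?s))
  }
}
```

Restricting the pattern to one graph keeps the scan away from the rest of the 
LOD cloud datasets, which should make a timeout less likely.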


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Monday, 27 April 2009 19:12:56 UTC