Re: Release of SAGE 1.0: a stable, responsive and unrestricted SPARQL query server

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 10 Sep 2018 11:30:51 -0400
To: Pascal Molli <pascal.molli@univ-nantes.fr>
Cc: public-lod@w3.org
Message-ID: <719288ab-5e2a-83b0-b8f4-71b71bece4e8@openlinksw.com>
On 9/10/18 10:35 AM, Pascal Molli wrote:
> Dear all,
>
> Thanks all for this nice discussion.

Hi Pascal,

>
> Our key message is that it is possible to build a performant public
> LOD server that ensures stability, responsiveness 
> and complete results for any SPARQL query.

Does this apply in all circumstances where the following hold true?

1. Datasets operated on via SPARQL Query Language
<https://www.w3.org/TR/sparql11-query#this> are volatile
    -- new datasets are loaded at arbitrary times into the service
providing SPARQL Query access
    -- new datasets are loaded at arbitrary times into the service as a
result of progressive crawling (e.g., de-referencing variables and
constants in the SPARQL query body)

2. No setup and reconfiguration in response to, or in anticipation of,
the above

>
> By using quotas, SPARQL endpoints ensure stability and responsiveness
> but sacrifice completeness.

Not in the case of Virtuoso, whose "Anytime Query" feature you are
(inadvertently) misrepresenting, as I keep trying to tell you.

The Anytime Query feature operates on the following premise, at Web Scale:

1. Every SPARQL Query is a quiz question
2. Every quiz question is allotted a set amount of time for answer provision
3. Unlike a typical quiz, you can ask for more time
4. A solution is eventually provided.

This means that providing a complete query solution isn't a variable.
The time it takes to provide a complete solution, subject to instance
configuration (e.g., the "Fair Use" settings for DBpedia), is the only
variable.
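
The four-point premise above can be sketched in a few lines of Python.
This is a toy model under stated assumptions, not Virtuoso's
implementation: run_query, ask_until_complete, and the idea of a budget
counted in scanned rows (instead of wall-clock time) are all
hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answer:
    rows: List[int]
    complete: bool  # False when the budget ran out before the scan finished


def run_query(data: List[int], budget: int) -> Answer:
    """Hypothetical server: scans `data`, spending one budget unit per row.

    The "query" keeps even numbers. Whatever was found within the budget
    is returned, together with a flag saying whether the scan finished.
    """
    scanned = data[:budget]
    rows = [x for x in scanned if x % 2 == 0]
    return Answer(rows=rows, complete=budget >= len(data))


def ask_until_complete(data: List[int], budget: int = 4,
                       max_budget: int = 1024) -> Answer:
    """Hypothetical client: asks for more time until the answer is complete."""
    while True:
        answer = run_query(data, budget)
        if answer.complete or budget >= max_budget:
            return answer
        budget *= 2  # "you can ask for more time"


data = list(range(10))
partial = run_query(data, budget=4)   # partial solution within a small budget
final = ask_until_complete(data)      # same query, budget raised until done
print(partial)
print(final)
```

In this model the set of rows that will eventually be returned is fixed
by the query and the data; only the budget needed to reach it changes.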

This applies to tiny commodity servers or the kind of config used in
your tests.

Any deviation from what I am stating above is at the very worst a bug in
the edition of Virtuoso you may have used in your testing.

What I have described above is how Virtuoso's "Anytime Query"
functionality works across SPARQL or SQL.

> One approach to solve this problem is to decompose a query  into a
> set of subqueries that terminate under quotas with complete
> results (Limit/Offset...). This approach raises a major issue: how to
> ensure that such decomposition exists for any query?  This requires to
> evaluate the execution time 
> of a query and the number of results on a server with an unknown load.
> Consequently, it seems 
> impossible to me to build a server that ensures stability,
> responsiveness and completeness for 
> any query following the W3C SPARQL protocol.
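
For readers following along, the Limit/Offset decomposition mentioned in
the quoted paragraph can be sketched as below. It is a minimal
illustration in Python, assuming a hypothetical in-memory fake_endpoint
in place of a real SPARQL service; BASE_QUERY, page_query, and fetch_all
are likewise made-up names. The ORDER BY clause is there because paging
with LIMIT and OFFSET is only predictable over an ordered result.

```python
from typing import Callable, List

# Hypothetical query; ORDER BY keeps the pages stable between requests.
BASE_QUERY = "SELECT ?s WHERE { ?s ?p ?o } ORDER BY ?s"


def page_query(limit: int, offset: int) -> str:
    """Build the subquery for one page of results."""
    return f"{BASE_QUERY} LIMIT {limit} OFFSET {offset}"


def fetch_all(run: Callable[[str, int, int], List[str]],
              page_size: int) -> List[str]:
    """Collect the full result by requesting consecutive pages.

    Stops at the first page shorter than `page_size`.
    """
    rows: List[str] = []
    offset = 0
    while True:
        page = run(page_query(page_size, offset), page_size, offset)
        rows.extend(page)
        if len(page) < page_size:
            return rows
        offset += page_size


# Hypothetical stand-in for an endpoint: a sorted in-memory result list.
STORE = sorted(f"s{i:02d}" for i in range(7))


def fake_endpoint(query: str, limit: int, offset: int) -> List[str]:
    # A real service would evaluate `query`; here we only slice the list.
    return STORE[offset:offset + limit]


all_rows = fetch_all(fake_endpoint, page_size=3)
print(all_rows)
```

The sketch covers only the easy case where every page is cheap to
produce. It does not address the question raised in the quote, namely
whether each individual subquery is guaranteed to finish within the
quota on a server with unknown load.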

Please see my comments above. The time to produce a complete solution is
the only variable. Not the ability to produce said solution.

>  
> As pointed out by Ruben, TPF and SaGE execute SPARQL query without
> following the W3C SPARQL protocol. 
> By changing the server interface,  they ensure stability,
> responsiveness and completeness of any query. Compared to TPF, 
> SaGE reduces drastically data transfer and execution time thanks to
> BGP support on server side. 
>
> Now if we compare SaGe and Virtuoso *without quotas*, we demonstrated
> that Virtuoso is not stable, and is outperformed 
> by SaGe when the load is increasing.

Your claim
"
Now if we compare SaGe and Virtuoso *without quotas*, we demonstrated
that Virtuoso is not stable, and is outperformed 
by SaGe when the load is increasing.
"
is inaccurate.

It would help if you respond to the points I've outlined above.

BTW -- I couldn't find the Virtuoso INI file used for your tests.

[1]
https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit?ouid=112399767740508618350&usp=sheets_home&ths=true
-- Ideally, it should be added to this publicly accessible spreadsheet

Kingsley
>
> Any feedback is welcome.
>
> --
> Pascal
>
> On Sun, 9 Sep 2018 at 23:49, Kingsley Idehen <kidehen@openlinksw.com
> <mailto:kidehen@openlinksw.com>> wrote:
>
>     On 9/9/18 4:20 PM, Hugh Glaser wrote:
>     > So I don't understand why you seem obsessed with the idea that
>     these researchers should give you access to use their resources,
>     when what you presumably want to do is repeat their experiments in
>     your own environment and control, so you can be confident of the
>     results.
>
>
>     I never said or insinuated that.
>
>     I just wanted them to clarify their key message, which is
>     confusing for
>     a variety of reasons already outlined.
>
>
>     -- 
>     Regards,
>
>     Kingsley Idehen       
>     Founder & CEO
>     OpenLink Software   (Home Page: http://www.openlinksw.com)
>
>     Weblogs (Blogs):
>     Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
>     <http://www.openlinksw.com/blog/%7Ekidehen/>
>     Blogspot Blog: http://kidehen.blogspot.com
>     Medium Blog: https://medium.com/@kidehen
>
>     Profile Pages:
>     Pinterest: https://www.pinterest.com/kidehen/
>     Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
>     Twitter: https://twitter.com/kidehen
>     Google+: https://plus.google.com/+KingsleyIdehen/about
>     LinkedIn: http://www.linkedin.com/in/kidehen
>
>     Web Identities (WebID):
>     Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
>             :
>     http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>
>
>
> -- 
> Pascal Molli <http://pagesperso.lina.univ-nantes.fr/%7Emolli-p>
> Full Professor, Nantes University <http://www.univ-nantes.fr/> 
> Head of GDD <https://sites.google.com/site/gddlina/>_ team_, LS2N
> <http://ls2n.fr/>, 
> UFR de Sciences et Techniques 
> 2, rue de la Houssinière 
> BP 92208 
> 44322 NANTES CEDEX 3 
> Tel : +33 251125810 
> pascal.molli@univ-nantes.fr <mailto:pascal.molli@univ-nantes.fr> 


-- 
Regards,

Kingsley Idehen	      
Founder & CEO 
OpenLink Software   (Home Page: http://www.openlinksw.com)

Weblogs (Blogs):
Legacy Blog: http://www.openlinksw.com/blog/~kidehen/
Blogspot Blog: http://kidehen.blogspot.com
Medium Blog: https://medium.com/@kidehen

Profile Pages:
Pinterest: https://www.pinterest.com/kidehen/
Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter: https://twitter.com/kidehen
Google+: https://plus.google.com/+KingsleyIdehen/about
LinkedIn: http://www.linkedin.com/in/kidehen

Web Identities (WebID):
Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
        : http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this


Received on Monday, 10 September 2018 15:31:23 UTC
