Re: Re: [Wikidata] Announce: New OpenLink Virtuoso hosted Wikidata Knowledge Graph Release

On 12.01.23 01:33, Dan Brickley wrote:
>
> Really cool! :)
>
> If anyone has e.g. student project possibilities, it would be great to 
> see some work on Wikidata SPARQL query portability, e.g. working through 
> the list of examples at query.wikidata.org <http://query.wikidata.org>, 
> which tend to look like this:
>
>
> SELECT ?item ?itemLabel
> WHERE
> {
>   ?item wdt:P31 wd:Q146. # Must be a cat
>   # Helps get the label in your language; if not available, English
>   SERVICE wikibase:label {
>     bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
>   }
> }
>
> which won’t work as-is outside of the current Wikidata SPARQL 
> Blazegraph endpoint.
>
> Something like this is needed (with a filter for lang too):
>
>
> SELECT ?item ?itemLabel
> WHERE
> {
>   ?item wdt:P31 wd:Q146; rdfs:label ?itemLabel
> }
>
> I don’t recall where the Wikidata sample queries live (GitHub? A wiki 
> somewhere?) but it would be lovely to hear if they could all run on an 
> alternative backend…

There are plenty of such collections; one of them is a wiki page: 
https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples

And yes, there are queries that can't be run in their current form on 
other triple stores. In particular, all queries that make use of SERVICE 
requests to their internal setup, like the MWAPI, the label service, 
their GAS API (graph traversal), the geospatial extension etc., can't be 
run outside of Blazegraph.
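
The label service is probably the easiest of these to replace. A rough 
sketch, assuming the target endpoint exposes the usual rdfs:label 
triples: the preferred language "de" is only an illustrative stand-in 
for [AUTO_LANGUAGE], and the PREFIX declarations are spelled out because 
other endpoints may not predefine them.

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P31 wd:Q146 .   # instance of: house cat
  # Label in the preferred language (assumed here: German), if present
  OPTIONAL { ?item rdfs:label ?preferred . FILTER(LANG(?preferred) = "de") }
  # English label as the fallback
  OPTIONAL { ?item rdfs:label ?fallback . FILTER(LANG(?fallback) = "en") }
  BIND(COALESCE(?preferred, ?fallback) AS ?itemLabel)
}
```

Unlike the label service, this leaves ?itemLabel unbound when neither 
label exists (the service would return the Q-id instead), and the two 
OPTIONAL lookups may well be slower on large result sets.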

Some of those could be (partially) rewritten to standard SPARQL, but 
that might indeed lead to performance issues. For the queries with GAS, 
I doubt this can be replaced completely, not even with property paths, 
and other triple stores have their own graph traversal implementations 
nowadays (e.g. Stardog, GraphDB, Virtuoso). For the MWAPI, only the 
entity search feature could be rewritten, but even then, SPARQL has no 
standard for efficient full-text search (yet ... hope for SPARQL 1.2). 
The spatial extension should be replaceable with GeoSPARQL, and then we 
have to hope that the triple stores provide full GeoSPARQL support.
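
To illustrate the GeoSPARQL route: a radius search that on Blazegraph 
would go through the wikibase:around service might be expressed roughly 
as follows. This is only a sketch; it assumes the target store 
implements geof:distance and accepts Wikidata's geo:wktLiteral 
coordinates, and Berlin (wd:Q64) and the 100 km radius are just example 
values.

```sparql
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom:  <http://www.opengis.net/def/uom/OGC/1.0/>

SELECT ?place ?distance
WHERE
{
  wd:Q64 wdt:P625 ?center .      # coordinates of Berlin (example value)
  ?place wdt:P625 ?location .    # any item with a coordinate location
  # Distance in metres between the two WKT points
  BIND(geof:distance(?center, ?location, uom:metre) AS ?distance)
  FILTER(?distance < 100000)     # keep places within 100 km
}
ORDER BY ?distance
LIMIT 100
```

Written this way, the filter has to be evaluated against every 
coordinate unless the store can map it onto a spatial index, so the 
performance caveat from above applies here as well.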

Lorenz

>
> Dan
>
>
> On Wed, 11 Jan 2023 at 15:52, Kingsley Idehen via Wikidata 
> <wikidata@lists.wikimedia.org> wrote:
>
>     All,
>
>     We are pleased to announce immediate availability of a new
>     Virtuoso-hosted Wikidata instance based on the most recent
>     datasets. This instance comprises 17 billion+ RDF triples.
>
>     Host Machine Info:
>
>     Item    Value
>     CPU     2x Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
>     Cores   24
>     Memory  378 GB
>     SSD     4x Crucial M4 SSD 500 GB
>
>
>     Cloud related costs for a self-hosted variant, assuming:
>
>      * dedicated machine for 1 year without upfront costs
>      * 128 GiB memory
>      * 16 cores or more
>      * 512GB SSD for the database
>      * 3T outgoing internet traffic (based on our DBpedia statistics)
>
>
>     vendor  machine type  memory   vCPUs  monthly   monthly  monthly   monthly
>                                           machine   disk     network   total
>     Amazon  r5a.4xlarge   128 GiB  16     $479.61   $55.96   $276.48   $812.05
>     Google  e2highmem-16  128 GiB  16     $594.55   $95.74   $255.00   $945.30
>     Azure   D32a          128 GiB  32     $769.16   $38.40   $252.30   $1,060.06
>
>
>     SPARQL Query and Full Text Search service endpoints:
>
>      * https://wikidata.demo.openlinksw.com/sparql -- SPARQL Query
>        Services Endpoint
>      * https://wikidata.demo.openlinksw.com/fct -- Faceted Search &
>        Browsing
>
>
>     Additional Information
>
>      * Loading the Wikidata dataset 2022/12 into Virtuoso Open Source
>        - Announcements - OpenLink Software Community (openlinksw.com)
>        <https://community.openlinksw.com/t/loading-the-wikidata-dataset-2022-12-into-virtuoso-open-source/3580>
>
>
>     Happy New Year!
>
>     -- 
>     Regards,
>
>     Kingsley Idehen 
>     Founder & CEO
>     OpenLink Software
>
>

Received on Thursday, 12 January 2023 07:58:00 UTC