Re: on entailments and triple terms from James Anderson on 2024-08-19 (public-rdf-star-wg@w3.org from August 2024)

From: James Anderson <anderson.james.1955@gmail.com>
Date: Mon, 19 Aug 2024 23:44:11 +0200
To: RDF-star Working Group <public-rdf-star-wg@w3.org>
Message-Id: <60AECDAD-9BB1-477E-A1FE-0485246DC20E@gmail.com>

good evening;

> On 19. Aug 2024, at 18:06, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> It is indeed true that many SPARQL implementations do a poor job of optimizing queries that use the ontology facilities of RDFS.  It should be possible to run the query you provide at essentially the same speed as the version of the query that does not include subproperties on RDF graphs that have no relevant subproperty statements and with not much loss in speed on a graph that only has a few relevant subproperty statements (compared to running the simpler query on an RDF graph that has materialized the consequences of the subproperty statements).

i am curious why one would make a broad claim of this order.
sparql formulations which combine those sorts of patterns freely in large queries, and which target graphs of large subject and object cardinalities, would appear to constitute a significant challenge to an optimizer.
this holds even given the stated content restrictions.

do you have any references to discussions about how one might in general optimize such a query type and/or benchmarks which demonstrate results on that topic?
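
for concreteness, the strategy which i understand the claim to imply is roughly "closure first, then join": compute the set of properties which reach each target through rdfs:subPropertyOf*, and only then match the data triples against those small sets.
the following python sketch illustrates just that idea over an in-memory list of triples.
the function names and the tuple representation are hypothetical, chosen for illustration; it is not how any particular sparql engine is implemented and it says nothing about the cost on large graphs, which is the open question.

```python
import itertools
from collections import defaultdict

SUBPROP = "rdfs:subPropertyOf"  # assumption: terms are plain strings


def subproperty_closure(triples, target):
    """Return every p such that p rdfs:subPropertyOf* target (target included)."""
    children = defaultdict(set)
    for s, p, o in triples:
        if p == SUBPROP:
            children[o].add(s)
    closure, stack = {target}, [target]
    while stack:
        for child in children[stack.pop()]:
            if child not in closure:  # guards against subproperty cycles
                closure.add(child)
                stack.append(child)
    return closure


def query(triples, targets):
    """Yield one tuple of objects per solution of the star-shaped pattern.

    A subject contributes solutions only if it has at least one value
    for every target property (or one of its subproperties).
    """
    closures = [subproperty_closure(triples, t) for t in targets]
    # subject -> one list of matching objects per target property
    by_subject = defaultdict(lambda: [[] for _ in targets])
    for s, p, o in triples:
        for i, closure in enumerate(closures):
            if p in closure:
                by_subject[s][i].append(o)
    for columns in by_subject.values():
        # an empty column yields no combinations, as a join would
        yield from itertools.product(*columns)
```

with no relevant subproperty statements each closure is a single property and the second pass is the plain query; the sketch leaves open what happens once the closures or the per-subject fan-out grow large.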

best regards, from berlin,

>  If there are more than a few relevant subproperty statements then considerable extra time might be required, I agree, but my opinion is that optimizing performance on this kind of query is important.
> 
> 
> peter
> 
> 
> On 8/19/24 05:04, John Walker wrote:
>>> For example, many queries do not take into account rdfs:subPropertyOf when
>>> querying non-ontology facts.  This has probably led to underuse of
>>> rdfs:subPropertyOf.  Even worse, SPARQL is incapable of uniformly querying
>>> ontologies that have subproperties of rdfs:subClassOf.  So, for example,
>>> SPARQL cannot uniformly query for class instances in Wikidata, as Wikidata has
>>> properties that are subproperties of its version of rdfs:subClassOf.
>> One practical challenge of using rdfs:subPropertyOf in real-world queries is that
>> in many cases the subject and object of the triple pattern will be variables or
>> blank nodes. It is then necessary to make the predicate a variable to match the
>> sub-property pattern, which quickly gets complex if we are interested in multiple
>> predicates.
>> A query like this will likely not perform well on a non-trivial dataset:
>> prefix ex: <http://example.com/>
>> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> select ?o1 ?o2 ?o3
>> where {
>>   [] ?p1 ?o1 ;
>>     ?p2 ?o2 ;
>>     ?p3 ?o3 .
>>   ?p1 rdfs:subPropertyOf* ex:prop1 .
>>   ?p2 rdfs:subPropertyOf* ex:prop2 .
>>   ?p3 rdfs:subPropertyOf* ex:prop3 .
>> }
>> Adding RDF-star or RDFS+++ entailments into the mix as part of a query can
>> only increase the complexity.
>> Regards,
>> John Walker
>> Principal Consultant & co-founder
>> Semaku B.V. | Torenallee 20 (SFJ 3D) | 5617 BC Eindhoven | T +31 6 42590072 | https://semaku.com/
>> KvK: 58031405 | BTW: NL852842156B01 | IBAN: NL94 INGB 0008 3219 95


---
james anderson | james@dydra.com | https://dydra.com

Received on Monday, 19 August 2024 21:44:29 UTC