Slow performance for queries with FROM that include GRAPH clause. #1753

aindlq · 2025-02-04T10:36:06Z

We are using the RDF4J Java API to interact with a QLever SPARQL endpoint. RDF4J has an API to retrieve all statements from one or more named graphs. At the moment it generates a query that is valid SPARQL but, in my opinion, not very meaningful. See eclipse-rdf4j/rdf4j#5245

For the Java call con.getStatements(null, null, null, ctx), where ctx is http://www.researchspace.org/resource/vocab/status, it generates:

SELECT * WHERE {  ?s ?p ?o . OPTIONAL { GRAPH ?ctx { ?s ?p ?o } } }

and then sets the default-graph-uri request parameter to that IRI, which is equivalent to:

SELECT *
FROM <http://www.researchspace.org/resource/vocab/status>
WHERE {  ?s ?p ?o . OPTIONAL { GRAPH ?ctx { ?s ?p ?o } } }
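For reference, the SPARQL 1.1 Protocol carries the dataset in the default-graph-uri and named-graph-uri request parameters next to query. Below is a minimal Python sketch of the request URL that should correspond to the query above. The endpoint URL is a placeholder and build_request_url is a made-up helper name, not part of RDF4J or QLever.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Placeholder endpoint (assumption); substitute the real QLever endpoint URL.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT * WHERE { ?s ?p ?o . OPTIONAL { GRAPH ?ctx { ?s ?p ?o } } }"
GRAPH = "http://www.researchspace.org/resource/vocab/status"


def build_request_url(endpoint, query, default_graphs=(), named_graphs=()):
    """Hypothetical helper: encode a query plus its dataset as protocol parameters."""
    params = [("query", query)]
    # Each default-graph-uri plays the role of one FROM clause.
    params += [("default-graph-uri", iri) for iri in default_graphs]
    # Each named-graph-uri plays the role of one FROM NAMED clause.
    params += [("named-graph-uri", iri) for iri in named_graphs]
    return endpoint + "?" + urlencode(params)


url = build_request_url(ENDPOINT, QUERY, default_graphs=[GRAPH])

# Decode the URL again to check which dataset parameters are actually sent.
decoded = parse_qs(urlsplit(url).query)
print(decoded.get("default-graph-uri"), decoded.get("named-graph-uri"))
```

Only default-graph-uri is sent here, so the request describes a dataset with a default graph and no named graphs.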

For such a query, QLever does a full index scan (see the attached query plan).

My understanding is that when FROM is specified in a query but FROM NAMED is not, the GRAPH clause is essentially a no-op, because the dataset then contains no named graphs.

https://www.w3.org/TR/sparql11-query/#unnamedGraph

Each FROM clause contains an IRI that indicates a graph to be used to form the default graph. This does not put the graph in as a named graph.


aindlq · 2025-02-04T11:05:40Z

This can be reproduced with any data. The query plan I attached is for a dataset that does not contain the graph specified in the FROM clause. When I execute the same query on your Wikidata endpoint (https://qlever.cs.uni-freiburg.de/wikidata/h6dQ4E), I get:

Tried to allocate 162.2 GB, but only 20 GB were available. Clear the cache or allow more memory for QLever during startup

The same issue occurs when only FROM NAMED is specified, for example:

SELECT *
FROM NAMED <http://www.researchspace.org/resource/vocab/status>
WHERE {  ?s ?p ?o . OPTIONAL { GRAPH ?ctx { ?s ?p ?o } } }

According to the standard:

If there is no FROM clause, but there is one or more FROM NAMED clauses, then the dataset includes an empty graph for the default graph.

But QLever still scans the whole index, which results in:

Tried to allocate 162.2 GB, but only 20 GB were available. Clear the cache or allow more memory for QLever during startup
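To make the dataset semantics I expect concrete, here is a toy Python model of how I read the spec for an explicit dataset description (at least one FROM or FROM NAMED). It is only a sketch: the in-memory store, the ex: terms, and the function names are invented for illustration and say nothing about how QLever is implemented.

```python
STATUS = "http://www.researchspace.org/resource/vocab/status"
OTHER = "http://example.org/other"  # made-up second graph

# Toy store (assumption): graph IRI -> set of (s, p, o) triples.
STORE = {
    STATUS: {("ex:s1", "ex:p", "ex:o1")},
    OTHER: {("ex:s2", "ex:p", "ex:o2")},
}


def build_dataset(store, from_iris=(), from_named_iris=()):
    """Return (default_graph, named_graphs) for an explicit dataset description."""
    # Default graph: merge of all FROM graphs; stays empty with only FROM NAMED.
    default_graph = set()
    for iri in from_iris:
        default_graph |= store.get(iri, set())
    # Named graphs: only those listed with FROM NAMED; FROM contributes none.
    named_graphs = {iri: store.get(iri, set()) for iri in from_named_iris}
    return default_graph, named_graphs


def evaluate(default_graph, named_graphs):
    """Evaluate { ?s ?p ?o . OPTIONAL { GRAPH ?ctx { ?s ?p ?o } } } on the toy dataset."""
    rows = []
    for triple in sorted(default_graph):
        contexts = sorted(iri for iri, g in named_graphs.items() if triple in g)
        if contexts:
            rows.extend((*triple, ctx) for ctx in contexts)
        else:
            rows.append((*triple, None))  # OPTIONAL leaves ?ctx unbound
    return rows


# FROM only: one graph is read and ?ctx is never bound.
from_rows = evaluate(*build_dataset(STORE, from_iris=[STATUS]))
# FROM NAMED only: the default graph is empty, so there are no solutions.
named_rows = evaluate(*build_dataset(STORE, from_named_iris=[STATUS]))
print(from_rows, named_rows)
```

In both cases only the graph named in the dataset clause has to be read, which is why I would not expect a full index scan.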

aindlq marked this as a duplicate of #1754 Feb 4, 2025

aindlq mentioned this issue Feb 7, 2025

Support QLever as a backend. researchspace/researchspace#381

Open

1 task
