BIND clause slows down some queries #3021
Comments
Hi @mpagni12 Looks like the optimizer is missing the best filter placement. Could you try the following 2 patterns which add
and also
Andy |
None of these help :-( I have also tried, without success: . . .
WHERE{
{
SELECT ?mnet WHERE{ BIND( reconx:vmh_Recon AS ?mnet ) } # fix a model here to focus the results
}
?mnet reconx:reac/reconx:equaSource/reconx:part/reconx:spec/reconx:chem ?chem_1 .
?mnxm mnx:chemXref ?chem_1, ?chem_2
FILTER( STR( ?chem_1 ) < STR( ?chem_2 ))
}
. . .
I wonder that the property path For information, in my dataset there are currently three instances for |
You can unbundle the path (although the query execution does that anyway):
##?mnet reconx:reac/reconx:equaSource/reconx:part/reconx:spec/reconx:chem ?chem_1 .
?mnet reconx:reac ?V1 .
?V1 reconx:equaSource ?V2 .
?V2 reconx:part ?V3 .
?V3 reconx:spec ?V4 .
?V4 reconx:chem ?chem_1 .
Is the data publicly available? Does the inner part:
SELECT DISTINCT ?mnet ?mnxm ?chem_1 ?chem_2
WHERE{
BIND( reconx:vmh_Recon AS ?mnet ) # fix a model here to focus the results
?mnet reconx:reac/reconx:equaSource/reconx:part/reconx:spec/reconx:chem ?chem_1 .
?mnxm mnx:chemXref ?chem_1, ?chem_2
FILTER( STR( ?chem_1 ) < STR( ?chem_2 ))
}
behave the same way?
Is the data stored in TDB2? |
I have already attempted to unbundle the path, with no improvement in execution time. Note that the data structure behind the long property path is a DAG, not a tree. IMPORTANT: The isolated inner part, executed as a stand-alone query, is fast! Hence, the problem I report seems to be linked to the graph pattern being executed in a sub-query. I think this greatly clarifies the problem. The data are not yet officially released, but I can supply them to you by private email. However, I guess that the problem should be easy to reproduce by introducing BIND clauses in the innermost graph pattern of nested sub-queries. The data are stored in TDB2. |
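A minimal sketch of the kind of reproduction described here: the inner pattern, which is fast on its own, wrapped in a sub-query. The outer COUNT wrapper and the prefix IRIs are assumptions for illustration only. This is not the original slow query, and this shape may or may not trigger the slowdown on other data.

```sparql
# Placeholder prefix IRIs (assumption): replace with the dataset's real ones.
PREFIX reconx: <http://example.org/reconx#>
PREFIX mnx:    <http://example.org/mnx#>

# Hypothetical outer query: it only counts the rows produced by the sub-query.
SELECT (COUNT(*) AS ?n)
WHERE {
  {
    # Inner part: reported as fast when executed as a stand-alone query.
    SELECT DISTINCT ?mnet ?mnxm ?chem_1 ?chem_2
    WHERE {
      BIND( reconx:vmh_Recon AS ?mnet )  # the BIND under suspicion
      ?mnet reconx:reac/reconx:equaSource/reconx:part/reconx:spec/reconx:chem ?chem_1 .
      ?mnxm mnx:chemXref ?chem_1, ?chem_2 .
      FILTER( STR( ?chem_1 ) < STR( ?chem_2 ) )
    }
  }
}
```

Timing this against the stand-alone inner SELECT on the same TDB2 store would give a direct comparison of the two shapes.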
It's proving to be quite difficult to set up a simulation and be confident it illustrates the issue at your end. The optimizer plan doesn't look bad, but there is a level below it that is more sensitive to the shape of the data, so the data shape appears to be a factor. |
Send me an email at Marco.Pagni@sib.swiss |
Sorry, I can't work with private data. I try to treat all bug reports the same. If I did this for one user, it would suggest it could be done for other users. |
I understand, that makes sense. I tend to be very cautious by default, as I am often working with sensitive or unpublished data from other researchers. But that is not the case here. Please find the dump of the dataset I used in the tests above. |
Got it! 66,860,461 triples. |
And there are 288 results? |
yes |
As a temporary workaround: Removing The middle, next level out, |
Version
5.3.0
Question
Dear fuseki community,
I have observed several times, with different queries, that the presence of a simple BIND clause can drastically slow down execution.
For example, the following query
takes at least half an hour to execute on my local Fuseki instance.
But if I remove the BIND clause from the innermost WHERE clause by inlining the subject:
the query now executes in a couple of seconds!
Replacing the BIND clause with a VALUES clause is also very slow.
The same query executed on GraphDB, populated with the same dataset, takes a couple of seconds, with no significant differences between the three variants (inline, BIND, VALUES).
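For reference, a minimal sketch of how the three variants differ, based on the inner graph pattern quoted in the comments rather than on the full original query. The prefix IRIs are placeholders (an assumption, since the real ones are not shown here) and must be replaced with those of the dataset.

```sparql
# Placeholder prefix IRIs (assumption, not from the dataset): replace before running.
PREFIX reconx: <http://example.org/reconx#>
PREFIX mnx:    <http://example.org/mnx#>

SELECT DISTINCT ?mnet ?mnxm ?chem_1 ?chem_2
WHERE {
  # Variant 1 (BIND): the model is fixed through an explicit assignment.
  BIND( reconx:vmh_Recon AS ?mnet )

  # Variant 2 (VALUES): replace the BIND line above with
  #   VALUES ?mnet { reconx:vmh_Recon }

  # Variant 3 (inline): drop BIND/VALUES, remove ?mnet from the SELECT list,
  # and write reconx:vmh_Recon in place of ?mnet in the next triple pattern.
  ?mnet reconx:reac/reconx:equaSource/reconx:part/reconx:spec/reconx:chem ?chem_1 .

  ?mnxm mnx:chemXref ?chem_1, ?chem_2 .
  FILTER( STR( ?chem_1 ) < STR( ?chem_2 ) )
}
```

All three forms should return the same bindings for ?mnxm, ?chem_1 and ?chem_2; only the way the constant reaches the pattern differs.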
I tend to prefer the syntax with an explicit BIND or VALUES clause because, in a complex query, it makes the "input parameter" stand out syntactically. But currently, the price is too high. I wonder whether this has to do with the query optimiser.
That said, thanks a lot for maintaining Fuseki, which is a great open-source tool.
Marco