Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance of WebDAV search requests with many shared files #37061

Closed
wants to merge 8 commits into from

Conversation

starypatyk
Copy link
Contributor

I have noticed that when many files are shared between users (e.g. via Talk) the WebDAV search requests are quite slow. This is visible in the Photos app when entering the initial page or in the Android client when entering the Media tab. I suspect that this issue #35776 might have the same cause.

After some analysis I have found that in such situation the database query that retrieves the list of files is very complex and takes significant time to execute. For each shared file the following section is added to the SQL WHERE clause:

((`storage` = :dcValueNN) AND (`path_hash` = :dcValueMM)) OR 

In my environment I see queries with over 250 sections like this. That makes the query use over 500 separate parameters.

Since usually many files are shared between two users, the storage part of the filter is repeated many times. This provides an opportunity to refactor the query to use the following form (and provide all path_hash values for the same storage in an array):

((`storage` = :dcValueNN) AND (`path_hash` IN (:dcValueMM))) OR

This reduces the number of parameters significantly (in my case to about 25) and seems easier to optimize by the database engine.

In this PR I propose a change to the QuerySearchHelper::searchInCaches method that generates the query refactored as described above.

In my environment (MySQL database, ~250k rows in the oc_filecache table) I measured a very nice improvement in the query execution time (retrieval of the first 200 images in the Photos app):

  • before the change: ~3700 ms
  • after the change: ~750 ms

Comments? Thoughts?

@szaimen szaimen requested review from icewind1991, rullzer, a team, ArtificialOwl and come-nc and removed request for a team March 6, 2023 23:03
@starypatyk starypatyk force-pushed the dav-search-use-sql-in branch from 6291a67 to 1cadbaf Compare March 7, 2023 22:18
@@ -127,9 +161,7 @@ public function searchInCaches(ISearchQuery $searchQuery, array $caches): array
));
}

$storageFilters = array_values(array_map(function (ICache $cache) {
return $cache->getQueryFilterForStorage();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can see this was using path and not path hash, did I miss something? Where was the md5 calculation before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed tricky to follow.

Sometimes the path comparison is not converted into path_hash. If I understand correctly the hint used above can be set by the query optimizer invoked from:

$this->queryOptimizer->processOperator($filter);

The actual setting of the hint is here:

if ($a instanceof ISearchComparison && $b instanceof ISearchComparison && $a->getField() === 'path' && $b->getField() === 'path') {

If I understand all of this correctly, for single file shares the path_hash comparison was always used before my changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I would feel a lot better if someone who actually understand DB queries perf can review this. I do not have enough knowledge.

We have to be sure your changes does not slow down some other usecases. And it’s complicated to test because some of the optimisations we have are database-specific I think.

@come-nc come-nc requested a review from artonge March 13, 2023 10:47
@come-nc
Copy link
Contributor

come-nc commented Mar 13, 2023

Thoughts:

  1. Maybe the support for IN in the SearchBuilder should work on path and automagically convert to path_hash like it’s done for =.
  2. If possible this should be split into one PR that adds the support for IN and one that uses it to ease revert in case we get into trouble.

@starypatyk
Copy link
Contributor Author

@come-nc

  1. Maybe the support for IN in the SearchBuilder should work on path and automagically convert to path_hash like it’s done for =.

Seems doable - but feels more complex and likely more difficult to analyse. On the other hand, I am now wondering if the PathPrefixOptimizer is used at all after my changes. I am not sure how to verify this. I see that the QueryOptimizer is a relatively new piece of code (Aug 2021) created by @icewind1991.

  1. If possible this should be split into one PR that adds the support for IN and one that uses it to ease revert in case we get into trouble.

Certainly possible. Should I go ahead and split this into a separate PR?

@XueSheng-GIT
Copy link

@starypatyk if you need someone to do some testing, just let me know. I've a lot of shares and search is really slow (takes longer than a minute).

@starypatyk starypatyk force-pushed the dav-search-use-sql-in branch from 1cadbaf to 7c958d1 Compare June 8, 2023 19:48
ISearchBinaryOperator::OPERATOR_AND,
[
new SearchComparison(ISearchComparison::COMPARE_EQUAL, 'storage', $storage),
new SearchComparison(ISearchComparison::COMPARE_IN, 'path_hash', array_map(fn ($path) => md5($path), $paths))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful with "IN" it'll break i.e. on Oracle for 1000+ items.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Would it make sense to provide a workaround at a lower level?

Rewrite the query with a sequence of ((x IN (list1-1000)) or (x IN (list1001-2000)) ...) e.g. here:

public function in($x, $y, $type = null): string {

I thought this might be resolved already by Doctrine DBAL, but apparently a similar workaround has been rejected. 😞

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell since I am not an expert on php/doctrine, but @nickvergessen @juliushaertl or @icewind1991 might know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using (IN(1-1000) OR IN(1001-2000)) in mail: https://github.com/nextcloud/mail/blob/0da32c97f64e0cdf950368257447ec8fa6fe7fea/lib/Db/MessageMapper.php#L986-L992.

Not a big fan, but there is no other way if you want to work with all IDs in one result.

@starypatyk
Copy link
Contributor Author

@XueSheng-GIT

@starypatyk if you need someone to do some testing, just let me know. I've a lot of shares and search is really slow (takes longer than a minute).

Would be great to know if these changes improve performance in other cases. 👍

Would you be able to pick up my changes directly from GitHub? I have rebased this PR today - it should be easier to apply it on a recent Nextcloud version.

@XueSheng-GIT
Copy link

I've applied this pull to my testing instance (NC 27.0.0). I've added 1,000 shares via talk to a test account (share_NNNN.txt).
Because this is especially about webdav search, i took the official reference to perform a search: https://docs.nextcloud.com/server/27/developer_manual/client_apis/WebDAV/search.html

Search for filename 'test' (I hope that's the right way to do this, I never used webdav search so far ;) ):

$ time curl -u username:pass 'https://cloud.domain.invalid/remote.php/dav/' -X SEARCH -H "content-Type: text/xml" --data '<?xml version="1.0" encoding="UTF-8"?>
 <d:searchrequest xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
     <d:basicsearch>
         <d:select>
             <d:prop>
                 <oc:fileid/>
                 <d:displayname/>
                 <d:getcontenttype/>
                 <d:getetag/>
                 <oc:size/>
             </d:prop>
         </d:select>
         <d:from>
             <d:scope>
                 <d:href>/files/username</d:href>
                 <d:depth>infinity</d:depth>
             </d:scope>
         </d:from>
         <d:where>
             <d:like>
                 <d:prop>
                     <d:displayname/>
                 </d:prop>
                 <d:literal>test</d:literal>
             </d:like>
         </d:where>
         <d:orderby/>
    </d:basicsearch>
</d:searchrequest>'
<?xml version="1.0"?>
<d:multistatus xmlns:d="DAV:" xmlns:s="http://sabredav.org/ns" xmlns:oc="http://owncloud.org/ns" xmlns:nc="http://nextcloud.org/ns"/>

real	0m31,679s
user	0m0,043s
sys	0m0,008s

Although, I've applied the patch, the query log still includes all those blocks:
(("storage" = $39) AND ("path" = $40)) OR

Full log:
webdav-search_patched.log

@starypatyk, anything special which must be done after applying this pull?

@starypatyk
Copy link
Contributor Author

Hi @XueSheng-GIT,

The patch should "just work" - no special configuration required.

For the next two weeks I will not be able to diagnose this. I should have time in early July.

@starypatyk
Copy link
Contributor Author

starypatyk commented Jul 6, 2023

Hi @XueSheng-GIT,

Your WebDAV search query looks correct. I have tried it on my dev instance and it behaves as expected - i.e. the sequence of individual conditions like:

((`storage` = :dcValue5) AND (`path` = :dcValue6)) OR

gets replaced by a much shorter list of:

((`storage` = :dcValue5) AND (`path_hash` IN (:dcValue6))) OR

I do not understand yet what may be the reason this does not work for you. I have committed additional diagnostic code that might help me diagnose the issue.

Would you be able to pick the newest version of code from this PR and re-run your tests?

Please make sure to set logging level to DEBUG as described in the documentation and provide relevant fragment of the nextcloud.log file. There should be many lines similar to this:

{"reqId":"vJekSFwbXgIqyHuokfVX","level":0,"time":"2023-07-06T22:07:54+00:00","remoteAddr":"192.168.21.4","user":"admin","app":"no app in context","method":"SEARCH","url":"/remote.php/dav/?log_secret=f9e9137b0bfde6498e70b75c91f6a5c8","message":"generateStorageFilters: cache: OCA\\Files_Sharing\\Cache","userAgent":"curl/7.81.0","version":"28.0.0.0","data":[]}

@XueSheng-GIT
Copy link

XueSheng-GIT commented Jul 7, 2023

@starypatyk thanks for following up.

I've picked up the newest version. Loglevel was already set to DEBUG on my test instance.
nextcloud.log contains many of those lines. But Search query is still unchanged and slow.

{"reqId":"fbrG5wnYyA9kLkOlSfVt","level":0,"time":"2023-07-07T14:29:16+02:00","remoteAddr":"fd64:c2a2:5c5e:e201:703c:a494:309:f1e7","user":"usename","app":"no app in context","method":"SEARCH","url":"/remote.php/dav/","message":"generateStorageFilters: cache: OCA\\FilesAccessControl\\CacheWrapper","userAgent":"curl/7.81.0","version":"27.0.0.8","data":[]}

Do you need a more complete log? I don't want to paste it here for privacy reasons.

@starypatyk
Copy link
Contributor Author

Hi @XueSheng-GIT,

Thank you for the log line. It does help much 👍 A key difference is here:

My environment:

"generateStorageFilters: cache: OCA\\Files_Sharing\\Cache"

Your environment:

"generateStorageFilters: cache: OCA\\FilesAccessControl\\CacheWrapper"

I suspect that you use the File access control app and my code is not prepared to handle this case.

I need to rethink the approach.

@XueSheng-GIT
Copy link

XueSheng-GIT commented Jul 9, 2023

I've disabled File access control to double check the results.

nextcloud log now contains:

{"reqId":"KeYhbHbr8HH0mwj2J6kC","level":0,"time":"2023-07-09T07:28:42+02:00","remoteAddr":"fd64:c2a2:5c5e:e201:b442:4193:b930:e5c8","user":"username","app":"no app in context","method":"SEARCH","url":"/remote.php/dav/","message":"generateStorageFilters: cache: OCA\\Files_Sharing\\Cache","userAgent":"curl/7.81.0","version":"27.0.0.8","data":[]}

Webdav search used:

$ time curl -u username:pass 'https://cloud.domain.invalid/remote.php/dav/' -X SEARCH -H "content-Type: text/xml" --data '<?xml version="1.0" encoding="UTF-8"?>
 <d:searchrequest xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
     <d:basicsearch>
         <d:select>
             <d:prop>
                 <oc:fileid/>
                 <d:displayname/>
                 <d:getcontenttype/>
                 <d:getetag/>
                 <oc:size/>
             </d:prop>
         </d:select>
         <d:from>
             <d:scope>
                 <d:href>/files/username</d:href>
                 <d:depth>infinity</d:depth>
             </d:scope>
         </d:from>
         <d:where>
             <d:like>
                 <d:prop>
                     <d:displayname/>
                 </d:prop>
                 <d:literal>search-string</d:literal>
             </d:like>
         </d:where>
         <d:orderby/>
    </d:basicsearch>
</d:searchrequest>'
<?xml version="1.0"?>
<d:multistatus xmlns:d="DAV:" xmlns:s="http://sabredav.org/ns" xmlns:oc="http://owncloud.org/ns" xmlns:nc="http://nextcloud.org/ns"/>

real	0m1,117s
user	0m0,052s
sys	0m0,004s

pgsql query log:
webdav-search_patched+disabledfac.log

That's a major increase in search speed. 31s reduced to 1s!!!
Thanks a lot for working on this @starypatyk!

@starypatyk
Copy link
Contributor Author

Hi @XueSheng-GIT,

That's a major increase in search speed. 31s reduced to 1s!!!

Great to know that this change can help for your case! Thanks 👍

I am now considering a different approach that should also work with the File access control plugin.

BTW - The queries in your logs seem to suggest that you use PostgreSQL as the database backend. Could you please confirm/deny?

@XueSheng-GIT
Copy link

BTW - The queries in your logs seem to suggest that you use PostgreSQL as the database backend. Could you please confirm/deny?

Yes, I'm using postgreSQL.

@starypatyk
Copy link
Contributor Author

Hi @XueSheng-GIT,

I have implemented a slightly different approach. The code should now work also when the File access control plugin is active.

Would you be able to get the latest changes and verify?

@XueSheng-GIT
Copy link

After applying the latest changes, webdav search fails (with and without File access control enabled). Reverting to the previous patch makes it work again.

Here's my search and the error message.

curl -u username:pass 'https://cloud.domain.invalid/remote.php/dav/' -X SEARCH -H "content-Type: text/xml" --data '<?xml version="1.0" encoding="UTF-8"?>
 <d:searchrequest xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
     <d:basicsearch>
         <d:select>
             <d:prop>
                 <oc:fileid/>
                 <d:displayname/>
                 <d:getcontenttype/>
                 <d:getetag/>
                 <oc:size/>
             </d:prop>
         </d:select>
         <d:from>
             <d:scope>
                 <d:href>/files/username</d:href>
                 <d:depth>infinity</d:depth>
             </d:scope>
         </d:from>
         <d:where>
             <d:like>
                 <d:prop>
                     <d:displayname/>
                 </d:prop>
                 <d:literal>search-string</d:literal>
             </d:like>
         </d:where>
         <d:orderby/>
    </d:basicsearch>
</d:searchrequest>'

Error message (response within the console):

<d:error xmlns:d="DAV:" xmlns:s="http://sabredav.org/ns">
  <s:exception>TypeError</s:exception>
  <s:message>OC\Files\Cache\QuerySearchHelper::checkStorageAndPathFilter(): Argument #1 ($operator) must be of type OCP\Files\Search\ISearchBinaryOperator, OC\Files\Search\SearchComparison given, called in /var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php on line 109</s:message>
</d:error>

nextcloud.log

{"reqId":"GPoTv8fK8thbreYbQvtz","level":3,"time":"2023-07-12T20:15:42+02:00","remoteAddr":"fd64:c2a2:5c5e:e201:1a88:be31:f4b:6ec","user":"username","app":"webdav","method":"SEARCH","url":"/remote.php/dav/","message":"OC\\Files\\Cache\\QuerySearchHelper::checkStorageAndPathFilter(): Argument #1 ($operator) must be of type OCP\\Files\\Search\\ISearchBinaryOperator, OC\\Files\\Search\\SearchComparison given, called in /var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php on line 109","userAgent":"curl/7.81.0","version":"27.0.0.8","exception":{"Exception":"TypeError","Message":"OC\\Files\\Cache\\QuerySearchHelper::checkStorageAndPathFilter(): Argument #1 ($operator) must be of type OCP\\Files\\Search\\ISearchBinaryOperator, OC\\Files\\Search\\SearchComparison given, called in /var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php on line 109","Code":0,"Trace":[{"file":"/var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php","line":109,"function":"checkStorageAndPathFilter","class":"OC\\Files\\Cache\\QuerySearchHelper","type":"->"},{"file":"/var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php","line":129,"function":"generateStorageFilters","class":"OC\\Files\\Cache\\QuerySearchHelper","type":"->"},{"file":"/var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php","line":218,"function":"applySearchConstraints","class":"OC\\Files\\Cache\\QuerySearchHelper","type":"->"},{"file":"/var/www/nextcloud/lib/private/Files/Node/Folder.php","line":239,"function":"searchInCaches","class":"OC\\Files\\Cache\\QuerySearchHelper","type":"->"},{"file":"/var/www/nextcloud/apps/dav/lib/Files/FileSearchBackend.php","line":191,"function":"search","class":"OC\\Files\\Node\\Folder","type":"->"},{"file":"/var/www/nextcloud/apps/dav/lib/Files/LazySearchBackend.php","line":63,"function":"search","class":"OCA\\DAV\\Files\\FileSearchBackend","type":"->"},{"file":"/var/www/nextcloud/3rdparty/icewind/searchdav/src/DAV/SearchHandler.php","line":82,"function":"search","class":"OCA\\DAV\\Files\\LazySearchBackend","type":"->"},{"file":"/var/www/nextcloud/3rdparty/icewind/searchdav/src/DAV/SearchPlugin.php","line":119,"function":"handleSearchRequest","class":"SearchDAV\\DAV\\SearchHandler","type":"->"},{"file":"/var/www/nextcloud/3rdparty/sabre/event/lib/WildcardEmitterTrait.php","line":89,"function":"searchHandler","class":"SearchDAV\\DAV\\SearchPlugin","type":"->"},{"file":"/var/www/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":472,"function":"emit","class":"Sabre\\DAV\\Server","type":"->"},{"file":"/var/www/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":253,"function":"invokeMethod","class":"Sabre\\DAV\\Server","type":"->"},{"file":"/var/www/nextcloud/3rdparty/sabre/dav/lib/DAV/Server.php","line":321,"function":"start","class":"Sabre\\DAV\\Server","type":"->"},{"file":"/var/www/nextcloud/apps/dav/lib/Server.php","line":364,"function":"exec","class":"Sabre\\DAV\\Server","type":"->"},{"file":"/var/www/nextcloud/apps/dav/appinfo/v2/remote.php","line":35,"function":"exec","class":"OCA\\DAV\\Server","type":"->"},{"file":"/var/www/nextcloud/remote.php","line":172,"args":["/var/www/nextcloud/apps/dav/appinfo/v2/remote.php"],"function":"require_once"}],"File":"/var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php","Line":83,"message":"OC\\Files\\Cache\\QuerySearchHelper::checkStorageAndPathFilter(): Argument #1 ($operator) must be of type OCP\\Files\\Search\\ISearchBinaryOperator, OC\\Files\\Search\\SearchComparison given, called in /var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php on line 109","exception":[],"CustomMessage":"OC\\Files\\Cache\\QuerySearchHelper::checkStorageAndPathFilter(): Argument #1 ($operator) must be of type OCP\\Files\\Search\\ISearchBinaryOperator, OC\\Files\\Search\\SearchComparison given, called in /var/www/nextcloud/lib/private/Files/Cache/QuerySearchHelper.php on line 109"},"id":"64aeedd504caf"}

This test was done on NC 27.0.0.
@starypatyk, any idea why it does fail now?

@starypatyk
Copy link
Contributor Author

Hi @XueSheng-GIT,

Sorry, this was my mistake. 😞

I hope that I have corrected it in my last commit. If possible, please try again.

@XueSheng-GIT
Copy link

Thanks! A quick test was succesful with the updated pull and as fast as the previous approach. Tested with and without File access control.
I'm just on my mobile, thus a bit limited with testing. I will collect some logs today or tomorrow to verify that everything looks as expected.

@XueSheng-GIT
Copy link

XueSheng-GIT commented Jul 13, 2023

Here are the logs using the latest pull on NC 27.0.0 with File access control enabled (same webdav search as before #37061 (comment)).

pgsql query log:
webdav-search_patched-4.log

nextcloud log:
nextcloud_patched-4.log

Search speed is still amazing (same as with previous version of the pull). I was not able to see a difference with/without File access control enabled.

I've also applied this patch to one of my production systems (NC 26.0.3). At least initial testing went fine so far.

@starypatyk, thanks again for your effort! Hope that this will find its way to a stable release.
Just let me know if you need further testing/logs from my side.

EDIT:
Forgot to mention and maybe it's obvious, but unified search of the web interface is now fast too! This used to time out on my test instance (>60s). Now the results are shown within 1-2s.

@starypatyk
Copy link
Contributor Author

Hi @XueSheng-GIT,

Thanks for your testing and for the logs. 👍 Good to know that my changes help.

I now need to do some clean-up before submitting it for final review.

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
(cherry picked from commit 0485c07)
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
(cherry picked from commit 1cadbaf)
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk starypatyk force-pushed the dav-search-use-sql-in branch from 6075f81 to 2987c0d Compare July 14, 2023 20:19
@starypatyk
Copy link
Contributor Author

Hi @AndyScherzinger, @come-nc, @icewind1991,

After conversation with @XueSheng-GIT, I have updated this PR recently. It seems that these changes can provide even greater performance improvements is some cases - see e.g.: #35776 (comment)

Would you be able to review my changes again?

I have changed the approach slightly - currently I post-process the queries generated by getQueryFilterForStorage() calls.

I tried to put the optimization code in an additional QueryOptimizerStep in the existing QueryOptimizer code, but I was not happy with the result. The code to find the original storage/path conditions was quite awkward and complex. Doing this directly after obtaining the storageFilters seemed more natural:

$optimizedStorageFilters = $this->optimizeStorageFilters($storageFilters);

I do not plan any more changes, unless there are additional suggestions from the review.

There are a few additional topics that I think could be addressed in follow-up PRs:

Copy link
Contributor

@come-nc come-nc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also test the opposite case, for a search with not a lot of shared files, which is currently fast (<1s), does your PR add visible overhead?

$storage = $a->getValue();
/** @psalm-suppress UndefinedInterfaceMethod */
$path = $b->getValue();
\OC::$server->getLogger()->debug("QuerySearchHelper::checkStorageAndPathFilter: storage=" . $storage . " " . "path=" . $path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
\OC::$server->getLogger()->debug("QuerySearchHelper::checkStorageAndPathFilter: storage=" . $storage . " " . "path=" . $path);
$this->logger->debug("QuerySearchHelper::checkStorageAndPathFilter: storage=" . $storage . " " . "path=" . $path);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to remove the logging. This was useful during development, quite unlikely to be useful in the future. Removed in f86dab9

// Create filters for single file shares
$singleFileFilters = [];
foreach ($storageToPathsMap as $storage => $paths) {
\OC::$server->getLogger()->debug("QuerySearchHelper::optimizeStorageFilters: storage=" . $storage . " " . "paths=" . implode(", ", $paths));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
\OC::$server->getLogger()->debug("QuerySearchHelper::optimizeStorageFilters: storage=" . $storage . " " . "paths=" . implode(", ", $paths));
$this->logger->debug("QuerySearchHelper::optimizeStorageFilters: storage=" . $storage . " " . "paths=" . implode(", ", $paths));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to remove the logging. This was useful during development, quite unlikely to be useful in the future. Removed in f86dab9

Signed-off-by: Dariusz Olszewski <starypatyk@users.noreply.github.com>
@starypatyk
Copy link
Contributor Author

Hi @come-nc,

Could you also test the opposite case, for a search with not a lot of shared files, which is currently fast (<1s), does your PR add visible overhead?

I have not been able to measure the overhead. In my "production" environment I have a lot of shared files, so this does not apply. When I tried to measure differences in my dev environment with no shared files, variation of timing between the runs (100-140 ms) was much higher than the difference between original code and my PR (~ 1 ms).

When there are no shared files the query should not be modified at all by my optimization code, so query execution time should not be affected.

Regarding time required to perform the optimization itself - this is an in-memory operation that should grow roughly linearly with the number of shares. So, even with thousands of shares should be negligible compared to the query execution time.

@icewind1991
Copy link
Member

I've started implementing something similar but at a level lower and more generalized: #40555

It applies similar transformations on the search operation before building the db query.

@starypatyk
Copy link
Contributor Author

@icewind1991

I've started implementing something similar but at a level lower and more generalized: #40555

It applies similar transformations on the search operation before building the db query.

Looks great. 👍 I will close this PR when #40555 is merged.

@blizzz blizzz added this to the Nextcloud 29 milestone Nov 23, 2023
@skjnldsv skjnldsv added 2. developing Work in progress and removed 3. to review Waiting for reviews labels Feb 23, 2024
@skjnldsv
Copy link
Member

skjnldsv commented Feb 23, 2024

Looks great. 👍 I will close this PR when #40555 is merged.

#40555 is merged.

@skjnldsv skjnldsv closed this Feb 23, 2024
@skjnldsv skjnldsv removed the 2. developing Work in progress label Feb 23, 2024
@skjnldsv skjnldsv deleted the dav-search-use-sql-in branch February 23, 2024 18:45
@skjnldsv skjnldsv removed this from the Nextcloud 29 milestone Feb 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants