-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pagination support to Files Source plugins #18059
Add pagination support to Files Source plugins #18059
Conversation
dbfcc6f
to
1c0988f
Compare
Looking great so far! |
591200e
to
d3cef4e
Compare
f128313
to
b614ee6
Compare
I guess this test failure:
Is caused either by an upstream implementation error or by this known limitation of GCSFS https://fs-gcsfs.readthedocs.io/en/stable/#limitations As a workaround, I will catch the error and default to 0 as directories are not a "real" thing in Google Cloud Storage or try to call UpdateUsing |
This should be finally ready for review. The remaining converter and integration test failures seem unrelated. |
This probably never worked as the information required to serialize the files was not there.
This is necessary to handle pagination properly in the UI.
This will allow to decouple the total matches from the actual returned items from pagination.
Replace unused tuple value with `_` Co-authored-by: John Davis <jdavcs@gmail.com>
8f8c885
to
2b2d9b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thank you for addressing all the comments!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks amazing! I'm glad it got into the release - very nice work.
Suspect IssuesThis pull request was deployed and Sentry observed the following issues:
Did you find this useful? React with a 👍 or 👎 |
Sentry, you are so cool! 😆 This seems like a user configuration error:
It probably should be a 400 instead of a 500 though. I wonder how we could discriminate better between those here 🤔 |
I'm not sure ... in any case eu isn't running 24.1 yet, so this PR is unrelated. If this is an admin config that is wrong it seems correct to raise that error, if this is a user than I agree we don't need to make this a 500 ... |
Ahh! You're right! I didn't notice it was on EU. I thought it was on test with someone testing the new user-defined file sources. Then this does look indeed like a typo or maybe the bucket was renamed. https://registry.opendata.aws/noaa-ufs-s2s/ |
Currently, when listing contents in a remote file source, we retrieve all possible files and directories inside, then perform client-side pagination and search. While this approach is enough for small sources with a limited number of files, larger repositories could potentially contain a vast number of files, making navigation within these sources very slow and difficult.
This addresses the issue by passing limit and offset parameters to paginate the contents. However, this approach comes with its own drawbacks. Specifically, we must carefully handle sorting, filtering, and searching server-side, which may prove challenging depending on the plugin implementation. Also, pagination is only applicable in a "non-recursive" view of the file source.
The pull request includes the following changes:
supports_pagination
. This property determines if the plugin can paginate the contents server-side so the client can delegate that.TempFileSource
, which simplifies testing of PyFilesystem2-based plugins. This plugin utilizes the built-in OS Filesystem. I wonder if we could even replace theposix
plugin with this... unless there are specific functionalities unique to theposix
plugin 🤔TempFileSource
plugin.supports_search
. This property determines if the plugin can filter results by a search query. The syntax of the query is up to the plugin implementation, but a basic "search by name" should be supported as a minimum.supports_sorting
. Most plugins (if any) won't support this, but it can signal the client to disable sorting in the UI when pagination is supported but server-side sorting is not, so the listings are always in sync.TODO
total_matches
as an additional return value when listing contents.How to test the changes?
License