-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat: Use duckdb as backend instead of sqlite #59
Conversation
Resolves issue #58 |
It looks like duckdb isn't consistent about the ordering even when it's called the same way, I think that's going to cause bugs in peoples code. I'd suggest that we order the results before returning them. I also want to discuss scope of this a bit, but will continue that in #58 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I have a question about what the right way to order is, but we could probably merge this before figuring that out.
Do the bioconductor packages enforce a particular ordering?
@@ -295,7 +308,7 @@ def _build_query( | |||
else: | |||
query = self.db.table(table) | |||
# add filter | |||
query = query.filter(filter.convert()).select(cols) | |||
query = query.filter(filter.convert()).select(cols).order_by(cols) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nvictus does this ordering look reasonable? Or is there a specific thing we should be ordering by?
I'm going to merge this an open an issue to discuss ordering |
This PR adds a
backend
(default: sqlite, optional: duckdb). Also it forces ordering by columns requested in order.Benchmarking:
Running tests using DuckDB backend
pytest 40.40s user 7.76s system 196% cpu 24.544 total
Running tests using Sqlite3
pytest 39.02s user 10.77s system 97% cpu 51.043 total
DuckDB is twice as fast because it used two cores, but about the same number of cpu cycles.