Parallelise duckdb resulting in e.g. 2-4x speedup on 6 core machine #1796
Merged
Commits
19 commits
ed3d2de parallelise duckdb (RobinL)
9208d74 order by 1 to hint parallelisation in best place (RobinL)
ef8ec92 fix convergence test (RobinL)
a75af8d doens't seem to improve things (RobinL)
32669fb 4 partitions (RobinL)
6aeb659 works (RobinL)
0c096fa fix (RobinL)
1307c39 Merge pull request #1800 from moj-analytical-services/faster_duckdb_t… (RobinL)
90f6c43 scale salting by data (RobinL)
231d00f fix convergence test (RobinL)
398d1ba Merge branch 'master' into faster_duckdb (RobinL)
dec3078 Add duckdb salting based on max_pairs (RobinL)
1d8b64b Refactor _get_duckdb_salting to double the returned value (RobinL)
0c46720 revert change that doubled cpus. was only used for benchmarking (RobinL)
0b95034 Refactor blocking and prediction SQL queries (RobinL)
fc683ed Remove unnecessary blank line in SaltedBlockingRule class (RobinL)
0bca602 Update estimate_u.py: Import multiprocessing and remove unused function (RobinL)
7168154 Update cvv_hashed_tablename in test_correctness_of_convergence.py (RobinL)
cc3ac68 Update changelog (RobinL)
This is just a heuristic to make DuckDB parallelise more when the user asks for a bigger computation.
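For readers without the diff in front of them, a heuristic of this kind might look roughly like the sketch below. This is not the code from the PR: the function name `salting_partitions_sketch`, the 10^4 threshold, the doubling per order of magnitude, and the cap at four partitions per core are all assumptions chosen for illustration. The only things taken from the commit list are that salting is scaled by `max_pairs` and that `multiprocessing` is imported.

```python
import math
import multiprocessing


def salting_partitions_sketch(max_pairs, cpu_count=None):
    """Hypothetical heuristic: use more salting partitions for bigger jobs.

    All constants here are illustrative assumptions, not values from the PR.
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()

    # Assumed threshold: below roughly 10^4 pairs, extra partitions are
    # unlikely to pay for their overhead, so the magnitude is clamped to 0.
    magnitude = max(math.log10(max(max_pairs, 1)) - 4, 0)

    # Double the partition count for each further order of magnitude.
    partitions = math.ceil(2 ** magnitude)

    # Assumed cap: at most four partitions per available core.
    return max(1, min(partitions, 4 * cpu_count))
```

Under these assumptions a small request stays at a single partition, and the count only grows once the requested number of pairs is large enough for the extra partitions to keep more cores busy.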