`sparse_mean_variance_axis` now uses all cores #3015

Intron7 · 2024-04-19T08:22:06Z

This function now uses all cores for mean & var calculations.

codecov · 2024-04-19T08:39:25Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.27%. Comparing base (ee8505b) to head (d982936).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3015      +/-   ##
==========================================
+ Coverage   75.53%   76.27%   +0.74%     
==========================================
  Files         117      117              
  Lines       12950    12795     -155     
==========================================
- Hits         9782     9760      -22     
+ Misses       3168     3035     -133

| Files | Coverage Δ | |
|---|---|---|
| scanpy/preprocessing/_utils.py | `97.36% <100.00%> (+42.70%)` | ⬆️ |

... and 4 files with indirect coverage changes

flying-sheep

Really cool!

Please use longer variable names. nthr→num_threads, s→sums_minor or so, s0→???, m0→???

Of course not for things like loop counters, but you know.

Also would be great to have comments, like “calculate sums and sum of squares along the minor axis” (if “minor” is correct here) then “go over minor axis again to calculate means and variances from the sums”.

PS: Benchmarks should be ready in a little bit, so you could add one to check how much this speeds things up! I’ll keep you posted
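To make the two suggested comments concrete, here is a minimal pure-Python sketch of the two-pass idea for the minor axis of a CSR matrix. It is not the kernel from this PR: the function names, the strided row split, and the use of population variance via E[x²] − mean² are all assumptions for illustration, and Python threads will not give the speedup that a parallel numba loop does.

```python
from concurrent.futures import ThreadPoolExecutor


def _partial_sums(data, indices, indptr, rows, n_minor):
    """Pass 1 for one chunk of rows: sums and sums of squares per minor index."""
    sums = [0.0] * n_minor
    squared_sums = [0.0] * n_minor
    for row in rows:
        for pos in range(indptr[row], indptr[row + 1]):
            col = indices[pos]
            value = data[pos]
            sums[col] += value
            squared_sums[col] += value * value
    return sums, squared_sums


def csr_mean_var_minor(data, indices, indptr, n_minor, num_threads=2):
    """Mean and population variance along the minor axis (hypothetical helper)."""
    n_major = len(indptr) - 1
    # Each worker gets its own strided set of rows and its own accumulators,
    # so no two workers ever write to the same array.
    chunks = [range(start, n_major, num_threads) for start in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(
            pool.map(
                lambda rows: _partial_sums(data, indices, indptr, rows, n_minor),
                chunks,
            )
        )
    # Reduce the per-worker accumulators into one total per minor index.
    sums = [sum(p[0][col] for p in partials) for col in range(n_minor)]
    squared_sums = [sum(p[1][col] for p in partials) for col in range(n_minor)]
    # Pass 2: go over the minor axis again to turn the sums into means and variances.
    # Implicit zeros count because we divide by n_major, not by the number of stored values.
    means = [total / n_major for total in sums]
    variances = [
        squared / n_major - mean * mean for squared, mean in zip(squared_sums, means)
    ]
    return means, variances


# 3x3 example matrix [[1, 0, 2], [0, 0, 3], [4, 0, 0]] in CSR form
means, variances = csr_mean_var_minor(
    data=[1.0, 2.0, 3.0, 4.0], indices=[0, 2, 2, 0], indptr=[0, 2, 3, 4], n_minor=3
)
```

Keeping one accumulator pair per worker and reducing afterwards avoids write conflicts between workers, at the cost of `num_threads * n_minor` extra memory.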

Intron7 · 2024-04-19T08:56:48Z

Some small benchmarks for 32 cores with CSR.shape=(196943, 20867):

| axis | old | new |
|---|---|---|
| minor | 804 ms | 96 ms |
| major | 520 ms | 40 ms |
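For reference, the speedup implied by these timings can be worked out directly. The snippet only restates the numbers above; the variable and function names are made up for illustration.

```python
# (old, new) wall-clock times in milliseconds, copied from the table above
timings_ms = {"minor": (804, 96), "major": (520, 40)}


def speedup(old_ms, new_ms):
    """How many times faster the new kernel is than the old one."""
    return old_ms / new_ms


speedups = {axis: speedup(old, new) for axis, (old, new) in timings_ms.items()}

for axis, factor in speedups.items():
    print(f"{axis}: {factor:.1f}x faster")
# minor: 8.4x faster
# major: 13.0x faster
```

On 32 cores that is well below a linear speedup, which is unsurprising for a kernel that mostly streams through memory.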

scverse-benchmark · 2024-04-19T10:26:21Z

Benchmark changes

| Change | Before [`ee8505b`] | After [`d982936`] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| - | 508±2ms | 31.9±1ms | 0.06 | preprocessing.SparseDenseSuite.time_mean_var('lung93k') |
| + | 1.09±0.04ms | 1.22±0.04ms | 1.12 | preprocessing.SparseDenseSuite.time_mean_var('pbmc68k_reduced') |
| + | 241M | 330M | 1.37 | preprocessing.peakmem_pca |
| + | 5.86±0.01ms | 6.86±0.03ms | 1.17 | preprocessing.time_calculate_qc_metrics |

Comparison: https://github.com/scverse/scanpy/compare/ee8505b1c1578af0c50defdb3cf64ec18713669e..d9829365d7e23aea5680990ea8570d0a384291d3
Last changed: Tue, 23 Apr 2024 09:27:27 +0000

More details: https://github.com/scverse/scanpy/pull/3015/checks?check_run_id=24144155267

flying-sheep · 2024-04-19T10:54:58Z

I tentatively added a benchmark that runs just on _get_mean_var.

Locally I don’t see any difference though, what’s wrong? Too small data? Numba not set up with correct number of threads?

/edit: also I think the machine is not sufficiently tuned. The original run (before I added the mean_var benchmarks) said “No changes in benchmarks.”
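One way to rule out the thread-count question is to print what Numba thinks it has available. This is a small sketch, not part of the benchmark suite: `numba.get_num_threads()` and `numba.config.NUMBA_NUM_THREADS` are Numba's own accessors, while the helper name `numba_thread_info` is hypothetical. It returns `None` when Numba is not installed.

```python
import os


def numba_thread_info():
    """Report the thread counts Numba would use, or None if Numba is missing."""
    try:
        import numba
    except ImportError:
        return None
    return {
        # threads currently used for parallel regions (can be lowered at runtime)
        "active_threads": numba.get_num_threads(),
        # upper bound fixed at import time, taken from the environment or CPU count
        "max_threads": numba.config.NUMBA_NUM_THREADS,
        # explicit override from the environment, if one is set
        "env_override": os.environ.get("NUMBA_NUM_THREADS"),
        "cpu_count": os.cpu_count(),
    }


info = numba_thread_info()
print(info)
```

If `active_threads` is already equal to the core count, the missing speedup more likely comes from the dataset being too small to amortize the threading overhead.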

flying-sheep

Looks good! Would be nice to add a sufficiently large dataset to the benchmarks to demonstrate the benefits.

Intron7 added 2 commits April 19, 2024 10:18

updates the kernels to work in parallel

fee92b9

update a remove a copy

528981b

Intron7 added this to the 1.10.2 milestone Apr 19, 2024

Intron7 added the Area – Performance 🐌 label Apr 19, 2024

Intron7 linked an issue Apr 19, 2024 that may be closed by this pull request

Update Preprocessing functions with numba #3011

Closed

Intron7 requested a review from ivirshup April 19, 2024 08:25

adds releasenote

316799b

flying-sheep requested changes Apr 19, 2024

Intron7 added 2 commits April 19, 2024 10:43

updates indptr

dc21f84

update names

c33d59c

Intron7 requested a review from flying-sheep April 19, 2024 08:51

Intron7 and others added 3 commits April 19, 2024 11:08

update n_threads

f117b82

Merge branch 'main' into _mean_var-sparse-mc

d9b638e

remove double R conv

3e8fdb4

flying-sheep added the benchmark label Apr 19, 2024

add sparse dense suite

9764f3e

exclude numba from coverage

41c97db

flying-sheep approved these changes Apr 19, 2024

Add big dataset

45f85ce

flying-sheep added benchmark and removed benchmark labels Apr 23, 2024

Merge branch 'main' into _mean_var-sparse-mc

d982936

flying-sheep merged commit a70582e into main Apr 23, 2024
15 checks passed

flying-sheep deleted the _mean_var-sparse-mc branch April 23, 2024 15:00

meeseeksmachine pushed a commit to meeseeksmachine/scanpy that referenced this pull request Apr 23, 2024

Backport PR scverse#3015: sparse_mean_variance_axis now uses all cores

13d6968

meeseeksmachine mentioned this pull request Apr 23, 2024

Backport PR #3015 on branch 1.10.x (sparse_mean_variance_axis now uses all cores) #3024

Merged

flying-sheep pushed a commit that referenced this pull request Apr 23, 2024

Backport PR #3015 on branch 1.10.x (sparse_mean_variance_axis now u…

eb077e3

…ses all cores) (#3024) Co-authored-by: Severin Dicks <37635888+Intron7@users.noreply.github.com>
