Parallel read and preprocess the data #371

xiki-tempula · 2024-05-29T19:30:40Z

Use joblib to parallelise the read and preprocess steps of the ABFE workflow.
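
A minimal sketch of the pattern described here, assuming each input file can be parsed independently of the others. `parse_file`, `read_all`, and the file names are hypothetical stand-ins rather than alchemlyb functions; only `joblib.Parallel` and `joblib.delayed` are real joblib API.

```python
import joblib


def parse_file(path):
    # Hypothetical stand-in for a per-file parser; it only records the path.
    return {"path": path, "n_chars": len(path)}


def read_all(paths, n_jobs=1):
    # One task per file; joblib returns results in the order of the inputs.
    return joblib.Parallel(n_jobs=n_jobs)(
        joblib.delayed(parse_file)(p) for p in paths
    )


paths = [f"dhdl_{i:02d}.xvg" for i in range(4)]  # illustrative names only
results = read_all(paths, n_jobs=2)
```

Because the results come back in input order, the downstream steps that expect one frame per lambda window should not need to re-sort them.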

codecov · 2024-05-29T20:04:23Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.78%. Comparing base (0093905) to head (c6cb745).
Report is 1 commit behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #371   +/-   ##
=======================================
  Coverage   98.78%   98.78%           
=======================================
  Files          28       28           
  Lines        1978     1982    +4     
  Branches      435      436    +1     
=======================================
+ Hits         1954     1958    +4     
  Misses          2        2           
  Partials       22       22

orbeckst

Overall this looks like a powerful new feature for the workflow. Do you have some simple performance benchmarks?

My primary concerns are (see comments):

Make sure that joblib is explicitly installed.
Is the new default n_jobs=-1 safe?
Add docs.

CHANGES

environment.yml

orbeckst · 2024-10-02T17:14:50Z

src/alchemlyb/tests/test_workflow_ABFE.py

 import pytest
 from alchemtest.amber import load_bace_example
 from alchemtest.gmx import load_ABFE
+from joblib import parallel_config


Maybe just import joblib so that it's clearer below what comes from joblib? That way, any unqualified functions and classes come from alchemlyb and everything else is external.

orbeckst · 2024-10-02T17:15:33Z

src/alchemlyb/tests/test_workflow_ABFE.py

+            suffix="xvg",
+            T=310,
+        )
+        with parallel_config(backend="threading"):


I find the qualified form clearer when quickly reading the code.

Suggested change

-        with parallel_config(backend="threading"):
+        with joblib.parallel_config(backend="threading"):

src/alchemlyb/workflows/abfe.py

orbeckst · 2024-10-02T17:16:51Z

src/alchemlyb/workflows/abfe.py

@@ -115,7 +116,7 @@ def __init__(
        else:
            raise NotImplementedError(f"{software} parser not found.")

-    def read(self, read_u_nk=True, read_dHdl=True):
+    def read(self, read_u_nk=True, read_dHdl=True, n_jobs=-1):


Is default -1 really always the best choice? Did you try on a machine with, say, 16 cores, and 16 hyperthreaded cores (or really anything with hyperthreads)?

I am willing to make -1 the default if it does not cause surprises for users. Otherwise the conservative default of 1 would be better, and users can then enable parallelism explicitly.

add versionchanged to docs

Add a paragraph about parallelization: how to enable it, what it does (for each file), any potential problems...
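
One way to check what `n_jobs=-1` would mean on a given machine is to ask joblib directly. This is only a sketch: it reports counts and says nothing about whether using every logical CPU is actually faster for this workflow.

```python
import joblib

# Logical CPUs visible to joblib (includes hyperthreads where present).
logical = joblib.cpu_count()

# Physical cores only, as far as joblib is able to detect them.
physical = joblib.cpu_count(only_physical_cores=True)

# Number of workers that n_jobs=-1 resolves to under the active backend.
resolved = joblib.effective_n_jobs(-1)

print(f"logical={logical} physical={physical} n_jobs=-1 -> {resolved}")
```

On a hyperthreaded machine the logical count is typically larger than the physical one, which is the case the question above is about.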

orbeckst · 2024-10-02T17:24:58Z

src/alchemlyb/workflows/abfe.py

@@ -201,6 +219,7 @@ def run(
        overlap="O_MBAR.pdf",
        breakdown=True,
        forwrev=None,
+        n_jobs=-1,


see above

Is -1 safe as new default?

add docs (explanation)

add versionchanged
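
A sketch of what the requested documentation could look like in a numpydoc-style docstring. The signature is simplified to a free function, the default follows the diff under review (not necessarily what was merged), and `X.Y.Z` is a placeholder version number, not a real release.

```python
def read(read_u_nk=True, read_dHdl=True, n_jobs=-1):
    """Read the input files (signature simplified for illustration).

    Parameters
    ----------
    read_u_nk : bool
        Whether to read the u_nk data.
    read_dHdl : bool
        Whether to read the dHdl data.
    n_jobs : int
        Number of parallel workers used to parse the files, one task per
        file. ``1`` runs serially; ``-1`` lets joblib use all available CPUs.

        .. versionchanged:: X.Y.Z
           Added the `n_jobs` keyword (placeholder version number).
    """
```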

orbeckst · 2024-10-02T17:25:44Z

src/alchemlyb/workflows/abfe.py

@@ -307,7 +329,7 @@ def update_units(self, units=None):
            logger.info(f"Set unit to {units}.")
            self.units = units or None

-    def preprocess(self, skiptime=0, uncorr="dE", threshold=50):
+    def preprocess(self, skiptime=0, uncorr="dE", threshold=50, n_jobs=-1):


see above

Is -1 safe as new default?

add docs (explanation)

add versionchanged

orbeckst

Thanks for addressing all my comments and especially the extensive docs.

xiki-tempula added 4 commits May 27, 2024 21:30

update

da36cff

update

5ed0ce5

update

d348e76

update

fda5ef7

xiki-tempula linked an issue May 29, 2024 that may be closed by this pull request

Speed up the read/preprocess in ABFE workflow #359

Closed

xiki-tempula and others added 5 commits May 29, 2024 20:30

Merge branch 'master' into 359-speed-up-the-readpreprocess-in-abfe-wo…

fd52924

…rkflow

update

0e089a6

Merge branch '359-speed-up-the-readpreprocess-in-abfe-workflow' of ht…

165994c

…tps://github.com/alchemistry/alchemlyb into 359-speed-up-the-readpreprocess-in-abfe-workflow

update

3fa533a

fix test

77eee59

xiki-tempula added 2 commits May 29, 2024 21:12

update

acae6b5

update

6b82e09

xiki-tempula marked this pull request as ready for review May 29, 2024 20:34

make test more clear

d5fe5dd

xiki-tempula requested a review from orbeckst May 29, 2024 20:43

xiki-tempula force-pushed the 359-speed-up-the-readpreprocess-in-abfe-workflow branch from 0f9ddae to d5fe5dd Compare June 3, 2024 09:03

fix type

54bb316

orbeckst added enhancement parsers preprocessors labels Sep 19, 2024

orbeckst requested changes Oct 2, 2024

orbeckst self-assigned this Oct 2, 2024

xiki-tempula added 2 commits October 5, 2024 14:05

Merge branch 'master' into 359-speed-up-the-readpreprocess-in-abfe-wo…

b36bf1e

…rkflow

update

c6cb745

xiki-tempula requested a review from orbeckst October 5, 2024 15:47

orbeckst approved these changes Oct 6, 2024

orbeckst merged commit 30ecce0 into master Oct 6, 2024
7 of 8 checks passed

orbeckst deleted the 359-speed-up-the-readpreprocess-in-abfe-workflow branch October 6, 2024 16:21
