
Parallel processing pr #290

Merged
timonmerk merged 9 commits into main from parallel_processing_pr on Jan 27, 2024
Conversation

timonmerk (Contributor)

Adding joblib feature parallelization.

  • Added a check in nm_stream_offline for whether parallelization can be performed
  • Changed a class attribute to an instance attribute in nm_mne_connectivity
  • Added joblib as a dependency
  • Added tests verifying that sequential and parallel computation yield equal results
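A minimal sketch of what joblib-based batch parallelization looks like in general (`run_batches_parallel` and `compute_features` are illustrative names, not the actual nm_stream_offline API); this is only valid when batches are independent of each other:

```python
from joblib import Parallel, delayed


def run_batches_parallel(batches, compute_features, n_jobs=-1):
    # Each batch is processed independently; this is only correct when no
    # feature computation depends on the results of other batches.
    return Parallel(n_jobs=n_jobs)(
        delayed(compute_features)(batch) for batch in batches
    )
```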

@timonmerk (Contributor, Author)

@toni-neurosc I worked on a different branch where I merged some of your commits from #281.

Thanks for adapting the PR!
The model.predict method is a basic (not really tested) way to load an ML model and directly get predictions that could be used for real-time applications. It was, however, badly implemented and untested, so I removed it for now.

The real-time computation is also one of the reasons why I put the computation in this somewhat strange while True loop, so that a datastream could replace the nm_generator and constantly feed data until a "stop signal" is reached.
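Such a loop could be sketched roughly as follows (`run_stream` and `process_batch` are hypothetical names, and `None` stands in for the stop signal; the actual implementation lives in the stream classes):

```python
def run_stream(generator, process_batch):
    # Pull batches from a (potentially endless) datastream until a stop
    # signal is reached; here, the generator being exhausted / yielding None.
    results = []
    while True:
        batch = next(generator, None)
        if batch is None:  # hypothetical "stop signal"
            break
        results.append(process_batch(batch))
    return results
```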

Optimally, this would be replaced by an mne-lsl Stream at some point (#278, https://mne.tools/mne-lsl/stable/generated/tutorials/00_stream.html#sphx-glr-generated-tutorials-00-stream-py).

Regarding the parallelization implementation, I found that there are issues with parallelizing certain feature-estimation and post-processing methods. Normalization, for example, is currently implemented sequentially, which means that batches need to be computed iteratively. The same is true for bursts, since there a time window (of e.g. 30 s) also has to be defined and then processed iteratively.
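To illustrate why normalization blocks batch-level parallelism, here is a sketch (not the project's actual normalization code) of z-scoring each batch against the statistics of all data seen so far: batch k cannot be normalized before batches 0..k-1 have been processed, so the loop is inherently sequential:

```python
import numpy as np


def normalize_batches_sequential(batches, eps=1e-8):
    # The running mean/std depend on every previous batch, which is what
    # makes this step order-dependent and unsuitable for naive parallelism.
    history = []
    out = []
    for batch in batches:
        history.append(batch)
        past = np.concatenate(history)
        out.append((batch - past.mean()) / (past.std() + eps))
    return out
```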

I have now added a check for those methods in nm_stream_offline that raises an exception if they are selected while parallel is set to True.
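A rough sketch of what such a guard could look like (the setting names, `SEQUENTIAL_ONLY` list, and function signature are made up for illustration and do not reflect the actual nm_stream_offline code):

```python
# Hypothetical names of methods that require sequential batch processing.
SEQUENTIAL_ONLY = ("feature_normalization", "bursts")


def check_parallel_settings(settings: dict, parallel: bool) -> None:
    # Reject configurations that combine order-dependent methods with
    # parallel batch processing.
    if not parallel:
        return
    enabled = [m for m in SEQUENTIAL_ONLY if settings.get(m, False)]
    if enabled:
        raise ValueError(
            f"Methods {enabled} require sequential batch processing "
            "and cannot be used with parallel=True."
        )
```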

Personally, I am also somewhat more in favor of using joblib across operating systems, which simplifies the code a bit, so for now I removed the OS check.

Thanks again! Such a great addition to the tool.

@timonmerk timonmerk merged commit dcb981b into main Jan 27, 2024
4 checks passed
@toni-neurosc (Collaborator)

Hi Timon! Thanks for your comments, those are some good points that I did not consider, partly because the parallelization is one of the first things I tried to do, and I did not yet have a proper full-picture view of the software (I still don't).

I indeed noticed the issue with parallelization while working on nm_bursts, since the circular buffer persists across batches and is used to update the percentile. It is, however, possible to maintain the parallelization if the buffer update and the percentile calculation are done sequentially, and the processing of the batch is then submitted to the worker pool with a frozen copy of the percentile. This logic could extend to normalization, but I'd have to check.
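The frozen-copy idea could be sketched roughly like this (`detect_bursts` and the buffer handling are illustrative, not the nm_bursts implementation): the buffer and percentile are updated sequentially in the main process, and only the frozen scalar threshold is captured per job before the independent jobs are dispatched:

```python
from collections import deque

import numpy as np
from joblib import Parallel, delayed


def detect_bursts(batch, threshold):
    # Illustrative per-batch computation against a frozen scalar threshold.
    return batch > threshold


def process_with_frozen_threshold(batches, buffer_len=1000, q=75, n_jobs=2):
    buffer = deque(maxlen=buffer_len)
    jobs = []
    for batch in batches:
        buffer.extend(batch)                          # sequential buffer update
        thr = float(np.percentile(list(buffer), q))   # frozen copy of the percentile
        jobs.append(delayed(detect_bursts)(batch, thr))
    # Only now are the independent per-batch jobs handed to the worker pool.
    return Parallel(n_jobs=n_jobs)(jobs)
```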

Of course this requires a bit more low-level knowledge of how the sub-processes are spawned, which data is copied between processes, and which data merely references other blocks of memory that could change during execution. For example, if a class attribute self.burst_thr is calculated and stored, and then the job is submitted, will that burst_thr be copied into the new process, or will updating it when the next batch arrives affect already-running processes? I need to research this or write a test case for it.
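One way to probe this with the standard library (using `ProcessPoolExecutor` rather than joblib's internals, and an explicit `dict` copy as the "frozen" snapshot, so there is no race on when pickling happens): the child process only ever sees the copy made at submission time, so mutating the parent's state afterwards has no effect.

```python
import time
from concurrent.futures import ProcessPoolExecutor

state = {"burst_thr": 1.0}


def read_threshold_later(snapshot):
    # Runs in a child process; `snapshot` was copied (pickled) when the job
    # was submitted, so the parent's later mutation of `state` is invisible.
    time.sleep(0.3)
    return snapshot["burst_thr"]


def frozen_copy_experiment():
    with ProcessPoolExecutor(max_workers=1) as pool:
        future = pool.submit(read_threshold_later, dict(state))  # frozen copy
        time.sleep(0.1)
        state["burst_thr"] = 99.0  # mutated after submission
        return future.result()
```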

@timonmerk timonmerk deleted the parallel_processing_pr branch February 3, 2024 12:12