Multiprocessing on moods_dna.py #13

vmesel · 2017-02-08T13:11:07Z

I'm running MOODS to search for multiple motifs in 20304 sequences, and I would like it to run faster. Is there any way to enable multiprocessing in moods_dna.py?

jhkorhonen · 2017-02-08T13:17:10Z

No, sorry, you'll currently have to run parallel instances of moods_dna.py manually if you want to use multiple cores.

However, first make sure that you are using the --batch option. This will have a dramatic effect on performance with lots of sequences. (See also what --help says about the --bg option.)

vmesel · 2017-02-08T13:22:31Z

--batch made it run faster, but I still need more speed. Running one motif on each core should solve the problem for multiple motifs, but I would need to adapt all my code! Are there any plans to parallelize this pipeline?

jhkorhonen · 2017-02-08T13:28:44Z

Sorry, I will not have time to do that in the near future.

It's likely more efficient to parallelise in terms of sequences, given what MOODS does. I would just suggest splitting your sequence set into M equal-size sets, where M is the number of cores, and calling moods_dna.py in parallel for each of the M sequence sets, using the full motif set each time.
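The splitting approach described above could be sketched roughly as follows. This is only an illustration, not part of MOODS: the helper names (`read_fasta`, `split_records`, `write_fasta`, `build_commands`, `run_in_parallel`) are hypothetical, the input is assumed to be a plain FASTA file, and the command-line flags used in `build_commands` (`-m`, `-s`, `-p`, `-o`, `--batch`) are assumptions that should be checked against the `--help` output of your MOODS version.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records, header, parts = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(parts)))
                header, parts = line, []
            elif line:
                parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_records(records, m):
    """Deal records round-robin into m groups; sizes differ by at most one."""
    return [records[i::m] for i in range(m)]


def write_fasta(records, path):
    """Write (header, sequence) tuples back out as FASTA."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(f"{header}\n{seq}\n")


def build_commands(script, motif_files, chunk_paths, p_value="0.0001"):
    """Build one command per sequence chunk, each using the full motif set.

    ASSUMPTION: the flags below match your moods_dna.py; verify with --help.
    """
    return [
        [sys.executable, script, "-m", *motif_files, "-s", path,
         "-p", p_value, "--batch", "-o", path + ".out"]
        for path in chunk_paths
    ]


def run_in_parallel(commands, workers):
    """Run each command as a separate process; return the exit codes in order."""
    # Threads are enough here: each one just waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

A typical use would be to call `read_fasta` on the full sequence file, pass the result to `split_records` with M set to the number of cores, write each group with `write_fasta`, and hand the resulting paths to `build_commands` and `run_in_parallel`. The per-chunk output files can then be concatenated. Splitting by sequence count only balances the load if the sequences have similar lengths.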

jhkorhonen added the question label Feb 8, 2017
