Multiprocessing on moods_dna.py #13

Open
vmesel opened this issue Feb 8, 2017 · 3 comments

@vmesel

vmesel commented Feb 8, 2017

I'm running MOODS to search for multiple motifs across 20304 sequences, and I would like it to run faster. Is there any way to enable multiprocessing in moods_dna.py?

@jhkorhonen
Owner

No, sorry, you'll currently have to run parallel instances of moods_dna.py manually if you want to use multiple cores.

However, first make sure that you are using the --batch option. This will have a dramatic effect on performance with lots of sequences. (See also what --help says about the --bg option.)

@vmesel
Author

vmesel commented Feb 8, 2017

--batch made it run faster, but I still need more speed. Running one motif on each core should solve the problem for multiple motifs, but I would need to adapt all my code! Are there no future plans to parallelize this pipeline?

@jhkorhonen
Owner

Sorry, I will not have time to do that in the near future.

It's likely more efficient to parallelise in terms of sequences, given what MOODS does. I would just suggest splitting your sequence set into M equal-size sets, where M is the number of cores, and calling moods_dna.py in parallel for each of the M sequence sets, using the full motif set each time.
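
For anyone landing here later, the split-and-run approach described above could be sketched roughly as follows. This is only an illustration and not part of MOODS: the helper names (`read_fasta`, `split_records`, `write_chunks`, `run_parallel`) are made up for this example, the input is assumed to be a plain FASTA file, and the sequences are split by count rather than by total length. The exact moods_dna.py command line is left to the caller, since it should match whatever single-process invocation is already in use.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records, header, parts = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(parts)))
                header, parts = line, []
            elif line:
                parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_records(records, m):
    """Deal records round-robin into m lists whose sizes differ by at most one."""
    return [records[i::m] for i in range(m)]


def write_chunks(chunks, out_dir, prefix="chunk"):
    """Write each non-empty chunk to its own FASTA file and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(chunks):
        if not chunk:
            continue
        path = out_dir / f"{prefix}_{i}.fasta"
        with open(path, "w") as handle:
            for header, seq in chunk:
                handle.write(f"{header}\n{seq}\n")
        paths.append(path)
    return paths


def run_parallel(commands, workers):
    """Run each command (a list of arguments) as a subprocess, at most
    `workers` at a time, and return the exit codes in input order."""
    def run(cmd):
        return subprocess.run(cmd).returncode

    # Threads are enough here: each one only waits on an external process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))
```

To use it, build one command per chunk file, each being the usual moods_dna.py call (with --batch and the full motif set) pointed at that chunk and at its own output file, then pass the list to `run_parallel` with `workers` set to the number of cores. The per-chunk outputs can be concatenated afterwards.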
