Migration path from v1.0.2.1 to v1.9.4.1 #31

Closed
michael-imbeault opened this issue Jul 2, 2020 · 7 comments

@michael-imbeault

I have old scripts using MOODS that I need to update, and I noticed that the MOODS API changed significantly between v1 and v1.9. However, documentation for the new version is almost nonexistent.

I was using only 2 functions:
MOODS.max_score()

and

MOODS.search(sequence, matrix_list, threshold_list, q=q, bg=bg, absolute_threshold=True, both_strands=True)

What would be the equivalent calls for v1.9?

@michael-imbeault
Author

For reference, the Python docs for version 1.0 were very complete. Is there something equivalent for v1.9?

https://www.cs.helsinki.fi/group/pssmfind/doc/python/MOODS.html

@jhkorhonen
Owner

The documentation is on the GitHub wiki: https://github.com/jhkorhonen/MOODS/wiki

See also the scripts/ folder for examples of how to use the current interface. In particular, ex-basic-usage.py should cover everything you need. (Note that there is no equivalent of the both_strands=True option, so you'll need to handle this by including reverse complement matrices in the scan, as shown in the example.)

@michael-imbeault
Author

Yes, I had found the documentation. However, a lot of sections are 'in construction' and there's no clear Python API described, so I had to look at the C++ code to see what parameters the functions take. Thanks for linking the wiki directly. It would help to link https://github.com/jhkorhonen/MOODS/wiki/Function-reference from the readme on GitHub; right now there are only links to 'getting started' and 'installation', and I didn't see this section.

As for the both_strands scanning, how was 1.0 handling it? Was it just reverse complementing the matrices internally, or somehow scanning both strands at once? How does the performance of 1.9 compare to 1.0 in that case (the same, or slower)?

Also, it would have been nice if no API breakage had occurred. I have no problem with functions now being in submodules, but losing parameters and functionality (specifically both_strands=True) will make this upgrade a bit more complicated than I anticipated.

@michael-imbeault
Author

michael-imbeault commented Jul 6, 2020

Just to be clear about where my confusion is coming from: issues such as #29 seemingly imply that there is integrated reverse complement search functionality, and so does the basic usage example:

https://github.com/jhkorhonen/MOODS/blob/master/python/scripts/ex-basic-usage.py

#separate reverse complements and the non-reverse complements
fr = results[:len(matrix_names)]
rr = results[len(matrix_names):]

#mix the results together, use + and - to indicate strand
results = [ [(r.pos, r.score, '+') for r in fr[i]] + [(r.pos, r.score, '-') for r in rr[i]] for i in xrange(len(matrix_names))]

@michael-imbeault
Author

OK, I had a look at the 1.0 code. both_strands is nothing fancy, so I can write a wrapper to do the same. I'm still puzzled why this was left out as an option, since it's so simple and handy. I'm also puzzled by the inconsistent docs (I think there's a mix of information between versions).

@jhkorhonen
Owner

Yes, the complement strand scanning is just adding the reverse complement matrices to the matrix set, and this was the implementation in 1.0 also. The moods-dna.py script/command line tool has various options for this, which is what the issues are referring to.
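Editor's sketch of the approach described above, in plain Python with no MOODS dependency. It only illustrates the idea of appending reverse complement matrices to the matrix set and then splitting the results by strand. The names `rc_matrix`, `naive_scan` and `scan_both_strands` are hypothetical helpers, not MOODS functions; the sketch assumes matrices are lists of four rows in A, C, G, T order, that sequences contain only uppercase A/C/G/T, and that the same absolute threshold is reused for both strands.

```python
# Illustration only: hypothetical helpers, NOT part of the MOODS API.
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def rc_matrix(matrix):
    """Reverse complement a 4 x m matrix whose rows are ordered A, C, G, T.

    Reversing the row order swaps A<->T and C<->G; reversing each row
    reverses the motif positions.
    """
    return [row[::-1] for row in matrix[::-1]]


def naive_scan(seq, matrix, threshold):
    """Return (position, score) for every window scoring >= threshold."""
    m = len(matrix[0])
    hits = []
    for pos in range(len(seq) - m + 1):
        score = sum(matrix[BASE_INDEX[seq[pos + j]]][j] for j in range(m))
        if score >= threshold:
            hits.append((pos, score))
    return hits


def scan_both_strands(seq, matrices, thresholds):
    """Scan with each matrix and its reverse complement, tagging the strand."""
    n = len(matrices)
    # Append the reverse complement matrices to the matrix set.
    all_matrices = matrices + [rc_matrix(m) for m in matrices]
    # Assumption: absolute thresholds, so the same value applies to both strands.
    all_thresholds = thresholds + thresholds
    results = [naive_scan(seq, m, t) for m, t in zip(all_matrices, all_thresholds)]
    forward, reverse = results[:n], results[n:]
    return [
        [(p, s, "+") for p, s in forward[i]] + [(p, s, "-") for p, s in reverse[i]]
        for i in range(n)
    ]
```

With a real scanner, `naive_scan` would be replaced by the library's scanning call; the concatenate-then-split bookkeeping stays the same.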

The 1.0 API was a mess, and some API breakage was necessary anyway, so I decided to just rip off the band-aid and build a saner API from the ground up back when I did the big 1.9 update. The lack of a search equivalent is certainly unfortunate, but doing it correctly is surprisingly non-trivial. The main concerns are that thresholds can be different on the reverse complement strand depending on your scoring method, and that maintaining search equivalents for all the different ways to access the scanning machinery is a lot of code bloat.
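Editor's sketch of one way the threshold concern above can arise, offered as an assumption rather than as a description of MOODS internals. If thresholds are derived from a p-value under a background distribution that is not strand-symmetric (for example, A and T have different frequencies), the reverse complement matrix can get a different threshold than the original, even though the maximum score is the same for both. The helpers `rc_matrix`, `column_max_score` and `brute_force_threshold` are hypothetical names; the threshold is found by enumerating all words, which is only practical for very short matrices.

```python
# Illustration only: hypothetical helpers, NOT part of the MOODS API.
from itertools import product


def rc_matrix(matrix):
    """Reverse complement a 4 x m matrix whose rows are ordered A, C, G, T."""
    return [row[::-1] for row in matrix[::-1]]


def column_max_score(matrix):
    """Best achievable score: the sum of the per-column maxima."""
    return sum(max(col) for col in zip(*matrix))


def brute_force_threshold(matrix, bg, p):
    """Lowest score t with P(score >= t) <= p under an i.i.d. background.

    bg lists the probabilities of A, C, G, T. Returns None if even the
    maximum score is more likely than p.
    """
    m = len(matrix[0])
    dist = {}
    for word in product(range(4), repeat=m):
        score = sum(matrix[a][j] for j, a in enumerate(word))
        prob = 1.0
        for a in word:
            prob *= bg[a]
        dist[score] = dist.get(score, 0.0) + prob
    tail, best = 0.0, None
    for score in sorted(dist, reverse=True):
        tail += dist[score]
        if tail > p:
            break
        best = score
    return best


# A toy matrix that rewards "AA", and a background where A is common and T is rare.
motif = [[1, 1], [0, 0], [0, 0], [0, 0]]
bg = [0.4, 0.3, 0.2, 0.1]

fwd_threshold = brute_force_threshold(motif, bg, 0.2)
rev_threshold = brute_force_threshold(rc_matrix(motif), bg, 0.2)
```

In this toy case the maximum score is identical on both strands, so an absolute threshold can be shared, while the p-value thresholds differ because "TT" is much rarer than "AA" under the chosen background.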

Sadly I've since moved away from bioinformatics, so finishing the documents and other MOODS stuff gets buried under everything else very easily...

@michael-imbeault
Author

No worries, I got everything I needed from looking at the code :) Great library with amazing performance!
