Error when using -super5 with Muscle3D #85
Comments
# for up to ~10,000 structures
reseek -convert STRUCTS -bca structs.bca
reseek -pdb2mega structs.bca -output structs.mega
reseek -distmx structs.bca -output structs.distmx
muscle -super7 structs.mega -distmxin structs.distmx -reseek -output structs.afa

I haven't tried 90k structures. I think there's a good chance it will work, though the alignment might be better if you cluster the structures first. Reseek has an undocumented clustering command -- if you want to give that a try, let me know & I'll sketch out how to use it.

The usage message given by muscle does not explain this, and the documentation at the web site does not mention structure at all yet -- the documentation could certainly be improved here.

If you find the 90k alignment useful, I'd be interested to learn more. Maybe you could email me?
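For anyone who wants to rerun the four commands above as a unit, here is a minimal sketch of a wrapper. The commands and flags are copied from the comment; the function names `run` and `align_structures` and the `DRY_RUN` variable are hypothetical helpers for illustration, not part of reseek or muscle. It defaults to a dry run that only prints each command.

```shell
# Hypothetical helper: print a command and, unless DRY_RUN=1 (the default), execute it.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    return 0
  fi
  "$@"
}

# Hypothetical wrapper around the four steps quoted above.
# $1 = input structures (STRUCTS in the comment), $2 = output prefix.
align_structures() {
  structs="$1"
  prefix="$2"
  # Each step runs only if the previous one succeeded.
  run reseek -convert "$structs" -bca "$prefix.bca" &&
  run reseek -pdb2mega "$prefix.bca" -output "$prefix.mega" &&
  run reseek -distmx "$prefix.bca" -output "$prefix.distmx" &&
  run muscle -super7 "$prefix.mega" -distmxin "$prefix.distmx" -reseek -output "$prefix.afa"
}
```

Calling `align_structures STRUCTS structs` prints the four commands; setting `DRY_RUN=0` first would actually execute them, stopping at the first failing step.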
Hey @rcedgar, thank you so much for the super-fast reply! Sorry, I completely missed that part of the README. I think I was too excited to try it out and jumped straight to launching it! I will try again and let you know if it works.

For now, I'm interested in having the 90k structure-based alignment to compare it with a sequence-based alignment. If that doesn't work, I will try reseek clustering first. I already compared FoldMason vs Muscle3D on a smaller set, and Muscle worked way better for me.

I will definitely send you an email with more information about the project in case you are interested!

Best,
Hi @rcedgar I tried again following these commands:
And I get the following error: ---Fatal error---
This means that the structures are very divergent, I think it's unlikely you will be able to make a meaningful MStA here. Happy to discuss further if you send me an email.
Hi @rcedgar, I've checked and the distance matrix I've sent you a mail with some more info!
This seems to be an issue with the conda build, close as resolved?
Hi Robert!
First of all, thank you so much for the time you dedicate to software development and to making bioinformaticians' lives easier!
I'm trying to align around 90k PDBs from AlphaFold using Muscle3D, running these commands on a machine with 2TB of RAM and 128 threads.
reseek -pdb2mega second_round/ -output second_round.mega && muscle -super5 second_round.mega -output second_round.afa
However, I always get the following error:
Mega::GetProfileByLabel(Cluster2)
with different cluster numbers depending on the run.

Is -super5 compatible with Muscle3D? When I run -align with >1k sequences, I get the warning ">1k sequences, may be slow or use excessive memory, consider using -super5".

I also tried with smaller alignments (~200 PDBs) and I get the same error. When I try with -align instead of -super5, everything works fine!

Is it advisable to run Muscle3D with >90k sequences without the -super5 command? What would be the best strategy for this?

Thanks for your time,
Mario
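A small reproducer like the ~200-PDB test mentioned above could be built along these lines. This is only a sketch: the function name `make_subset` and the directory names are hypothetical, and it assumes the structure files end in `.pdb` and have no newlines in their names.

```shell
# Hypothetical helper: copy the first N *.pdb files (sorted by name)
# from SRC_DIR into DEST_DIR so the failing run can be reproduced on a small set.
# Usage: make_subset SRC_DIR DEST_DIR N
make_subset() {
  src="$1"
  dest="$2"
  n="$3"
  mkdir -p "$dest" || return 1
  # Sorting keeps the chosen subset identical between runs.
  find "$src" -maxdepth 1 -type f -name '*.pdb' | sort | head -n "$n" |
  while IFS= read -r f; do
    cp "$f" "$dest/" || exit 1
  done
}
```

For example, after `make_subset second_round subset 200`, the same two commands from the post can be pointed at the subset: `reseek -pdb2mega subset/ -output subset.mega && muscle -super5 subset.mega -output subset.afa`.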