PyCon Sprint
Welcome to the BuddySuite sprint page for PyCon 2016. BuddySuite is a collection of four distinct modules designed to be 'one-stop-shop' command line tools for common biological data file manipulations (written in Python 3). To get an overview of the functionality, I recommend browsing the individual wikis for a couple of the tools (SeqBuddy, AlignBuddy, PhyloBuddy, and DatabaseBuddy).
I'm gearing this sprint to be beginner friendly, so even if you're not a computational biologist there will be plenty to sink your teeth into. Specific tasks are outlined below, but the primary goal will be to polish up the development branch for release of V1.2 by the end of PyCon.
Skill level: Beginner A great amount of care has been taken to build a comprehensive wiki for the BuddySuite, but documentation can almost always be made better!
Skill level: Beginner
Skill level: Intermediate (probably?) This is actually something I've considered doing for quite a while, but have been putting it off for the stupid reason that I just haven't spent the time to learn how to do it. It would be awesome if people could simply run 'pip install buddysuite', so if you have experience managing software on PyPI, I would love the help on this task.
Skill level: Intermediate BuddySuite already talks to a server to report crashes and usage statistics (opt-in functionality), but it does not currently inform the user if their version is out of date.
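As a starting point for the version check, here is a minimal sketch of the comparison step only. The function names are hypothetical and not part of BuddySuite, and the sketch assumes plain dotted numeric version strings (such as '1.2' or 'V1.2.0'). How the latest version number is fetched from the server is left out.

```python
def parse_version(version):
    """Turn a dotted version string like 'V1.2.0' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().lstrip("vV").split("."))


def is_out_of_date(installed, latest):
    """Return True if the installed version is older than the latest release."""
    installed_parts = parse_version(installed)
    latest_parts = parse_version(latest)
    # Pad the shorter tuple with zeros so '1.2' and '1.2.0' compare as equal
    width = max(len(installed_parts), len(latest_parts))
    installed_parts += (0,) * (width - len(installed_parts))
    latest_parts += (0,) * (width - len(latest_parts))
    return installed_parts < latest_parts


def version_notice(installed, latest):
    """Return a message for the user, or an empty string if up to date."""
    if is_out_of_date(installed, latest):
        return "BuddySuite %s is available (you have %s)." % (latest, installed)
    return ""
```

Comparing tuples of integers rather than raw strings avoids the classic mistake of ranking '1.10' below '1.9'.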
Skill level: Intermediate/Advanced This is high priority and I plan on closing it out before the end of the sprint. The Suite has good unit test coverage, but there are a few tricky places that need monkey patching and many places that need to be switched to a new resource management class. Details for unit tests can be found here.
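For anyone new to monkey patching, the general idea can be sketched with the standard library's unittest.mock. The helper 'ask_yes_no' is a hypothetical stand-in, not a BuddySuite function. The test temporarily replaces the built-in input() so that no one has to type at a prompt while the tests run.

```python
from unittest import mock


def ask_yes_no(prompt):
    """Hypothetical helper: return True if the user answers yes."""
    return input(prompt).strip().lower() in ("y", "yes")


def test_ask_yes_no():
    # Swap out input() for the duration of each 'with' block only
    with mock.patch("builtins.input", return_value="Yes "):
        assert ask_yes_no("Continue? ") is True
    with mock.patch("builtins.input", return_value="n"):
        assert ask_yes_no("Continue? ") is False
```

The original input() is restored automatically when each 'with' block exits, so the patch cannot leak into other tests.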
Skill level: Beginner/Intermediate I'm toying with the idea of breaking a PEP8 rule by moving certain imports into certain functions... Some heavy modules are required by a limited number of functions, which is slooooowwwing down execution time of the whole program. This will require some systematic analysis of every function to determine which modules are required where, and then some benchmarking. I'm also open to fancier ideas if anyone has them.
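A rough sketch of the idea, using the standard library 'statistics' module as a stand-in for a genuinely heavy dependency. All function names here are hypothetical. The timing helper gives only a crude estimate, because submodules that are already cached stay cached.

```python
import importlib
import sys
import time


def count_residues(sequence):
    """A lightweight function that needs no heavy module at all."""
    return len(sequence)


def mean_length(lengths):
    """Only this function pays the import cost, and only when it is called."""
    import statistics  # deferred import; stand-in for a heavy dependency
    return statistics.mean(lengths)


def time_import(module_name):
    """Roughly time an import of module_name after dropping it from the cache."""
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start
```

For a per-module breakdown of start-up cost, newer interpreters (Python 3.7 and later) also accept the '-X importtime' option.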
Skill level: Advanced This would be an ambitious task, and may require more-than-passing familiarity with BioPython. SeqBuddy currently handles a decent number of file formats, but chokes on anything that involves per-residue annotation. If anyone thinks they may be interested in looking at this, please get in touch with me ahead of time (I'll do some extra prep-work to maximize the impact of your valuable time).
This project has been written in Python3 and is not backwards compatible with Python2. If Python3 is not currently installed on your system, I highly recommend using the free Anaconda manager from Continuum Analytics (if you experience any difficulty, click here). Alternatively, the software can be downloaded directly from the Python Software Foundation.
AlignBuddy and PhyloBuddy can be used to launch a number of third party alignment and tree building programs, but installation of these optional programs is up to you. For example, if you wish to use PhyloBuddy to build a phylogenetic tree with RAxML, you will first need to get RAxML into your system PATH.
All other dependencies come prepackaged with the installer, so you only need to worry about the following if you are using the unstable workshop version of BuddySuite.
The SeqBuddy blast, bl2seq, and purge functions require access to the blastp, blastn, and blastdbcmd binaries from the NCBI C++ toolkit. If they are not already in your PATH, SeqBuddy.py will attempt to download the binaries when any BLAST-dependent function is called. BioPython is used heavily by the entire suite; any version earlier than 1.66 will cause unit tests to fail. PhyloBuddy requires DendroPy and version 3.0 (beta) of the ETE toolkit.
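To see which of these external programs are already reachable from your PATH, something like the following sketch can help. It uses shutil.which from the standard library. The 'missing_programs' helper is hypothetical and is not how SeqBuddy itself performs the check.

```python
import shutil


def missing_programs(names):
    """Return the program names that cannot be found in PATH."""
    return [name for name in names if shutil.which(name) is None]


blast_binaries = ["blastp", "blastn", "blastdbcmd"]
missing = missing_programs(blast_binaries)
if missing:
    print("Not in PATH: " + ", ".join(missing))
else:
    print("All BLAST binaries found.")
```

The same check works for the optional third party tools: pass in the executable name that your particular installation uses.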
The installer will only run on Mac and Linux. If you would like to try the BuddySuite on Windows, you will need to install the development version (see below).
Download the graphical installer and run it from the command line:
$: cd /path/to/download/folder
$: chmod +x BuddySuite.py
$: ./BuddySuite.py
By default, the installer will create short-form symbolic links for the main tools in your PATH ('sb' for SeqBuddy, 'alb' for AlignBuddy, 'pb' for PhyloBuddy, and 'db' for DatabaseBuddy), so they can be accessed quickly (examples in the wiki use these short forms). The full names of each tool will also be added to PATH. If working outside the context of a graphical OS (on a cluster, for example), the installer will run in command-line mode (also accessible with the -cmd flag on graphical systems, if you prefer that).
Once the BuddySuite moves out of beta, the installer will only bundle stable release versions of the BuddySuite. If bugs are found they will be fixed, but the expected behavior will not be changed once the release is finalized. Likewise, new features added to the development versions will not become available in the installer until the next release. Versions of each tool or the installer can be displayed using the -v flag.
Once installed, you can access the modules from the command line using their full names:
$: SeqBuddy -h
Or the shortcuts created by the installer:
$: sb -h
For a detailed breakdown of the tools available within each module, check out the BuddySuite wiki.
The easiest way to get the development version up and running is to clone/fork the repository.
$: git clone https://github.com/biologyguy/BuddySuite.git
Then move into the repo and switch to the development branch (named 'develop'):
$: cd BuddySuite
$: git checkout develop
All of the individual Buddy toolkits are located in the 'workshop' directory. The 'develop' branch is where all new features are created and tested, so things may be less stable here; it's usually pretty solid though. If you're interested in contributing to the project, please ensure you are working from this branch.
See the developer page for further information on development version dependencies and how to contribute to the project.
This project has been maturing for about a year and a half now, and is in a very usable state.
- Migrate all SeqBuddy tests to the new resources class.
- Add function to check for new version, download, and install.