Steve Bond edited this page Jun 2, 2016 · 40 revisions

___

## A friend to help manage your columns and rows

AlignBuddy is a command line program and Python3 API for quickly and easily reading, writing, analyzing, and manipulating alignment files in common formats including NEXUS, PHYLIP, and Stockholm. The emphasis is on simplicity and interoperability: formats are automatically detected, and input can be file paths, handles, or pipes. The AlignBuddy tools fall broadly into two classes: tools that manipulate your data and return a new alignment, and tools that perform some analysis and return a non-alignment result. Each tool currently implemented in the command line UI is documented in these wiki pages, with use cases that demonstrate the tool in action. The flags are meant to be intuitive, and the number of positional arguments has been kept to a minimum to make the learning curve as shallow as possible.

## Command line sequence manipulation tools

### Functions

| Function | Flag | Parameters | Brief Description |
|----------|------|------------|-------------------|
| alignment_lengths | `-al` | None | Returns a list of alignment lengths |
| back_transcribe | `-r2d` | None | Convert RNA alignments to DNA |
| bootstrap | `-bts` | `[number bootstraps (int)]` | Create bootstrap alignment(s) |
| clean_seq | `-cs` | `['strict'][replacement char]` | Strip out ambiguous and/or non-sequence characters |
| concat_alignments | `-cta` | `<regex>` | Concatenate two or more alignments by splitting and matching the sequence identifiers |
| consensus | `-con` | None | Condense alignments into simple majority-rule consensus sequences |
| delete_records | `-dr` | `<regex> [regex] [columns (int)]` | Remove selected rows from alignments |
| enforce_triplets | `-et` | None | Shift all gaps so that all sequences are organized by triplet |
| extract_regions | `-er` | `<positions (str)> [positions] ...` | Pull out sub-sequences |
| generate_alignment | `-ga` | `[align_tool] [args]` | Create a new alignment from unaligned sequences |
| hash_ids | `-hi` | `[hash length (int)]` | Rename all identifiers to random hashes |
| list_ids | `-li` | `[columns (int)]` | Output list of all sequence identifiers |
| lowercase | `-lc` | None | Convert all sequences to lowercase |
| num_seqs | `-ns` | None | Count sequences in each alignment |
| order_ids | `-oi` | `['rev']` | Sort sequences by ID in alpha-numeric order (specify 'rev' to reverse order) |
| pull_records | `-pr` | `<regex> [regex] ['full']` | Extract selected rows from alignments |
| rename_ids | `-ri` | `<regex> <subst string> [num]` | Replace some pattern in IDs with something else |
| screw_formats | `-sf` | `<new format>` | Change the file format to something else |
| split_to_files | `-stf` | `<out dir> [prefix]` | Put each alignment into its own file |
| transcribe | `-d2r` | None | Convert DNA alignments to RNA |
| translate | `-tr` | None | Convert coding sequences into amino acids |
| trimal | `-trm` | `[{Threshold (int or float) \| "all" \| "clean" \| "gappyout"}]` | Trim away columns containing gaps based on a threshold value or one of the automatic methods (default=gappyout) |
| uppercase | `-uc` | None | Convert all sequences to uppercase |
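
To make the "simple majority-rule consensus" description of `-con` a little more concrete, here is a minimal Python sketch of the general idea. It is not AlignBuddy's own implementation: the function name `majority_consensus`, the strict more-than-half rule, and the `X` placeholder for columns without a majority are all assumptions made for illustration.

```python
from collections import Counter


def majority_consensus(rows, ambiguous="X"):
    """Build a majority-rule consensus from equal-length aligned sequences.

    Hypothetical helper for illustration only; AlignBuddy may resolve ties,
    gaps, and ambiguity codes differently.
    """
    if not rows:
        return ""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*rows):  # walk the alignment column by column
        residue, count = Counter(column).most_common(1)[0]
        # Keep the residue only if it occurs in more than half of the rows
        # (assumed rule); otherwise emit the placeholder character.
        consensus.append(residue if count * 2 > len(column) else ambiguous)
    return "".join(consensus)


example = majority_consensus(["ACGT-", "ACGA-", "ATGAC"])
```

Gap characters are treated like any other symbol here, so a column that is mostly gaps yields a gap in the consensus.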

### Modifying flags

| Flag | Brief Description |
|------|-------------------|
| `-f`, `--format` | Force read a specific BioPython format. This may allow you to use some of the tools on some formats not auto-read by AlignBuddy, but no promises. |
| `-i`, `--in_place` | Rewrites the FIRST input file with the final output. Be careful! |
| `-k`, `--keep_temp` | Specify a directory to store files produced by an alignment tool (for generate_alignment) |
| `-o`, `--out_format` | Specify the format you want the output returned in. |
| `-q`, `--quiet` | Suppress stderr messages. |
| `-t`, `--test` | Run the function and return any stderr/stdout other than the alignment |
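
As a rough illustration of how a function flag and the modifying flags might be combined when scripting AlignBuddy, the sketch below assembles an argument list in Python. The executable name `alignbuddy`, the ordering (input files first, then flags), the file name `alignment.phy`, and the helper `build_command` are assumptions for the example; only the flags themselves come from the tables above.

```python
import shlex


def build_command(inputs, function_flag, params=(), out_format=None,
                  quiet=False, executable="alignbuddy"):
    """Assemble an argument list for a single AlignBuddy call.

    Hypothetical helper: the executable name and the argument order are
    assumptions, not something this page specifies.
    """
    cmd = [executable, *inputs, function_flag, *(str(p) for p in params)]
    if out_format:
        cmd += ["-o", out_format]  # -o/--out_format: choose the output format
    if quiet:
        cmd.append("-q")           # -q/--quiet: suppress stderr messages
    return cmd


# Rename IDs matching a regex, and ask for NEXUS output without stderr chatter.
cmd = build_command(["alignment.phy"], "-ri", ["^Mle", "Mlei"],
                    out_format="nexus", quiet=True)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)` once AlignBuddy is installed. Passing a list rather than a single string avoids shell-quoting problems with regex parameters.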

