AlignBuddy

___
## A friend to help manage your columns and rows
AlignBuddy is a command line program and Python3 API for quickly and easily reading, writing, analyzing, and manipulating alignment files in common formats, including NEXUS, PHYLIP, and Stockholm. The emphasis is on simplicity and interoperability: formats are detected automatically, and the input can be file paths, handles, or pipes.

The AlignBuddy tools can be broadly grouped into two classes: tools that manipulate your data and return a new alignment, and tools that perform some analysis and return a non-alignment result. Each tool currently implemented in the command line UI is documented in these wiki pages, with use cases that demonstrate it in action. The flags are meant to be easy to remember, and care has been taken to minimize the number of positional arguments so that the learning curve is as shallow as possible.
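
To make the idea of automatic format detection and flexible input concrete, here is a minimal sketch. It is not AlignBuddy's actual detection code: the helper names `read_text` and `guess_format` are hypothetical, and the heuristics only look at the first non-blank line of the input.

```python
import io
import os
import re


def read_text(source):
    """Return alignment text from an open handle, a file path, or raw text."""
    if hasattr(source, "read"):  # open file handle, or sys.stdin when piping
        return source.read()
    if isinstance(source, str) and os.path.isfile(source):
        with open(source, "r") as handle:
            return handle.read()
    return str(source)  # assumption: any other string is the alignment itself


def guess_format(text):
    """Guess the format from the first non-blank line (simplified heuristics)."""
    first = next((line.strip() for line in text.splitlines() if line.strip()), "")
    if first.upper().startswith("#NEXUS"):
        return "nexus"
    if first.startswith("# STOCKHOLM"):
        return "stockholm"
    if re.fullmatch(r"\d+\s+\d+", first):  # PHYLIP header: <num seqs> <num columns>
        return "phylip"
    if first.startswith(">"):
        return "fasta"
    return None  # unknown format


# The same call works for a handle, a path, or a raw string.
phylip_text = " 2 8\nseq_a     ATGCATGC\nseq_b     ATG-ATGC\n"
print(guess_format(read_text(io.StringIO(phylip_text))))
```

A real implementation would need to inspect more than one line, since several formats have variants that share a header style.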

## Command line sequence manipulation tools

### Functions

| Function | Flag | Parameters | Brief Description |
| --- | --- | --- | --- |
| alignment_lengths | -al | None | Return a list of alignment lengths |
| back_transcribe | -r2d | None | Convert RNA alignments to DNA |
| bootstrap | -bts | [number bootstraps (int)] | Create bootstrap alignment(s) |
| clean_seq | -cs | ['strict'] [replacement char] | Strip out ambiguous and/or non-sequence characters |
| concat_alignments | -cta | <regex> | Concatenate two or more alignments by splitting and matching the sequence identifiers |
| consensus | -con | None | Condense alignments into simple majority-rule consensus sequences |
| delete_records | -dr | <regex> [regex] [columns (int)] | Remove selected rows from alignments |
| enforce_triplets | -et | None | Shift all gaps so that all sequences are organized by triplet |
| extract_regions | -er | <positions (str)> [positions] ... | Pull out sub-sequences |
| generate_alignment | -ga | [align_tool] [args] | Create a new alignment from unaligned sequences |
| hash_ids | -hi | [hash length (int)] | Rename all identifiers to random hashes |
| list_ids | -li | [columns (int)] | Output list of all sequence identifiers |
| lowercase | -lc | None | Convert all sequences to lowercase |
| num_seqs | -ns | None | Count sequences in each alignment |
| order_ids | -oi | ['rev'] | Sort sequences by ID in alpha-numeric order (specify 'rev' to reverse order) |
| pull_records | -pr | <regex> [regex] ['full'] | Extract selected rows from alignments |
| rename_ids | -ri | <regex> <subst string> [num] | Replace some pattern in IDs with something else |
| screw_formats | -sf | <new format> | Change the file format to something else |
| split_to_files | -stf | <out dir> [prefix] | Put each alignment into its own file |
| transcribe | -d2r | None | Convert DNA alignments to RNA |
| translate | -tr | None | Convert coding sequences into amino acids |
| trimal | -trm | [{Threshold (int or float) \| "all" \| "clean" \| "gappyout"}] | Trim away columns containing gaps based on a threshold value or one of the automatic methods (default=gappyout) |
| uppercase | -uc | None | Convert all sequences to uppercase |
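
To illustrate what two of these tools compute, the sketch below implements a simple majority-rule consensus and a single bootstrap replicate on a toy alignment held as a list of equal-length strings. It is an illustration of the general techniques only, not AlignBuddy's code: the function names, the `ambiguous` tie character, and the choice to treat gaps like any other character are all assumptions.

```python
import random
from collections import Counter


def consensus(seqs, ambiguous="N"):
    """Majority-rule consensus: the most common character in each column."""
    result = []
    for column in zip(*seqs):
        counts = Counter(column)
        top_count = max(counts.values())
        winners = [char for char, n in counts.items() if n == top_count]
        # Assumption: a tie for most common character yields the ambiguity code.
        result.append(winners[0] if len(winners) == 1 else ambiguous)
    return "".join(result)


def bootstrap(seqs, rng=None):
    """One bootstrap replicate: resample whole columns with replacement."""
    if rng is None:
        rng = random.Random()
    length = len(seqs[0])
    picks = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in picks) for seq in seqs]


alignment = ["ATGCA", "ATGGA", "ACGCA"]
print(consensus(alignment))  # prints ATGCA
for row in bootstrap(alignment, random.Random(1)):
    print(row)  # same dimensions as the input, columns drawn at random
```

Resampling whole columns, rather than individual characters, keeps each sampled site intact across all sequences, which is what makes the replicate usable for downstream analysis.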

### Modifying flags

| Flag | Brief Description |
| --- | --- |
| -f --format | Force read a specific BioPython format. This may allow you to use some of the tools on some formats not auto-read by AlignBuddy, but no promises. |
| -i --in_place | Rewrites the FIRST input file with the final output. Be careful! |
| -k --keep_temp | Specify a directory to store files produced by an alignment tool (for generate_alignment). |
| -o --out_format | Specify the format you want the output returned in. |
| -q --quiet | Suppress stderr messages. |
| -t --test | Run the function and return any stderr/stdout other than the alignment. |
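
As a rough sketch of how a tool flag and the modifying flags might be combined when calling AlignBuddy from a script, the helper below assembles an argument list. The flags come from the tables above; the executable name `alignbuddy`, the position of the input file before the flags, the file name `my_align.phy`, and the helper `build_command` itself are assumptions made for illustration.

```python
import shlex


def build_command(in_file, tool_flag, tool_args=(), out_format=None,
                  quiet=False, in_place=False, executable="alignbuddy"):
    """Assemble an argument list; nothing is executed here."""
    argv = [executable, in_file, tool_flag, *map(str, tool_args)]
    if out_format:
        argv += ["-o", out_format]  # -o --out_format
    if quiet:
        argv.append("-q")           # -q --quiet
    if in_place:
        argv.append("-i")           # -i --in_place: overwrites the first input file
    return argv


# Translate a coding alignment and request NEXUS output, without stderr messages.
cmd = build_command("my_align.phy", "-tr", out_format="nexus", quiet=True)
print(shlex.join(cmd))
```

Building the command as a list, and only joining it for display, avoids quoting problems if the list is later passed to `subprocess.run`.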
