A command line tool for bioinformatics.
Most information about proteins is stored in the UniProt database. However, UniProt entries usually lack a complete protein architecture based on protein families (domains) from the Pfam database, and retrieving all annotated proteins that do (or do not) contain given domains is not straightforward.
Protein Family Annotator completes exactly this task.
It downloads the `Pfam-A.full` and `uniprot_reference_proteomes.dat` files (if the user has not already downloaded them), scans for proteins that do (or do not) contain user-specified Pfam domains, and outputs them annotated in the following format:
| UniProt ID | Organism | First domain ID | First domain starting position | First domain ending position | ... | Last domain ID | Last domain starting position | Last domain ending position | Sequence |
|---|---|---|---|---|---|---|---|---|---|
| ABCDEF_GHIJ | Eukaryota;Metazoa;... | PFxxxxx | 24 | 156 | ... | PFxxxxx | 486 | 633 | MFHLVA...DECYWL |
| KLMNOP_QRST | Archaea;Asgardgroup;... | PFxxxxx | 11 | 209 | ... | PFxxxxx | 789 | 941 | MTGIIT...QPSCAY |
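
The selection step described above (keep proteins that contain some domains and lack others, then emit one annotated row per protein) can be illustrated with a minimal Python sketch. This is not `pfamannot`'s actual source code: the `Protein` class, the `matches` and `to_row` helpers, the ` | ` separator, and the sample records are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Protein:
    """Hypothetical in-memory record; not pfamannot's real data model."""
    uniprot_id: str
    organism: str
    domains: List[Tuple[str, int, int]]  # (Pfam ID, start, end)
    sequence: str


def matches(protein: Protein, required: Set[str], excluded: Set[str]) -> bool:
    """True if the protein has every required domain and no excluded one."""
    present = {pfam_id for pfam_id, _, _ in protein.domains}
    return required <= present and not (excluded & present)


def to_row(protein: Protein) -> str:
    """Format one protein as ID, organism, domains ordered by start, sequence."""
    cells = [protein.uniprot_id, protein.organism]
    for pfam_id, start, end in sorted(protein.domains, key=lambda d: d[1]):
        cells += [pfam_id, str(start), str(end)]
    cells.append(protein.sequence)
    return " | ".join(cells)  # separator is an assumption for this sketch


# Placeholder data, not real annotations.
proteins = [
    Protein("ABCDEF_GHIJ", "Eukaryota;Metazoa",
            [("PF00002", 486, 633), ("PF00001", 24, 156)], "MFHLVA"),
    Protein("KLMNOP_QRST", "Archaea", [("PF00003", 11, 209)], "MTGIIT"),
]

# Keep proteins that contain PF00001 and do not contain PF00003.
selected = [p for p in proteins if matches(p, {"PF00001"}, {"PF00003"})]
rows = [to_row(p) for p in selected]
for row in rows:
    print(row)
```

Representing the required and excluded domains as sets keeps the check to one subset test and one intersection test per protein, regardless of how many domains the user specifies.
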
Read requirements to learn what is required to run `pfamannot`.
Read installation to learn how to install `pfamannot`.
Read user documentation to learn about everyday usage of `pfamannot`.
Read programmer's documentation to gain in-depth knowledge of `pfamannot`'s source code.
Read LICENSE to learn about licensing, distribution, and allowed usage of `pfamannot`.
Bioinformatics student at Charles University.
Ready to use...