SB Annotate
Add feature annotations to sequences. Because the GenBank and EMBL flat file specifications provide for rich annotation, the default output format of this tool is GenBank; this can be overridden with the -o flag.
The first two arguments are required and their order matters. The remaining arguments are optional and can be passed in any order; annotate detects automatically which is which.
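One way to picture the automatic detection of the optional arguments (strand, qualifiers, and sequence patterns, each described below) is a simple classifier over the argument strings. This is a minimal sketch under assumed rules, not SeqBuddy's actual code: `classify_optional_args` is a hypothetical helper, and the rules it applies ('+' or '-' is a strand, anything containing '=' is a qualifier, everything else is a regular expression) are assumptions inferred from the argument descriptions on this page.

```python
def classify_optional_args(args):
    """Sort optional arguments into strand, qualifiers, and regex patterns.

    Hypothetical helper: the detection rules are assumptions, not SeqBuddy's code.
    """
    strand = None
    qualifiers = []
    patterns = []
    for arg in args:
        if arg in ("+", "-"):
            # A lone '+' or '-' is taken to be the strand
            strand = arg
        elif "=" in arg:
            # 'qualifier_name=information' strings are taken to be qualifiers
            qualifiers.append(arg)
        else:
            # Anything else is treated as a pattern for selecting sequences
            patterns.append(arg)
    return strand, qualifiers, patterns


# The same arguments in two different orders are sorted into the same groups
print(classify_optional_args(["-", "foo=bar", "hello=world", "Panxα4"]))
print(classify_optional_args(["Panxα4", "-", "foo=bar", "hello=world"]))
```

Under these assumed rules, a regular expression that itself contains '=' would be mistaken for a qualifier, so the real tool may well use a stricter test.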
Required. Any feature type (or 'key') can be specified, but only 16 characters will be printed to GenBank/EMBL format. Furthermore, to comply with the strict GenBank/EMBL specification, you must select from a set of approved feature keys. A warning is printed if you choose a key outside of the specification; it can be silenced with the -q flag.
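The key handling described above can be sketched as a small check. This is an illustration only: `check_feature_key` is a hypothetical helper, `APPROVED_KEYS_SUBSET` lists just a handful of keys from the feature table specification rather than the full approved set, and keeping the first 16 characters is an assumption about how the 16-character limit is applied.

```python
import sys

# A small illustrative subset of approved feature keys, NOT the full list
APPROVED_KEYS_SUBSET = {
    "CDS", "gene", "exon", "intron", "mRNA",
    "misc_feature", "sig_peptide", "mat_peptide",
}


def check_feature_key(key, quiet=False):
    """Return (printed_key, is_approved) for a feature key.

    Hypothetical helper; quiet=True plays the role of the -q flag.
    """
    printed = key[:16]  # assumption: the first 16 characters are kept
    approved = key in APPROVED_KEYS_SUBSET
    if not approved and not quiet:
        print("Warning: '%s' is not an approved feature key" % key, file=sys.stderr)
    return printed, approved


print(check_feature_key("misc_feature"))
print(check_feature_key("TMD1", quiet=True))
```

A custom key such as 'TMD1', used in the examples on this page, is accepted but falls outside the approved set.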
Required. Pass the start and end positions of the new feature as a single string in the format 'start-end'. If the feature is compound (e.g., the coding sequence within a whole gene), combine multiple locations into a single string in the format 'start1-end1,start2-end2,start3-end3,...'.
Optional. If working with DNA, you can specify which strand the feature is on with the '+' (sense) or '-' (anti-sense) characters.
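The location string and strand flag map onto GenBank location notation roughly as sketched here. Both functions are hypothetical helpers written for illustration, not part of SeqBuddy; the rendering rules (joining compound locations with order(), and reversing the spans inside complement() for the '-' strand) are inferred from the example output on this page.

```python
def parse_location(loc):
    """Split 'start1-end1,start2-end2' into a list of (start, end) integer pairs."""
    spans = []
    for part in loc.split(","):
        start, end = part.split("-")
        spans.append((int(start), int(end)))
    return spans


def render_location(spans, strand="+"):
    """Render spans in the style of the GenBank examples on this page."""
    if strand == "-":
        # The example output lists spans in reverse order on the '-' strand
        spans = list(reversed(spans))
    parts = ["%d..%d" % (start, end) for start, end in spans]
    text = parts[0] if len(parts) == 1 else "order(%s)" % ",".join(parts)
    return "complement(%s)" % text if strand == "-" else text


print(render_location(parse_location("1-10")))                    # 1..10
print(render_location(parse_location("1-10,20-30")))              # order(1..10,20..30)
print(render_location(parse_location("1-10,20-30"), strand="-"))  # complement(order(20..30,1..10))
```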
Optional. Qualifiers are additional information or sub-features. While qualifiers are completely free-form in SeqBuddy, each key in the GenBank/EMBL specification has a set group of approved qualifiers. Pass each qualifier in the form 'qualifier_name=information'; any number of qualifiers can be specified.
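As an illustration of the 'qualifier_name=information' form, the sketch below splits qualifier strings on the first '=' and renders them the way they appear in the GenBank examples on this page. `parse_qualifiers` and `render_qualifiers` are hypothetical helpers, not SeqBuddy functions.

```python
def parse_qualifiers(args):
    """Turn ['name=info', ...] into {name: [info, ...]}, preserving order."""
    qualifiers = {}
    for arg in args:
        # Split on the first '=' only, so the information itself may contain '='
        name, sep, value = arg.partition("=")
        if not sep or not name:
            raise ValueError("Expected 'qualifier_name=information', got %r" % arg)
        qualifiers.setdefault(name, []).append(value)
    return qualifiers


def render_qualifiers(qualifiers):
    """Format qualifiers as GenBank-style /name="value" lines."""
    return ['/%s="%s"' % (name, value)
            for name, values in qualifiers.items()
            for value in values]


for line in render_qualifiers(parse_qualifiers(["foo=bar", "hello=world"])):
    print(line)
# /foo="bar"
# /hello="world"
```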
Optional. Specify which sequence(s) the new feature should be applied to. The pull_recs function is used to select the subset of sequences that will be affected, so regular expressions are understood, and multiple regular expressions can be passed in if desired. Sequences that do not match are returned unmodified.
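Because the patterns are regular expressions, a short pattern can match more records than intended. The sketch below shows this with the sequence IDs used in the examples on this page. `select_ids` is a hypothetical stand-in for the selection step, and it assumes patterns are searched against the sequence ID only; the real pull_recs function may consider other fields as well.

```python
import re


def select_ids(ids, patterns):
    """Return the IDs matched by at least one regular expression."""
    return [seq_id for seq_id in ids
            if any(re.search(pattern, seq_id) for pattern in patterns)]


ids = ["Mle-Panxα1", "Mle-Panxα4", "Mle-Panxα12"]

print(select_ids(ids, ["Panxα4"]))   # ['Mle-Panxα4']
print(select_ids(ids, ["Panxα1"]))   # ['Mle-Panxα1', 'Mle-Panxα12']
print(select_ids(ids, ["Panxα1$"]))  # ['Mle-Panxα1']
```

Anchoring the pattern with '$' is one way to keep 'Panxα1' from also selecting 'Panxα12'.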
LOCUS Mle-Panxα4 1275 bp DNA UNA 02-JAN-2015
DEFINITION cDNA and genomic - ML129317a.
ACCESSION Mle-Panxα4
VERSION Mle-Panxα4
KEYWORDS .
SOURCE
ORGANISM . . .
.
FEATURES Location/Qualifiers
TMD1 82..144
TMD2 391..453
TMD3 643..705
TMD4 913..1005
ORIGIN
1 atggttattg agctgctagc tggatacaaa ggtctgtccc cgtttaaaga cgcgactgtt
61 gacgactcat gggaccaaat aaaccgatgt tacgtgttca tcgccatggt ggtgatgggt
121 gctgtgacta caatgaggca atactctgga acattgattg catgtgacgg gttcacgaag
181 ttccaccctc agtttgcaga agattactgc tggagcatag gaatgtacac ggtacgcgag
241 gcctatgact tgcccagcag tatggttgca taccccggag tgataccctg ggatatgcct
301 gcatgtgttc cacgtctcct gaagaacgga accaggacca aatgtggcag tgagaaggac
361 gttatgccct cagagaaaat ctaccacttg tggtaccagt gggcaagttt ctacttctgg
421 atagtggcta tactgtacta cgcgccgtat ataatgttca aacagttggg agggggagag
481 tacaagcccc tgatcaagct actttgtctt gcgtctggat ctcctgaaca acagatgcag
541 gacatccagg agcgtgtcgt caagtggctt ttcttcaggt ttaagaccta catattcgct
601 aagggttact acgcgtggct acgtaaaaac agtttcagta tcgctatcgg cgtgacaaaa
661 ttgtcctatc tcctgataac tatccttgtg ttctacttaa caggcttcat gttcgaatat
721 ggctctaaca cgtggtaccg gtacggtgct gactggtacg gtaccagatt ctcctcgtac
781 cacgaaacta acaactcaat cacactcaca aaggacatca tcttcccaaa gatggtagcg
841 tgtgagatca agcgatgggg tccctcaggg attgaggttg agaccgctca gtgcgtactt
901 gccccgaatg tgctctacca gtaccttttc ctctttactt ggtacctcct gatcgcggta
961 ttcttcacta acctcatcag ttgtttcctc cacatttctg agatgttctt ctctaacggt
1021 acgtacaaca ggatgataga tcaaggaatg ttgccagaca agcccagtta tcggtacgtc
1081 ttcatgaaca ttggcgccgg tggcagagag atagtccaga ttctaacaga caattccaac
1141 cccctcttgt ttagcaagat atttgacgat cttaccaatt tactaatcac tacttccaaa
1201 aacgctgacg tcattgaaaa cctgtcgaag ttggattcct ccgtaattga actaggcagc
1261 aaagactcaa tctaa
//
$: sb Mle-Panxα4_cds.gb -ano 'misc_feature' '1-10'
LOCUS Mle-Panxα4 1275 bp DNA UNA 02-JAN-2015
DEFINITION cDNA and genomic - ML129317a.
ACCESSION Mle-Panxα4
VERSION Mle-Panxα4
KEYWORDS .
SOURCE
ORGANISM .
.
FEATURES Location/Qualifiers
misc_feature 1..10
TMD1 82..144
TMD2 391..453
TMD3 643..705
TMD4 913..1005
ORIGIN
1 atggttattg agctgctagc tggatacaaa ggtctgtccc cgtttaaaga cgcgactgtt
61 gacgactcat gggaccaaat aaaccgatgt tacgtgttca tcgccatggt ggtgatgggt
121 gctgtgacta caatgaggca atactctgga acattgattg catgtgacgg gttcacgaag
181 ttccaccctc agtttgcaga agattactgc tggagcatag gaatgtacac ggtacgcgag
241 gcctatgact tgcccagcag tatggttgca taccccggag tgataccctg ggatatgcct
301 gcatgtgttc cacgtctcct gaagaacgga accaggacca aatgtggcag tgagaaggac
361 gttatgccct cagagaaaat ctaccacttg tggtaccagt gggcaagttt ctacttctgg
421 atagtggcta tactgtacta cgcgccgtat ataatgttca aacagttggg agggggagag
481 tacaagcccc tgatcaagct actttgtctt gcgtctggat ctcctgaaca acagatgcag
541 gacatccagg agcgtgtcgt caagtggctt ttcttcaggt ttaagaccta catattcgct
601 aagggttact acgcgtggct acgtaaaaac agtttcagta tcgctatcgg cgtgacaaaa
661 ttgtcctatc tcctgataac tatccttgtg ttctacttaa caggcttcat gttcgaatat
721 ggctctaaca cgtggtaccg gtacggtgct gactggtacg gtaccagatt ctcctcgtac
781 cacgaaacta acaactcaat cacactcaca aaggacatca tcttcccaaa gatggtagcg
841 tgtgagatca agcgatgggg tccctcaggg attgaggttg agaccgctca gtgcgtactt
901 gccccgaatg tgctctacca gtaccttttc ctctttactt ggtacctcct gatcgcggta
961 ttcttcacta acctcatcag ttgtttcctc cacatttctg agatgttctt ctctaacggt
1021 acgtacaaca ggatgataga tcaaggaatg ttgccagaca agcccagtta tcggtacgtc
1081 ttcatgaaca ttggcgccgg tggcagagag atagtccaga ttctaacaga caattccaac
1141 cccctcttgt ttagcaagat atttgacgat cttaccaatt tactaatcac tacttccaaa
1201 aacgctgacg tcattgaaaa cctgtcgaag ttggattcct ccgtaattga actaggcagc
1261 aaagactcaa tctaa
//
$: sb Mle-Panxα4_cds.gb -ano 'misc_feature' '1-10,20-30' - 'foo=bar' 'hello=world'
LOCUS Mle-Panxα4 1275 bp DNA UNA 02-JAN-2015
DEFINITION cDNA and genomic - ML129317a.
ACCESSION Mle-Panxα4
VERSION Mle-Panxα4
KEYWORDS .
SOURCE
ORGANISM .
.
FEATURES Location/Qualifiers
misc_feature complement(order(20..30,1..10))
/foo="bar"
/hello="world"
TMD1 82..144
TMD2 391..453
TMD3 643..705
TMD4 913..1005
ORIGIN
1 atggttattg agctgctagc tggatacaaa ggtctgtccc cgtttaaaga cgcgactgtt
61 gacgactcat gggaccaaat aaaccgatgt tacgtgttca tcgccatggt ggtgatgggt
121 gctgtgacta caatgaggca atactctgga acattgattg catgtgacgg gttcacgaag
181 ttccaccctc agtttgcaga agattactgc tggagcatag gaatgtacac ggtacgcgag
241 gcctatgact tgcccagcag tatggttgca taccccggag tgataccctg ggatatgcct
301 gcatgtgttc cacgtctcct gaagaacgga accaggacca aatgtggcag tgagaaggac
361 gttatgccct cagagaaaat ctaccacttg tggtaccagt gggcaagttt ctacttctgg
421 atagtggcta tactgtacta cgcgccgtat ataatgttca aacagttggg agggggagag
481 tacaagcccc tgatcaagct actttgtctt gcgtctggat ctcctgaaca acagatgcag
541 gacatccagg agcgtgtcgt caagtggctt ttcttcaggt ttaagaccta catattcgct
601 aagggttact acgcgtggct acgtaaaaac agtttcagta tcgctatcgg cgtgacaaaa
661 ttgtcctatc tcctgataac tatccttgtg ttctacttaa caggcttcat gttcgaatat
721 ggctctaaca cgtggtaccg gtacggtgct gactggtacg gtaccagatt ctcctcgtac
781 cacgaaacta acaactcaat cacactcaca aaggacatca tcttcccaaa gatggtagcg
841 tgtgagatca agcgatgggg tccctcaggg attgaggttg agaccgctca gtgcgtactt
901 gccccgaatg tgctctacca gtaccttttc ctctttactt ggtacctcct gatcgcggta
961 ttcttcacta acctcatcag ttgtttcctc cacatttctg agatgttctt ctctaacggt
1021 acgtacaaca ggatgataga tcaaggaatg ttgccagaca agcccagtta tcggtacgtc
1081 ttcatgaaca ttggcgccgg tggcagagag atagtccaga ttctaacaga caattccaac
1141 cccctcttgt ttagcaagat atttgacgat cttaccaatt tactaatcac tacttccaaa
1201 aacgctgacg tcattgaaaa cctgtcgaag ttggattcct ccgtaattga actaggcagc
1261 aaagactcaa tctaa
//
>Mle-Panxα1 cDNA - ML078817.
mywifeicqeikraqscrkfaidgpfdwtnriimptlmviccflqtftfmfgsniscigf
eklernfveeycwtqgiytskaaynmplhtpypgiapcvpeydpvtqkywlpcgveeedk
ayhlwyqwvpfyflavavgyylpflilkgsklhqvkplitylmnqrnletdpnhlvgkls
hwifrqlvysrfaatstirmywhdwglvllvcsvkilyltvslihlfatakmfhignwft
ygimfarrsnshtthvkdvffpkmvackietwsftgknhlhgmcvlalnvmnqylflivw
yvnviiiflnsisciytivkfcspnivhhrivnssslddhhdftrmfgyvgpsgriilak
msehmpgymlkqvakkvtekidieneknrgraptikftkvngqpselarqplmhlnalml
gmvpqnlpepkiqniqrsqkkvrflv*
>Mle-Panxα4 cDNA and genomic - ML129317a.
mviellagykglspfkdatvddswdqinrcyvfiamvvmgavttmrqysgtliacdgftk
fhpqfaedycwsigmytvreaydlpssmvaypgvipwdmpacvprllkngtrtkcgsekd
vmpsekiyhlwyqwasfyfwivailyyapyimfkqlgggeykplikllclasgspeqqmq
diqervvkwlffrfktyifakgyyawlrknsfsiaigvtklsyllitilvfyltgfmfey
gsntwyrygadwygtrfssyhetnnsitltkdiifpkmvaceikrwgpsgievetaqcvl
apnvlyqylflftwylliavfftnliscflhisemffsngtynrmidqgmlpdkpsyryv
fmnigaggreivqiltdnsnpllfskifddltnllittsknadvienlskldssvielgs
kdsi*
>Mle-Panxα12 cDNA - ML25997a.
mvidilsgfkgitpfkgitlddgwdqinrsfmfvlcvlmgtvvtvrqyaggiiscdgftk
ysgsfsedycwtqglytikeaydlltmnvpypgvipedmptcierelinggrvscpdpet
vkpptrvyhlwyqwvpfyfwlaaaafffpyliykhfgvgdlkpliqmlhnpivdegdqnc
maekasmwlfyklnvfmnentifailtekhrlffivmlvkvlyliisilalyltdemfhi
gsfvsygsewatslpegdnettlvkdklfpkmvaceikrwgptgleeeqgmcvlapnvin
qylflilwfaiifciacnclsvlfaltklvfvlgsykrllasaflkdelhykhmffnigt
sgrvllqivatnvsprvfesimanlatkliaerlkgngkgsv*
$: sb Mle-Panx_pep.fa -ano 'misc_feature' '1-10,20-30' - 'foo=bar' 'hello=world' 'Panxα4'
LOCUS Mle-Panxα1 447 aa UNK 01-JAN-1980
DEFINITION Mle-Panxα1 cDNA - ML078817.
ACCESSION Mle-Panxα1
VERSION Mle-Panxα1
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
ORIGIN
1 mywifeicqe ikraqscrkf aidgpfdwtn riimptlmvi ccflqtftfm fgsniscigf
61 eklernfvee ycwtqgiyts kaaynmplht pypgiapcvp eydpvtqkyw lpcgveeedk
121 ayhlwyqwvp fyflavavgy ylpflilkgs klhqvkplit ylmnqrnlet dpnhlvgkls
181 hwifrqlvys rfaatstirm ywhdwglvll vcsvkilylt vslihlfata kmfhignwft
241 ygimfarrsn shtthvkdvf fpkmvackie twsftgknhl hgmcvlalnv mnqylflivw
301 yvnviiifln sisciytivk fcspnivhhr ivnssslddh hdftrmfgyv gpsgriilak
361 msehmpgyml kqvakkvtek idieneknrg raptikftkv ngqpselarq plmhlnalml
421 gmvpqnlpep kiqniqrsqk kvrflv*
//
LOCUS Mle-Panxα12 403 aa UNK 01-JAN-1980
DEFINITION Mle-Panxα12 cDNA - ML25997a.
ACCESSION Mle-Panxα12
VERSION Mle-Panxα12
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
ORIGIN
1 mvidilsgfk gitpfkgitl ddgwdqinrs fmfvlcvlmg tvvtvrqyag giiscdgftk
61 ysgsfsedyc wtqglytike aydlltmnvp ypgvipedmp tciereling grvscpdpet
121 vkpptrvyhl wyqwvpfyfw laaaafffpy liykhfgvgd lkpliqmlhn pivdegdqnc
181 maekasmwlf yklnvfmnen tifailtekh rlffivmlvk vlyliisila lyltdemfhi
241 gsfvsygsew atslpegdne ttlvkdklfp kmvaceikrw gptgleeeqg mcvlapnvin
301 qylflilwfa iifciacncl svlfaltklv fvlgsykrll asaflkdelh ykhmffnigt
361 sgrvllqiva tnvsprvfes imanlatkli aerlkgngkg sv*
//
LOCUS Mle-Panxα4 425 aa UNK 01-JAN-1980
DEFINITION Mle-Panxα4 cDNA and genomic - ML129317a.
ACCESSION Mle-Panxα4
VERSION Mle-Panxα4
KEYWORDS .
SOURCE .
ORGANISM .
.
FEATURES Location/Qualifiers
misc_feature order(1..10,20..30)
/foo="bar"
/hello="world"
ORIGIN
1 mviellagyk glspfkdatv ddswdqinrc yvfiamvvmg avttmrqysg tliacdgftk
61 fhpqfaedyc wsigmytvre aydlpssmva ypgvipwdmp acvprllkng trtkcgsekd
121 vmpsekiyhl wyqwasfyfw ivailyyapy imfkqlggge ykplikllcl asgspeqqmq
181 diqervvkwl ffrfktyifa kgyyawlrkn sfsiaigvtk lsyllitilv fyltgfmfey
241 gsntwyryga dwygtrfssy hetnnsitlt kdiifpkmva ceikrwgpsg ievetaqcvl
301 apnvlyqylf lftwylliav fftnliscfl hisemffsng tynrmidqgm lpdkpsyryv
361 fmnigaggre ivqiltdnsn pllfskifdd ltnllittsk nadvienlsk ldssvielgs
421 kdsi*
//