
SB Map features nucl2prot

Steve Bond edited this page Jul 30, 2015 · 6 revisions

--map_features_dna2prot, -fd2p

Description

Transfer nucleotide feature annotations onto their corresponding protein sequences. The nucleotide and amino acid sequences must be supplied as separate files; as the examples below show, the two files can be passed in either order.
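The coordinate conversion can be sketched in a few lines of Python. This is a minimal illustration, not SeqBuddy's own code: the function name `nucl_to_prot_span` is hypothetical, and the integer arithmetic is simply a rule that reproduces the coordinates in the examples on this page, assuming 1-based inclusive positions and a coding sequence that starts in frame at nucleotide 1.

```python
def nucl_to_prot_span(start, end):
    """Map a 1-based, inclusive nucleotide span onto amino acid positions.

    Hypothetical helper for illustration; assumes the reading frame begins
    at nucleotide 1, so codon k covers nucleotides 3k-2 .. 3k.
    """
    prot_start = (start - 1) // 3 + 1  # residue whose codon contains `start`
    prot_end = end // 3                # last residue whose codon is complete by `end`
    return prot_start, prot_end


# Feature spans taken from Example 1 on this page
for name, span in [("TMD1", (82, 144)), ("TMD2", (391, 453))]:
    print(name, nucl_to_prot_span(*span))
```

Under this rule a span that ends mid-codon is rounded down, so 148..200 maps to 50..66, which agrees with the ECL1 feature in the Example 2 output.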

SeqBuddy issues a warning when a sequence in one file has no counterpart in the other; the -q flag silences these warnings.
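The check behind this warning amounts to comparing the sets of sequence IDs in the two files. The sketch below is an assumption about the general approach, not SeqBuddy's implementation; `unmatched_ids` is a hypothetical helper, and the message wording is modeled on the warning shown in Example 2.

```python
def unmatched_ids(nucl_ids, prot_ids):
    """Return (only_in_nucl, only_in_prot) as sorted lists of IDs.

    Hypothetical helper for illustration; IDs are compared as exact strings.
    """
    nucl_set, prot_set = set(nucl_ids), set(prot_ids)
    return sorted(nucl_set - prot_set), sorted(prot_set - nucl_set)


# IDs from Example 2: two nucleotide records, one protein record
only_nucl, only_prot = unmatched_ids(["Mle-Panxα4", "Mle-Panxα3"], ["Mle-Panxα3"])
for seq_id in only_nucl:
    print(f"Warning: {seq_id} is in the cDNA file, but not in the protein file")
```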

The default output format is genbank, but this can be overridden with the -o flag.

Examples

Example 1

input file 1: Mle-Panxα4_cds.gb

LOCUS       Mle-Panxα4              1275 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA and genomic - ML129317a.
ACCESSION   Mle-Panxα4
VERSION     Mle-Panxα4
KEYWORDS    .
SOURCE
  ORGANISM  . . .
            .
FEATURES             Location/Qualifiers
     TMD1            82..144
     TMD2            391..453
     TMD3            643..705
     TMD4            913..1005
ORIGIN
        1 atggttattg agctgctagc tggatacaaa ggtctgtccc cgtttaaaga cgcgactgtt
       61 gacgactcat gggaccaaat aaaccgatgt tacgtgttca tcgccatggt ggtgatgggt
      121 gctgtgacta caatgaggca atactctgga acattgattg catgtgacgg gttcacgaag
      181 ttccaccctc agtttgcaga agattactgc tggagcatag gaatgtacac ggtacgcgag
      241 gcctatgact tgcccagcag tatggttgca taccccggag tgataccctg ggatatgcct
      301 gcatgtgttc cacgtctcct gaagaacgga accaggacca aatgtggcag tgagaaggac
      361 gttatgccct cagagaaaat ctaccacttg tggtaccagt gggcaagttt ctacttctgg
      421 atagtggcta tactgtacta cgcgccgtat ataatgttca aacagttggg agggggagag
      481 tacaagcccc tgatcaagct actttgtctt gcgtctggat ctcctgaaca acagatgcag
      541 gacatccagg agcgtgtcgt caagtggctt ttcttcaggt ttaagaccta catattcgct
      601 aagggttact acgcgtggct acgtaaaaac agtttcagta tcgctatcgg cgtgacaaaa
      661 ttgtcctatc tcctgataac tatccttgtg ttctacttaa caggcttcat gttcgaatat
      721 ggctctaaca cgtggtaccg gtacggtgct gactggtacg gtaccagatt ctcctcgtac
      781 cacgaaacta acaactcaat cacactcaca aaggacatca tcttcccaaa gatggtagcg
      841 tgtgagatca agcgatgggg tccctcaggg attgaggttg agaccgctca gtgcgtactt
      901 gccccgaatg tgctctacca gtaccttttc ctctttactt ggtacctcct gatcgcggta
      961 ttcttcacta acctcatcag ttgtttcctc cacatttctg agatgttctt ctctaacggt
     1021 acgtacaaca ggatgataga tcaaggaatg ttgccagaca agcccagtta tcggtacgtc
     1081 ttcatgaaca ttggcgccgg tggcagagag atagtccaga ttctaacaga caattccaac
     1141 cccctcttgt ttagcaagat atttgacgat cttaccaatt tactaatcac tacttccaaa
     1201 aacgctgacg tcattgaaaa cctgtcgaag ttggattcct ccgtaattga actaggcagc
     1261 aaagactcaa tctaa
//

input file 2: Mle-Panxα4_pep.fa

>Mle-Panxα4 ML129317a.
MVIELLAGYKGLSPFKDATVDDSWDQINRCYVFIAMVVMGAVTTMRQYSGTLIACDGFTK
FHPQFAEDYCWSIGMYTVREAYDLPSSMVAYPGVIPWDMPACVPRLLKNGTRTKCGSEKD
VMPSEKIYHLWYQWASFYFWIVAILYYAPYIMFKQLGGGEYKPLIKLLCLASGSPEQQMQ
DIQERVVKWLFFRFKTYIFAKGYYAWLRKNSFSIAIGVTKLSYLLITILVFYLTGFMFEY
GSNTWYRYGADWYGTRFSSYHETNNSITLTKDIIFPKMVACEIKRWGPSGIEVETAQCVL
APNVLYQYLFLFTWYLLIAVFFTNLISCFLHISEMFFSNGTYNRMIDQGMLPDKPSYRYV
FMNIGAGGREIVQILTDNSNPLLFSKIFDDLTNLLITTSKNADVIENLSKLDSSVIELGS
KDSI*

usage

$: sb Mle-Panxα4_cds.gb Mle-Panxα4_pep.fa -fd2p

output

LOCUS       Mle-Panxα4               425 aa                     UNK 01-JAN-1980
DEFINITION  Mle-Panxα4 cDNA and genomic - ML129317a.
ACCESSION   Mle-Panxα4
VERSION     Mle-Panxα4
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     TMD1            28..48
     TMD2            131..151
     TMD3            215..235
     TMD4            305..335
ORIGIN
        1 mviellagyk glspfkdatv ddswdqinrc yvfiamvvmg avttmrqysg tliacdgftk
       61 fhpqfaedyc wsigmytvre aydlpssmva ypgvipwdmp acvprllkng trtkcgsekd
      121 vmpsekiyhl wyqwasfyfw ivailyyapy imfkqlggge ykplikllcl asgspeqqmq
      181 diqervvkwl ffrfktyifa kgyyawlrkn sfsiaigvtk lsyllitilv fyltgfmfey
      241 gsntwyryga dwygtrfssy hetnnsitlt kdiifpkmva ceikrwgpsg ievetaqcvl
      301 apnvlyqylf lftwylliav fftnliscfl hisemffsng tynrmidqgm lpdkpsyryv
      361 fmnigaggre ivqiltdnsn pllfskifdd ltnllittsk nadvienlsk ldssvielgs
      421 kdsi*
//

Example 2

input file 1: N-terminal_cds.gb

LOCUS       Mle-Panxα4               200 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA and genomic - ML129317a.
ACCESSION   Mle-Panxα4
VERSION     Mle-Panxα4
KEYWORDS    .
SOURCE
  ORGANISM  . . . .
            .
FEATURES             Location/Qualifiers
     N-Term          1..81
     TMD1            82..144
     ECL1            145..200
ORIGIN
        1 atggttattg agctgctagc tggatacaaa ggtctgtccc cgtttaaaga cgcgactgtt
       61 gacgactcat gggaccaaat aaaccgatgt tacgtgttca tcgccatggt ggtgatgggt
      121 gctgtgacta caatgaggca atactctgga acattgattg catgtgacgg gttcacgaag
      181 ttccaccctc agtttgcaga
//
LOCUS       Mle-Panxα3               200 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML036514a.
ACCESSION   Mle-Panxα3
VERSION     Mle-Panxα3
KEYWORDS    .
SOURCE
  ORGANISM  . . . .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..151,152..200)
                     /label="ML036514a"
     N-term          1..84
     TMD1            85..147
     ECL1            148..200
ORIGIN
        1 atgttgttgc tcggctcact cggaacgatc aagaacttga gcatcttcaa agacctgtcc
       61 ttggacgact ggctggatca gatgaacagg accttcatgt ttctactgct ctgtttcatg
      121 ggaacaattg tcgccgttag tcagtacact ggtaaaaaca tatcttgcga tggctttacg
      181 aagttcggag aagatttctc
//

input file 2: Mle-Panxα3.fa

>Mle-Panxα3 ML036514a.
MLLLGSLGTIKNLSIFKDLSLDDWLDQMNRTFMFLLLCFMGTIVAVSQYTGKNISCDGFT
KFGEDFSQDYCWTQGLYTIKEAYDLPESQIPYPGIIPENVPACREHALKNGGKIVCPPED
QVKPLTRARHLWYQWIPFYFWVIAPVFYLPYMFVKRMGLDRMKPLLKIMSDYYHCTTETP
SEEIIVKCADWVYNSIVDRLSEGSSWTSWRNRHGLGLAVLVSKFMYLGGSVLVMMMTTLM
FQVGDFKTYGIEWLRQFPNPENYSTSVKHKLFPKMVACEIKRWGTTGLEEENGMCVLAPN
VIYQYIFLIMWFALAITICTNFGNIFFYLFKLTATRYTYNKLVATGHFSHKHPGWKFMYY
RIGTSGRVLLNIVAQNTNPIIFGAIMEKLTPSVIKHLRIGHVPGEYLTDPA*

usage

$: sb Mle-Panxα3.fa N-terminal_cds.gb -fd2p

output

Warning: Mle-Panxα4 is in the cDNA file, but not in the protein file

LOCUS       Mle-Panxα3                66 aa                     UNK 01-JAN-1980
DEFINITION  Mle-Panxα3 cDNA - ML036514a.
ACCESSION   Mle-Panxα3
VERSION     Mle-Panxα3
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..50,51..66)
                     /label="ML036514a"
     N-term          1..28
     TMD1            29..49
     ECL1            50..66
ORIGIN
        1 mlllgslgti knlsifkdls lddwldqmnr tfmflllcfm gtivavsqyt gkniscdgft
       61 kfgedfsqdy cwtqglytik eaydlpesqi pypgiipenv pacrehalkn ggkivcpped
      121 qvkpltrarh lwyqwipfyf wviapvfylp ymfvkrmgld rmkpllkims dyyhcttetp
      181 seeiivkcad wvynsivdrl segsswtswr nrhglglavl vskfmylggs vlvmmmttlm
      241 fqvgdfktyg iewlrqfpnp enystsvkhk lfpkmvacei krwgttglee engmcvlapn
      301 viyqyiflim wfalaitict nfgniffylf kltatrytyn klvatghfsh khpgwkfmyy
      361 rigtsgrvll nivaqntnpi ifgaimeklt psvikhlrig hvpgeyltdp a*
//
