Use GTFs in TnSeq analysis #9

Open
jsa-aerial opened this issue May 16, 2018 · 0 comments

calc_fitness and aggregate both parse gbk files to determine annotations, but this has two problems:

  1. It introduces a dependency on bioperl and biopython - nothing else in these scripts uses them
  2. Neither of them properly parses gbks with multiple locus entries - say, a whole genome plus some associated plasmids

Using GTFs:

  • eliminates these dependencies - making installation simpler
  • simplifies the 'parse' - basically it is just a csv-style (tab-delimited) read and picking the needed fields
  • easy to create GTFs with multiple locus entries (the 'chromosome' field) from multiple gbks
  • gbks can be kept simple - single locus per gbk
  • runs involving a strain with whole genome and associated plasmids become simple to accommodate
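
As a rough illustration of how small the 'parse' becomes, here is a minimal sketch using only the Python standard library. It assumes the standard nine-column, tab-delimited GTF layout, where the first column (seqname) plays the role of the 'chromosome' field. The names `read_gtf` and `parse_attributes` and the sample records are hypothetical and are not taken from calc_fitness or aggregate. The attribute handling is deliberately simplified and would not cope with semicolons inside quoted values.

```python
import csv
import io
from collections import defaultdict

# Standard GTF column names (nine tab-separated fields).
GTF_COLUMNS = ["seqname", "source", "feature", "start", "end",
               "score", "strand", "frame", "attribute"]


def parse_attributes(text):
    """Turn 'gene_id "x"; gene_name "y";' into a dict (simplified)."""
    attrs = {}
    for item in text.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def read_gtf(handle, feature="CDS"):
    """Group features of one type by seqname (chromosome or plasmid)."""
    by_locus = defaultdict(list)
    # QUOTE_NONE keeps the double quotes in the attribute column literal.
    rows = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in rows:
        # Skip blank lines, comment lines, and malformed rows.
        if not row or row[0].startswith("#") or len(row) != len(GTF_COLUMNS):
            continue
        rec = dict(zip(GTF_COLUMNS, row))
        if rec["feature"] != feature:
            continue
        attrs = parse_attributes(rec["attribute"])
        by_locus[rec["seqname"]].append(
            (int(rec["start"]), int(rec["end"]), rec["strand"],
             attrs.get("gene_id")))
    return dict(by_locus)


# Hypothetical GTF content: one chromosome and one plasmid in a single file.
example = (
    "#!genome-build hypothetical\n"
    "chrom1\tsrc\tCDS\t100\t400\t.\t+\t0\tgene_id \"g1\"; gene_name \"abcA\";\n"
    "chrom1\tsrc\tgene\t100\t400\t.\t+\t.\tgene_id \"g1\";\n"
    "plasmidA\tsrc\tCDS\t50\t200\t.\t-\t0\tgene_id \"p1\";\n"
)
annotations = read_gtf(io.StringIO(example))
```

Keying the result by seqname means a run with a whole genome and associated plasmids needs no special casing: each insertion site can be looked up under the locus it was mapped to.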