Scripts created for the micro-QTL project.
Converts the GO flat file to a pickled format optimised for tree traversal.
$ ./create_go_tree.py --output go_tree.pickle.xz go-basic.obo
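The internal layout of the pickled tree is not documented here. As a rough illustration only, the sketch below reads the `[Term]` stanzas of an OBO file into a child-to-parents dictionary, walks it to collect ancestors, and stores it as an xz-compressed pickle. The function names (`parse_obo_parents`, `ancestors`, `save_tree`) and the dictionary layout are assumptions made for this example, not the actual implementation of create_go_tree.py.

```python
import lzma
import pickle


def parse_obo_parents(lines):
    """Return {term_id: set(parent_ids)} built from the [Term] stanzas."""
    parents = {}
    term_id = None
    in_term = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            term_id = None
        elif in_term and line.startswith("id: "):
            term_id = line[4:]
            parents.setdefault(term_id, set())
        elif in_term and term_id and line.startswith("is_a: "):
            # "is_a: GO:0008150 ! biological_process" -> "GO:0008150"
            parents[term_id].add(line[6:].split(" ! ")[0].strip())
    return parents


def ancestors(term_id, parents):
    """Collect every term reachable from term_id via is_a links."""
    seen = set()
    stack = [term_id]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def save_tree(parents, path):
    """Write the mapping as an xz-compressed pickle (hypothetical layout)."""
    with lzma.open(path, "wb") as handle:
        pickle.dump(parents, handle)
```

Only `is_a` links are followed in this sketch; other relationship types and obsolete terms would need separate handling.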
Creates ID conversion maps from the Uniprot idmapping file.
# NCBI gene ID to GO-terms lookup table
$ ./make_uniprot_idmapping_db.py --from geneid --to go idmapping_selected.tab.gz output.pickle.xz
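To give an idea of what such a lookup table might contain, the following sketch builds a gene ID to GO-terms dictionary from tab-separated rows. The column positions in `COLUMNS` (GeneID in the third column, GO in the seventh) and the `;` separator are assumptions about idmapping_selected.tab that should be checked against the UniProt README; `build_map` is a hypothetical name and not part of make_uniprot_idmapping_db.py.

```python
from collections import defaultdict

# Assumed 0-based column positions in idmapping_selected.tab.
COLUMNS = {"geneid": 2, "go": 6}


def split_values(field):
    """Split a ';'-separated field and drop empty entries."""
    return [value.strip() for value in field.split(";") if value.strip()]


def build_map(rows, source="geneid", target="go"):
    """Return {source_id: set(target_ids)} from tab-separated rows."""
    src, dst = COLUMNS[source], COLUMNS[target]
    mapping = defaultdict(set)
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) <= max(src, dst):
            continue  # skip truncated rows
        targets = split_values(fields[dst])
        if not targets:
            continue  # no annotation for this entry
        for key in split_values(fields[src]):
            mapping[key].update(targets)
    return dict(mapping)
```

The input can be any iterable of text lines, for example a file opened with `gzip.open(path, "rt")`.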
Converts the peak regions obtained by R/qtl from centimorgans to physical coordinates.
$ ./morgans_to_coordinates.py marker_new_map.csv rqtl_peaks.csv > coordinates_output.csv
The R/qtl peaks file (rqtl_peaks.csv) should be formatted as tab-separated CSV:
peak_id | chr | pos_cm |
---|---|---|
peak_1 | 2 | 6.052 |
peak_2 | 8 | 52.64 |
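The conversion method is not described here. One common approach is linear interpolation between the two markers flanking each peak, which the sketch below illustrates. It assumes that the marker map has already been loaded as a list of `(cM, bp)` pairs per chromosome, sorted by cM; that structure and the names `read_peaks` and `cm_to_bp` are hypothetical and may differ from what morgans_to_coordinates.py actually does.

```python
import csv
from bisect import bisect_left


def read_peaks(handle):
    """Read the tab-separated peaks file into (peak_id, chr, pos_cm) tuples."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["peak_id"], row["chr"], float(row["pos_cm"])) for row in reader]


def cm_to_bp(pos_cm, markers):
    """Interpolate a physical position for pos_cm.

    markers: list of (cm, bp) tuples for one chromosome, sorted by cm.
    Positions outside the map are clamped to the first or last marker.
    """
    cms = [cm for cm, _ in markers]
    if pos_cm <= cms[0]:
        return markers[0][1]
    if pos_cm >= cms[-1]:
        return markers[-1][1]
    hi = bisect_left(cms, pos_cm)  # first marker at or beyond pos_cm
    (cm_lo, bp_lo), (cm_hi, bp_hi) = markers[hi - 1], markers[hi]
    fraction = (pos_cm - cm_lo) / (cm_hi - cm_lo)
    return round(bp_lo + fraction * (bp_hi - bp_lo))
```

Clamping at the map ends is a design choice for this example; extrapolating or reporting an error would be equally defensible.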
Finds the GO terms within the given regions using a GFF as a reference (e.g. SL2.40).
$ ./get_go_in_region.py --gff sl240.gff.gz --map output.pickle.xz coordinates_output.csv > peak_terms_output.csv
output.pickle.xz is generated by make_uniprot_idmapping_db.py.
coordinates_output.csv is generated by morgans_to_coordinates.py.
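As an illustration of the region lookup, the sketch below scans GFF3 lines and returns the IDs of features that overlap a given interval. It assumes standard nine-column GFF3 with an `ID=` attribute and filters on the `gene` feature type; the function name `genes_in_region` and these choices are assumptions for this example, not the behaviour of get_go_in_region.py. The returned IDs could then be looked up in the ID-to-GO map.

```python
def genes_in_region(gff_lines, chrom, start, end, feature_type="gene"):
    """Return IDs of features on chrom that overlap [start, end] (1-based, inclusive)."""
    hits = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a feature line
        seqid, _, ftype, fstart, fend, _, _, _, attrs = fields
        if seqid != chrom or ftype != feature_type:
            continue
        # Two closed intervals overlap when each starts before the other ends.
        if int(fstart) <= end and int(fend) >= start:
            attributes = dict(
                item.split("=", 1) for item in attrs.split(";") if "=" in item
            )
            hits.append(attributes.get("ID", ""))
    return hits
```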
Performs the GO term enrichment analysis.
$ ./go_enrichment.R peak_terms_output.csv
peak_terms_output.csv is generated by get_go_in_region.py.
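The statistical test used by go_enrichment.R is not stated here. A common choice for GO term enrichment is a one-sided hypergeometric test, sketched below in Python for illustration only. The function name and parameter names are hypothetical, and the R script may use a different test or multiple-testing correction.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric distribution.

    population: number of annotated genes in the background
    successes:  background genes annotated with the GO term
    draws:      genes found in the peak regions
    k:          genes in the peak regions annotated with the GO term
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total
```

A small p-value indicates that the term occurs in the peak regions more often than expected by chance; with many terms tested, a correction such as Benjamini-Hochberg would normally follow.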