The repository contains data and scripts for analysis of immunoglobulin (IG) and T-cell receptor (TR) loci in genomes of four great ape species (bonobo, gorilla, Sumatran orangutan, and Bornean orangutan) assembled by the Primate T2T project.
data_primate_igtrloci_locus_config.csv
: positions of IG/TR loci in the genome assemblies of the four great ape species.
data_primate_igtrloci_fasta/
: FASTA files with haplotype-resolved sequences of IG/TR loci in the assemblies of the four great ape species.
data_primate_gene_positions/
: positions of germline IG/TR genes across the four great ape species.
data_human_t2t_gene_positions/
: positions of germline IG/TR genes in the human T2T assembly.
data_locus_alignments/
: positions of long non-overlapping alignment blocks computed across pairs of IG/TR loci.
data_SV_block_positions/
: positions of structural variation blocks computed within IG/TR loci.
configs/
: config files containing, for each locus, the paths to all of its data files as well as the locus length and orientation in the assembly.
- BioPython.
- pyGenomeViz, v.0.4.4.
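A quick way to confirm that the two requirements above are installed is sketched below. It uses only the Python standard library. The distribution names `biopython` and `pygenomeviz` are the PyPI names of these packages; treating `0.4.4` as an exact pin rather than a minimum version is an assumption, and `check_requirements` is a hypothetical helper that is not part of this repository.

```python
from importlib import metadata

# PyPI distribution names. The pyGenomeViz version is taken from the list
# above; treating it as an exact pin is an assumption of this sketch.
REQUIRED = {"biopython": None, "pygenomeviz": "0.4.4"}


def check_requirements(required):
    """Return a list of human-readable problems (empty if everything is fine)."""
    problems = []
    for name, wanted in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        # Only compare versions when a specific one was requested.
        if wanted is not None and found != wanted:
            problems.append(f"{name} {found} found, {wanted} expected")
    return problems


if __name__ == "__main__":
    for line in check_requirements(REQUIRED) or ["all requirements satisfied"]:
        print(line)
```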
python visualize_genome_diagram.py locus_config.txt output_fname.png
E.g.:
python visualize_genome_diagram.py configs/config_IGH.txt IGH_diagram.png
python visualize_gene_counts.py locus_config.txt output_fname.png
E.g.:
python visualize_gene_counts.py configs/config_IGH.txt IGH_gene_counts.png
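To produce both figures for every locus in one go, the two commands above can be wrapped in a small driver. The following is a minimal sketch, not part of the repository: it assumes that it is run from the repository root, that every config file follows the `configs/config_<LOCUS>.txt` naming seen in the IGH example, and that output names should mirror `IGH_diagram.png` and `IGH_gene_counts.png`. `build_commands` is a hypothetical helper.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(config_dir, out_dir):
    """Return one command per (config file, script) pair."""
    # Script name -> suffix used in the output file name.
    scripts = {
        "visualize_genome_diagram.py": "diagram",
        "visualize_gene_counts.py": "gene_counts",
    }
    commands = []
    # Assumption: all config files are named config_<LOCUS>.txt.
    for config in sorted(Path(config_dir).glob("config_*.txt")):
        locus = config.stem[len("config_"):]  # e.g. "IGH"
        for script, suffix in scripts.items():
            output = Path(out_dir) / f"{locus}_{suffix}.png"
            commands.append([sys.executable, script, str(config), str(output)])
    return commands


if __name__ == "__main__":
    for cmd in build_commands("configs", "."):
        # check=True stops at the first script that exits with an error.
        subprocess.run(cmd, check=True)
```

Building the command lists separately from running them makes it easy to print them first as a dry run before any figure is generated.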
Yoo D, et al. Complete sequencing of ape genomes. bioRxiv, 2024.