makeGenesDataFromTxDb did not pick up gene in some region #152

Open
h20gg702 opened this issue May 15, 2024 · 2 comments

@h20gg702

Hi, thank you for developing such nice tools. I ran into a problem with the makeGenesDataFromTxDb function: after calling plotKaryotype on certain regions, it does not return any genes.

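library(karyoploteR)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
# Zoom in on the region of interest on chr17
grW <- toGRanges("chr17:1715523-1739600")
kp <- plotKaryotype(zoom = grW, cex = 1, plot.type = 2)
# Extract the genes and transcripts that fall inside the plotted region
genes.data <- makeGenesDataFromTxDb(TxDb.Hsapiens.UCSC.hg38.knownGene, karyoplot = kp, plot.transcripts = TRUE, plot.transcripts.structure = TRUE)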
But I couldn't see any genes; the gene names are just character(0), as shown below.
genes.data[["genes"]]@ranges@NAMES
character(0)

However, when I use the trackViewer package, I can see that the WDR81 gene lies in the region I passed to plotKaryotype, so the TxDb.Hsapiens.UCSC.hg38.knownGene package itself seems to be fine. Do you have any idea what causes this problem?

genes <- geneTrack("124997", TxDb.Hsapiens.UCSC.hg38.knownGene, "WDR81", asList=FALSE)
genes@dat@ranges
IRanges object with 47 ranges and 0 metadata columns:
                 start     end   width
  124997.WDR81 1716523 1716535      13
  124997.WDR81 1716536 1716546      11
  124997.WDR81 1716547 1716571      25
  124997.WDR81 1716572 1716575       4
  124997.WDR81 1716576 1716600      25
           ...     ...     ...     ...
  124997.WDR81 1737686 1738488     803
  124997.WDR81 1738489 1738584      96
  124997.WDR81 1738585 1738585       1
  124997.WDR81 1738586 1738594       9
  124997.WDR81 1738595 1738599       5

@GRealesM

GRealesM commented May 29, 2024

Hi Bernat and maintainers,

I was about to open an issue with a similar problem. I'll post it here, hoping it helps.
In my case, I'm trying to plot a region at the beginning of chr11 using this dataset, and I noticed that important genes like IRF7 were missing. After looking at the UCSC browser, I realised the missing region corresponds exactly to the stretch where "chr11_KI270832v1_alt" is annotated. From this 8-year-old question on Bioconductor, I gathered that the problem arises when genes annotated in more than one place are filtered out.
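
One way to check whether a given gene is affected by this kind of filtering is sketched below. It is a minimal sketch, not a confirmed diagnosis: it assumes the genes are lost in GenomicFeatures::genes(), which by default drops genes annotated on more than one sequence or strand, and it has not been verified that karyoploteR relies on that call. The variable names (g.single, g.all, dropped) are illustrative, and the Entrez ID 124997 for WDR81 is taken from the geneTrack() call earlier in this thread.

```r
# Sketch only: assumes the gene is lost in GenomicFeatures::genes().
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# Default behaviour: genes annotated on more than one sequence
# or strand are dropped from the result.
g.single <- genes(txdb)

# Keep every gene; the result is a GRangesList with one element per gene.
g.all <- genes(txdb, single.strand.genes.only = FALSE)

# Gene IDs that are present only in the unfiltered set.
dropped <- setdiff(names(g.all), names(g.single))

# Was WDR81 (Entrez ID 124997) dropped by the default filtering?
"124997" %in% dropped

# If so, list the sequences it is annotated on
# (for example a primary chromosome plus an alt contig).
if ("124997" %in% dropped) {
  unique(as.character(seqnames(g.all[["124997"]])))
}
```

If the gene of interest shows up in dropped, the alt-contig annotation is a plausible reason for it being missing from the plot. If it does not, the cause probably lies elsewhere.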

I tried following their advice and using keepStandardChromosomes(TxDb.Hsapiens.UCSC.hg38.knownGene), and the resulting object recovers more genes, including the ones I was missing. The problem is that the code to get the transcripts (e.g. transcriptsBy()) doesn't recover the transcripts for the previously missing genes, which makes their code fail.
I also tried simply applying this to the code suggested in the tutorial (below), but it fails too.
tx <- keepStandardChromosomes(TxDb.Hsapiens.UCSC.hg38.knownGene)
genes.data <- makeGenesDataFromTxDb(txdb = tx, karyoplot = kp)
genes.data <- addGeneNames(genes.data)
genes.data <- mergeTranscripts(genes.data)

Maybe this has a very easy solution, like setting a specific parameter, but I haven't found it yet.
Again, any advice is appreciated.

Guillermo

======================================
`sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Rocky Linux 8.9 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblaso-r0.3.15.so; LAPACK version 3.9.0

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

time zone: GB
tzcode source: system (glibc)

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] org.Hs.eg.db_3.18.0 TxDb.Hsapiens.UCSC.hg38.knownGene_3.18.0 GenomicFeatures_1.54.4 AnnotationDbi_1.64.1
[5] Biobase_2.62.0 karyoploteR_1.28.0 regioneR_1.34.0 GenomicRanges_1.54.1
[9] GenomeInfoDb_1.38.8 IRanges_2.36.0 S4Vectors_0.40.2 BiocGenerics_0.48.1
[13] magrittr_2.0.3 data.table_1.15.4

loaded via a namespace (and not attached):
[1] DBI_1.2.2 bitops_1.0-7 gridExtra_2.3 biomaRt_2.58.2 rlang_1.1.3
[6] biovizBase_1.50.0 matrixStats_1.3.0 compiler_4.3.3 RSQLite_2.3.6 png_0.1-8
[11] vctrs_0.6.5 ProtGenerics_1.34.0 stringr_1.5.1 pkgconfig_2.0.3 crayon_1.5.2
[16] fastmap_1.2.0 backports_1.5.0 dbplyr_2.5.0 XVector_0.42.0 utf8_1.2.4
[21] Rsamtools_2.18.0 rmarkdown_2.27 bit_4.0.5 xfun_0.44 zlibbioc_1.48.2
[26] cachem_1.1.0 jsonlite_1.8.8 progress_1.2.3 blob_1.2.4 DelayedArray_0.28.0
[31] BiocParallel_1.36.0 parallel_4.3.3 prettyunits_1.2.0 cluster_2.1.6 R6_2.5.1
[36] VariantAnnotation_1.48.1 stringi_1.8.4 RColorBrewer_1.1-3 bezier_1.1.2 rtracklayer_1.62.0
[41] rpart_4.1.23 knitr_1.46 Rcpp_1.0.12 SummarizedExperiment_1.32.0 R.utils_2.12.3
[46] base64enc_0.1-3 Matrix_1.6-5 nnet_7.3-19 tidyselect_1.2.1 dichromat_2.0-0.1
[51] rstudioapi_0.16.0 abind_1.4-5 yaml_2.3.8 codetools_0.2-20 curl_5.2.1
[56] lattice_0.22-6 tibble_3.2.1 KEGGREST_1.42.0 evaluate_0.23 foreign_0.8-86
[61] BiocFileCache_2.10.2 xml2_1.3.6 Biostrings_2.70.3 pillar_1.9.0 filelock_1.0.3
[66] MatrixGenerics_1.14.0 checkmate_2.3.1 generics_0.1.3 RCurl_1.98-1.14 ensembldb_2.26.0
[71] hms_1.1.3 ggplot2_3.5.1 munsell_0.5.1 scales_1.3.0 glue_1.7.0
[76] lazyeval_0.2.2 Hmisc_5.1-2 tools_4.3.3 BiocIO_1.12.0 BSgenome_1.70.2
[81] GenomicAlignments_1.38.2 XML_3.99-0.16.1 grid_4.3.3 colorspace_2.1-0 GenomeInfoDbData_1.2.11
[86] htmlTable_2.4.2 restfulr_0.0.15 Formula_1.2-5 cli_3.6.2 rappdirs_0.3.3
[91] fansi_1.0.6 S4Arrays_1.2.1 dplyr_1.1.4 AnnotationFilter_1.26.0 gtable_0.3.5
[96] R.methodsS3_1.8.2 digest_0.6.35 SparseArray_1.2.4 rjson_0.2.21 htmlwidgets_1.6.4
[101] R.oo_1.26.0 memoise_2.0.1 htmltools_0.5.8.1 lifecycle_1.0.4 httr_1.4.7
[106] bit64_4.0.5 bamsignals_1.34.0 `

@bernatgel bernatgel self-assigned this May 29, 2024
@bernatgel bernatgel added the bug label May 29, 2024
@bernatgel
Owner

Hi @h20gg702 and Guillermo @GRealesM

Thanks for pointing this out and for the additional information provided by Guillermo.

It seems like a bug in karyoploteR, so I'll have to take a look at it.

I'll get back to you as soon as I have some more info.

Bernat
