From f0c4aadf2fe7ffee1f27004d3509499b3270f0d1 Mon Sep 17 00:00:00 2001
From: Cheng Yang
Date: Wed, 28 Aug 2024 00:21:22 +0800
Subject: [PATCH] Update README.md

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 4f0798f..1020045 100644
--- a/README.md
+++ b/README.md
@@ -53,13 +53,16 @@ Awesome-LLMs-meet-genomes is a collection of state-of-the-art, novel, exciting L
 #### General
 | Year | Title | Venue | Paper | Code |
 | ---- | ------------------------------------------------------------ | :-----: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| 2024.07 | **OmniGenome: Aligning RNA Sequences with Secondary Structures in Genomic Foundation Models** | arXiv | [link](https://arxiv.org/abs/2407.11242) | [link](https://github.com/yangheng95/OmniGenomeBench) |
 | 2024.07 | **Scorpio : Enhancing Embeddings to Improve Downstream Analysis of DNA sequences** | bioRxiv | [link](https://www.biorxiv.org/content/10.1101/2024.07.19.604359v1.abstract) | [link](https://github.com/EESI/Scorpio) |
 | 2024.07 | **DNA language model GROVER learns sequence context in the human genome (可用于蛋白质-DNA结合预测任务)** | Nature Machine Intelligence | [link](https://doi.org/10.1038/s42256-024-00872-0) | [link](https://doi.org/10.5281/zenodo.8373202) [tutorials](https://doi.org/10.5281/zenodo.8373158) |
+| 2024.05 | **Are Genomic Language Models All You Need? Exploring Genomic Language Models on Protein Downstream Tasks** | bioRxiv | [link](https://www.biorxiv.org/content/10.1101/2024.05.20.594989v1) | [link](https://huggingface.co/InstaDeepAI/nucleotide-transformer-v2-50m-3mer-multi-species) |
 | 2024.04 | **DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome** | ICLR'24 | [link](https://openreview.net/pdf?id=oMLQB4EZE1) | [link](https://github.com/MAGICS-LAB/DNABERT_2) |
 | 2024.04 | **Species-aware DNA language models capture regulatory elements and their evolution** | Genome Biology | [link](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03221-x) | [link](https://github.com/gagneurlab/SpeciesLM) |
 | 2024.02 | **GenomicLLM: Exploring Genomic Large Language Models: Bridging the Gap between Natural Language and Gene Sequences** | bioRxiv | [link](https://www.biorxiv.org/content/10.1101/2024.02.26.581496v1) | - |
 | 2024.02 | **Sequence modeling and design from molecular to genome scale with Evo** | bioRxiv | [link](https://www.biorxiv.org/content/10.1101/2024.02.27.582234v1) | [link](https://github.com/evo-design/evo) |
 | 2024.01 | **ProkBERT family: genomic language models for microbiome applications** | Frontiers in Microbiology | [Link](https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2023.1331233/full) | [link](https://github.com/nbrgppcu/prokbert) |
+| 2023.09 | **The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics** | bioRxiv | [link](https://www.biorxiv.org/content/10.1101/2023.01.11.523679v3) | [link](https://github.com/instadeepai/nucleotide-transformer) |
 | 2023.08 | **DNAGPT: A Generalized Pre-trained Tool for Versatile DNA Sequence Analysis Tasks** | bioRxiv | [link](https://www.bioRxiv.org/content/10.1101/2023.07.11.548628v2) | [link](https://github.com/TencentAILabHealthcare/DNAGPT) |
 | 2023.08 | **Understanding the Natural Language of DNA using Encoder-Decoder Foundation Models with Byte-level Precision** | arxiv | [link](https://arxiv.org/abs/2311.02333) | [link](https://github.itap.purdue.edu/Clan-labs/ENBED) |
 | 2023.07 | **EpiGePT: a Pretrained Transformer model for epigenomics** | bioRxiv | [link](https://www.biorxiv.org/content/10.1101/2023.07.15.549134v2) | [link](https://github.com/ZjGaothu/EpiGePT) |