Merge pull request #14 from GoekeLab/update_links
update download links
cying111 authored Oct 18, 2021
2 parents 0039f2e + ce3fe61 commit 610dc24
Showing 4 changed files with 161 additions and 97 deletions.
20 changes: 2 additions & 18 deletions DATA.md
@@ -1,21 +1,5 @@
### Datasets

---
#### Update (15-07-2021)
Download links are currently unavailable; we are working on restoring them as soon as possible. In the meantime, the unprocessed data (fastq) can be downloaded from ENA: https://www.ebi.ac.uk/ena/browser/view/PRJEB44348
The current data release consists of 93 files that include long read and short read RNA-Seq data from all 5 cell lines. The sample description and download links can be found [here](docs/Sample_information.txt).

---

As the core datasets, we have in total 72 runs for core cell lines using three different Nanopore RNA-Sequencing protocols.

As an initial release, we are providing fastq and bam files. You can sign up for the sg-nex-updates email list to receive notifications about upcoming data releases:

https://groups.google.com/forum/#!forum/sg-nex-updates/join

Please see below for the downloading links:
- fastq: [fastq](https://www.dropbox.com/sh/q098af3xdzfqc72/AAA-UhZGSvmez5pOdZIN2mpRa?dl=0)
- bam: [genomeBam](https://www.dropbox.com/sh/mjzbtp31cgtxato/AACPTouVgMztbArwTP9Yt0zCa?dl=0), [transcriptomeBam](https://www.dropbox.com/sh/cuyicuormo809fx/AAA9ndo8BWvGRjaByWKvrALIa?dl=0)

Detailed information on sample ids and corresponding sample attributes can be found [here](docs/Sample_information.txt).

Notes on data usage: This site provides early access to the SG-NEx data for research. Please note that the data is under publication embargo until the SG-NEx project is published.
**_Notes on data usage_**: This site provides early access to the SG-NEx data. These data can be used in research and publications, but we ask data users to refrain from publishing a systematic comparison that is described in the pre-print until the final manuscript is published. If you are uncertain, please feel free to reach out (https://github.com/GoekeLab/sg-nex-data/#contact). You can sign up for the sg-nex-updates email list to receive notifications about upcoming data releases: https://groups.google.com/forum/#!forum/sg-nex-updates/join. If you use the SG-NEx data in your research, please specify the [release version](https://github.com/GoekeLab/sg-nex-data/#data-download) and cite the pre-print (see [citation](https://github.com/GoekeLab/sg-nex-data/#citing-the-SG-NEx-project)).
47 changes: 41 additions & 6 deletions README.md
@@ -1,4 +1,7 @@
# SG-NEx - The Singapore Nanopore-Expression Project
![The Singapore Nanopore-Expression Project\!](
https://jglaborg.files.wordpress.com/2021/10/sg_nex_textlogo.png)

[![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/GoekeLab/sg-nex-data?color=blue&include_prereleases)](#data-download)

The SG-NEx project is an international collaboration that was initiated at the [Genome Institute of Singapore](https://www.a-star.edu.sg/gis/). The aim of the SG-NEx Project is to generate reference transcriptomes for 5 of the most commonly used cancer cell lines using Nanopore long read RNA-Seq data:

@@ -7,28 +10,52 @@ https://jglaborg.files.wordpress.com/2020/10/sg_nex_design-1.png)

Transcriptome profiling is done using PCR-cDNA sequencing ("PCR-cDNA"), amplification-free cDNA sequencing ("direct cDNA"), direct sequencing of native RNA (“direct RNA”), and short read RNA-Seq. All samples are sequenced with at least 3 high quality replicates. For a subset of samples, we used sequin spike-in RNAs.

## Content

- [Email list](#sign-up-for-data-release-notifications-and-updates)
- [Data Download and Release History](#data-download)
- [Data Processing](#data-processing)
- [Use Cases and Applications](#use-cases-and-applications)
- [Data Access Tutorials](#data-access-tutorials)
- [Contributors](#contributors)
- [Citing the SG-NEx project](#citing-the-sg-nex-project)
- [Contact](#contact)

## Sign up for data release notifications and updates
You can sign up for the sg-nex-updates email list to receive notifications about upcoming data releases:

https://groups.google.com/forum/#!forum/sg-nex-updates/join

## Data Releases
## Data Download

**Pre-Release (v0.1)**

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4159715.svg)](https://doi.org/10.5281/zenodo.4159715)

Data can be downloaded [here](DATA.md)
Notes on data usage: This site provides early access to the SG-NEx data for research. Please note that the data is under publication embargo until the SG-NEx project is published.

_**Notes on data usage**_: This site provides early access to the SG-NEx data. These data can be used in research and publications, but we ask data users to refrain from publishing a systematic comparison that is described in the pre-print until the final manuscript is published. If you are uncertain, please feel free to reach out ([Contact](#contact)).

**Release History**

You can find previous releases here in the [release history](https://github.com/GoekeLab/sg-nex-data/releases)

## Data Processing

We collaborated with [nf-core](https://github.com/nf-core) to develop [nanoseq](https://github.com/nf-core/nanoseq), a standardized pipeline for Nanopore RNA-Seq data processing.

**Reference files**

## Reference files
Details on reference files can be found [here](ANNOTATIONS.md).

## Use Cases and Applications

You can browse a list of articles using the SG-NEx data in research [here](SGNEx_usecases.md)

## Data Access Tutorials

Coming soon! Please refer to [Data Download](#data-download) in the meantime.

## Contributors

**GIS Sequencing Platform and Data Generation**
@@ -38,10 +65,18 @@ Hwee Meng Low, Yao Fei, Sarah Ng, Wendy Soon, CC Khor
Viktoriia Iakovleva, Puay Leng Lee, Lixia Xin, Hui En Vanessa Ng, Jia Min Loo, Xuewen Ong, Hui Qi Amanda Ng, Suk Yeah Polly Poon, Hoang-Dai Tran, Kok Hao Edwin Lim, Huck Hui Ng, Boon Ooi Patrick Tan, Huck-Hui Ng, N.Gopalakrishna Iyer, Wai Leong Tam, Wee Joo Chng, Leilei Chen, Ramanuj DasGupta, Yun Shen Winston Chan, Qiang Yu, Torsten Wüstefeld, Wee Siong Sho Goh

**Statistical Modeling and Data Analytics**
Chen Ying, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke
Ying Chen, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke

## Citing the SG-NEx project

If you use the SG-NEx data in your research, please specify the [release version](#data-download) and cite the pre-print that describes this data resource:

Chen, Ying, et al. "A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines." _bioRxiv_ (2021). doi: https://doi.org/10.1101/2021.04.21.440736

Please see the note on data usage (under [Data Download](#data-download)).

## Contact

Questions about SG-NEx? Please contact [Jonathan Göke](https://www.a-star.edu.sg/gis/our-people/faculty-staff)
Questions about SG-NEx? Please add an entry in the [Discussions Forum](https://github.com/GoekeLab/sg-nex-data/discussions). You can also contact [Jonathan Göke](https://www.a-star.edu.sg/gis/our-people/faculty-staff)

![The Singapore Nanopore-Expression Project\!](https://jglaborg.files.wordpress.com/2020/10/sg_nex_logos-1.png)
24 changes: 24 additions & 0 deletions SGNEx_usecases.md
@@ -0,0 +1,24 @@
### Use Cases and Applications: Research articles using the SG-NEx data (pre-release versions)

This site lists some examples of how the SG-NEx data resource is used in research:

#### Transcript discovery/quantification

- Schulz, Laura, et al. "Direct long-read RNA sequencing identifies a subset of questionable exitrons likely arising from reverse transcription artifacts." _Genome Biology_ 22.1 (2021): 1-12. https://doi.org/10.1186/s13059-021-02411-1
- Annaldasula, Siddharth, Martyna Gajos, and Andreas Mayer. "IsoTV: processing and visualizing functional features of translated transcript isoforms." _Bioinformatics_ (2021). https://doi.org/10.1093/bioinformatics/btab103

#### RNA modifications

- Pratanwanich, Ploy N., et al. "Identification of differential RNA modifications from nanopore direct RNA sequencing with xPore." _Nature Biotechnology_ (2021): 1-9. https://doi.org/10.1038/s41587-021-00949-w
- Hendra, Christopher, et al. "Detection of m6A from direct RNA sequencing using a Multiple Instance Learning framework." _bioRxiv_ (2021). https://doi.org/10.1101/2021.09.20.461055
- Campos, João H., et al. "Direct RNA sequencing reveals SARS-CoV-2 m6A sites and possible differential DRACH motif methylation among variants." _bioRxiv_ (2021). https://doi.org/10.1101/2021.08.24.457397

#### Fusion detection

- Davidson, Nadia M., et al. "JAFFAL: Detecting fusion genes with long read transcriptome sequencing." _bioRxiv_ (2021). https://doi.org/10.1101/2021.04.26.441398

#### Reviews and other use cases

- De Paoli-Iseppi, Ricardo, Josie Gleeson, and Michael B. Clark. "Isoform age-splice isoform profiling using long-read technologies." Frontiers in Molecular Biosciences 8 (2021). https://doi.org/10.3389/fmolb.2021.711733

Please feel free to add more examples by creating a pull request.