Merge pull request #93 from pepkit/dev
Release v0.11.0
Showing 29 changed files with 19,759 additions and 1,682 deletions.
@@ -2,7 +2,7 @@ name: Run codecov

 on:
   pull_request:
-    branches: [master]
+    branches: [master, dev]

 jobs:
   pytest:
@@ -1,11 +1,22 @@
 # <img src="https://raw.githubusercontent.com/pepkit/geofetch/master/docs/img/geofetch_logo.svg?sanitize=true" alt="geofetch logo" height="70">

-[![PEP compatible](http://pepkit.github.io/img/PEP-compatible-green.svg)](http://pepkit.github.io)
+[![PEP compatible](https://pepkit.github.io/img/PEP-compatible-green.svg)](https://pepkit.github.io)
 ![Run pytests](https://github.com/pepkit/geofetch/workflows/Run%20pytests/badge.svg)
-[![docs-badge](https://readthedocs.org/projects/geofetch/badge/?version=latest)](http://geofetch.databio.org/en/latest/)
+[![docs-badge](https://readthedocs.org/projects/geofetch/badge/?version=latest)](https://geofetch.databio.org/en/latest/)
 [![pypi-badge](https://img.shields.io/pypi/v/geofetch)](https://pypi.org/project/geofetch)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

-`geofetch` is a command-line tool that downloads sequencing data and metadata from GEO and SRA and creates [standard PEPs](http://pep.databio.org/). `geofetch` is hosted at [pypi](https://pypi.org/project/geofetch/) and documentation is hosted at [geofetch.databio.org](http://geofetch.databio.org) (source in the [/docs](/docs) folder).
+`geofetch` is a command-line tool that downloads sequencing data and metadata from GEO and SRA and creates [standard PEPs](https://pep.databio.org/). `geofetch` is hosted at [pypi](https://pypi.org/project/geofetch/). You can convert the result of geofetch into unmapped `bam` or `fastq` files with the included `sraconvert` command.

-You can convert the result of geofetch into unmapped `bam` or `fastq` files with the included `sraconvert` command.
+Key geofetch features:
+
+- Works with GEO and SRA metadata
+- Combines samples from different projects
+- Standardizes output metadata
+- Filters type and size of processed files (from GEO) before downloading them
+- Easy to use
+- Fast execution time
+- Can search GEO to find relevant data
+- Can be used either as a command-line tool or from within Python using an API
+
+For more information, see [complete documentation at geofetch.databio.org](http://geofetch.databio.org) (source in the [/docs](/docs) folder).
@@ -0,0 +1,81 @@
`Finder` is a geofetch class that provides functions to find and retrieve a list of GSEs ([GEO](https://www.ncbi.nlm.nih.gov/geo/) accession numbers) using the NCBI search tool.

### The main features of the geofetch Finder are:
- Find the GEO accession numbers (GSEs) of projects that were uploaded or updated in a certain period of time.
- Use the same filter query as the [GEO DataSets Advanced Search Builder](https://www.ncbi.nlm.nih.gov/gds/advanced).
- Save the list of GSEs to a file (this file can later be used as input to **[geofetch](http://geofetch.databio.org/en/latest/)**).
- Get GSEs more easily and quickly by combining an NCBI filter with a period of time.

___
## Tutorial

0) Initialize the Finder object.
```python
from geofetch import Finder
gse_obj = Finder()

# Optionally, provide a filter string and the maximum number of elements to retrieve
gse_obj = Finder(filter="((bed) OR narrow peak) AND Homo sapiens[Organism]", retmax=10)
```
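
The filter string in step 0 combines terms with `AND`, `OR`, parentheses, and `[Field]` tags. As a minimal sketch, such a string can be assembled from parts before it is passed to `Finder`. The helpers `any_of` and `all_of` are hypothetical names used only for illustration; they are not part of geofetch.

```python
def any_of(*terms: str) -> str:
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND."""
    return " AND ".join(terms)


# Rebuild the filter string used in step 0 from its parts.
query = all_of(any_of("(bed)", "narrow peak"), "Homo sapiens[Organism]")
print(query)
```

The resulting string could then be passed as the `filter` argument shown above.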

1) Get the list of all GSEs in GEO
```python
gse_list = gse_obj.get_gse_all()
```

2) Get the list of GSEs that were uploaded or updated in the last week
```python
gse_list = gse_obj.get_gse_last_week()
```

3) Get the list of GSEs that were uploaded or updated in the last 3 months
```python
gse_list = gse_obj.get_gse_last_3_month()
```

4) Get the list of GSEs that were uploaded or updated in the last *number of days*
```python
# projects that were uploaded in the last 5 days:
gse_list = gse_obj.get_gse_by_day_count(5)
```

5) Get the list of GSEs that were uploaded in a certain period of time
```python
gse_list = gse_obj.get_gse_by_date(start_date="2015/05/05", end_date="2020/05/05")
```
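
The example above passes dates as `YYYY/MM/DD` strings. The following sketch shows one way to compute such a pair of strings for a window ending on a given day, using only the standard library. The helper `date_window` is a hypothetical name, not a geofetch function, and the source does not state whether `get_gse_by_day_count` uses the same window boundaries.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def date_window(days: int, today: Optional[date] = None) -> Tuple[str, str]:
    """Return (start, end) as YYYY/MM/DD strings for a window of `days` days ending today."""
    end = today or date.today()
    start = end - timedelta(days=days)
    fmt = "%Y/%m/%d"
    return start.strftime(fmt), end.strftime(fmt)


# A fixed "today" keeps the example reproducible.
start_date, end_date = date_window(5, today=date(2020, 5, 5))
print(start_date, end_date)
```

The two strings could then be supplied as the `start_date` and `end_date` arguments shown in step 5.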

6) Save the last searched list of items to a file
```python
gse_obj.generate_file("path/to/the/file")

# to save a different list, pass it to the function
gse_obj.generate_file("path/to/the/file", gse_list=["123", "124"])
```
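
The source does not show the layout of the file that `generate_file` writes. As a sketch under the assumption of a plain text file with one accession per line, a list can be written and read back as follows. The helpers `write_accessions` and `read_accessions` and the accession values are hypothetical and serve only as illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def write_accessions(path: Path, accessions: List[str]) -> None:
    """Write one accession per line (assumed layout, not confirmed by the source)."""
    path.write_text("\n".join(accessions) + "\n")


def read_accessions(path: Path) -> List[str]:
    """Read accessions back, skipping empty lines."""
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


# Round-trip a small placeholder list through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    gse_file = Path(tmp) / "gse_list.txt"
    write_accessions(gse_file, ["GSE123", "GSE124"])
    loaded = read_accessions(gse_file)

print(loaded)
```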

7) Compare two lists:
```python
new_gse_list = gse_obj.find_differences(list1, list2)
```
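
The source does not say whether `find_differences` returns a one-sided or a symmetric difference. As an illustration of the general idea only, the sketch below keeps the items of a newer list that are missing from an older one, preserving their order. The helper `new_items` and the accession values are hypothetical and are not the geofetch implementation.

```python
from typing import List


def new_items(old: List[str], new: List[str]) -> List[str]:
    """Return items of `new` that do not appear in `old`, in their original order."""
    seen = set(old)
    return [item for item in new if item not in seen]


list1 = ["GSE1", "GSE2", "GSE3"]
list2 = ["GSE2", "GSE3", "GSE4", "GSE5"]
added = new_items(list1, list2)
print(added)
```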

----

More information about GSEs, queries, and IDs:
- https://www.ncbi.nlm.nih.gov/geo/info/geo_paccess.html
- https://newarkcaptain.com/how-to-retrieve-ncbi-geo-information-using-apis-part1/
- https://www.ncbi.nlm.nih.gov/books/NBK3837/#EntrezHelp.Using_the_Advanced_Search_Pag