diff --git a/docs/bedboss/tutorials/bedbuncher_tutorial.md b/docs/bedboss/tutorials/bedbuncher_tutorial.md
index f32c8b8..3b18380 100644
--- a/docs/bedboss/tutorials/bedbuncher_tutorial.md
+++ b/docs/bedboss/tutorials/bedbuncher_tutorial.md
@@ -1 +1,29 @@
-### 🚧 Tutorial in progress! Stay tuned for updates. We're working hard to bring you valuable content soon!
\ No newline at end of file
+### BEDbuncher
+
+Bedbuncher is used to create a bedset of bed files in the bedbase database.
+
+### 1) Create bedbase config file
+### 2) Create pep with bed file record identifiers.
+To do so, you need to create a PEP with the following fields: sample_name (where sample_name is record_identifier), or `sample_name` + `record_identifier`
+e.g. sample_table:
+
+| sample_name | record_identifier |
+|----------|----------|
+| sample1 | asdf3215f34 |
+| sample2 | a23452f34tf |
+
+### 3) Run bedboss bunch
+#### From command line
+```bash
+bedboss bunch \
+    --bedbase-config path/to/bedbase_config.yaml \
+    --bedset-name bedset1 \
+    --pep path/to/pep.yaml \
+    --bedset-pep bedset_pep.yaml \
+    --cache-path CACHE_PATH
+```
+
+### Run bedboss bunch from within Python
+```python
+
+```
\ No newline at end of file
diff --git a/docs/bedboss/tutorials/bedindex_tutorial.md b/docs/bedboss/tutorials/bedindex_tutorial.md
index f32c8b8..1e58111 100644
--- a/docs/bedboss/tutorials/bedindex_tutorial.md
+++ b/docs/bedboss/tutorials/bedindex_tutorial.md
@@ -1 +1,21 @@
-### 🚧 Tutorial in progress! Stay tuned for updates. We're working hard to bring you valuable content soon!
\ No newline at end of file
+### Indexing to qdrant database
+
+### 1. Create bedbase config file
+### 2. Run bedboss index
+
+#### From command line
+```bash
+bedboss index --bedbase-config path/to/bedbase_config.yaml
+```
+
+After running this command, all files that are in the database and weren't indexed will be indexed to the qdrant database.
+ + +#### From within Python +```python +from bedboss.qdrant_index import add_to_qdrant + +add_to_qdrant( + bedbase_config="path/to/bedbase_config.yaml" +) +``` \ No newline at end of file diff --git a/docs/bedboss/tutorials/tutorial_insert.md b/docs/bedboss/tutorials/tutorial_insert.md index 2073bd3..6105a38 100644 --- a/docs/bedboss/tutorials/tutorial_insert.md +++ b/docs/bedboss/tutorials/tutorial_insert.md @@ -1,13 +1,13 @@ ## Bedboss insert -Bedboss insert is intended to run each sample in provided PEP. -PEP can be provided as a file or as a registry path of the PEPhub. +Bedboss insert is designed to process each sample in the provided PEP. +The PEP can be provided either as a path to config file or as a registry path of the PEPhub. ### Step 1: Install all dependencies -First you have to install bedboss and check if all requirements are satisfied. -To do so, you can run next command: +First, you have to install bedboss and check if all requirements are satisfied. +To do so, you can run the following command: ```bash bedboss requirements-check ``` @@ -15,7 +15,7 @@ If requirements are not satisfied, you will see the list of missing packages. ### Step 2: Create bedconf.yaml file To run bedboss insert, you need to create a bedconf.yaml file with configuration. -Detail instructions are in the configuration section. +Detailed instructions are in the configuration section. ### Step 3: Create PEP with bed files. BEDboss PEP should contain next fields: sample_name, input_file, input_type, genome. @@ -33,14 +33,14 @@ bedboss insert \ ``` -Above command will run bedboss on the bed file and create a bedstat file in the output directory. +Above command will run bedboss on the bed file and create a file with statistics in the output directory. It contains only required parameters. For more details, please check the usage section. -By default, results will be uploaded only to postgres database. 
-- To upload results to PEPhub, you need to make `databio` org available on GitHub, then login to PEPhub, and add `--upload-pephub` flag to the command. -- To upload results to Qdrant, you need to add `--upload-qdrant` flag to the command. -- To upload actual files to s3, you need to add `--upload-s3` flag to the command, and Before uploading you have to set up all necessary env vars: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_ENDPOINT_URL. -- To create bedset of provided pep files, you need to add `--create-bedset` flag to the command. +By default, results will be uploaded only to the PostgreSQL database. +- To upload results to PEPhub, you need to make the `databio` org available on GitHub, then login to PEPhub, and add the `--upload-pephub` flag to the command. +- To upload results to Qdrant, you need to add the `--upload-qdrant` flag to the command. +- To upload actual files to S3, you need to add the `--upload-s3` flag to the command, and before uploading, you have to set up all necessary environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_ENDPOINT_URL. +- To create a bedset of provided pep files, you need to add the `--create-bedset` flag to the command. --- diff --git a/docs/bedhost/README.md b/docs/bedhost/README.md index a6a53bf..f3235fa 100644 --- a/docs/bedhost/README.md +++ b/docs/bedhost/README.md @@ -1,4 +1,24 @@ -# BEDhost API guide +

bedhost

+ +[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) +[![Github badge](https://img.shields.io/badge/source-github-354a75?logo=github)](https://github.com/databio/bedhost) + + +`bedhost` is a Python FastAPI module for the API that powers BEDbase +It needs a path to the *bedbase configuration file*, which can be provided either via `-c`/`--config` argument or read from `$BEDBASE_CONFIG` environment variable. + +--- + +**Deployed public instance**: https://bedbase.org/ + +**Documentation**: https://docs.bedbase.org/bedhost + +**API**: https://api.bedbase.org/ + +**Source Code**: https://github.com/databio/bedhost/ + +--- + ## Introduction diff --git a/docs/bedhost/changelog.md b/docs/bedhost/changelog.md index 7de2f4d..b17414d 100644 --- a/docs/bedhost/changelog.md +++ b/docs/bedhost/changelog.md @@ -2,6 +2,13 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) and [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) format. +## [0.3.0] -- 2023-03-01 +### change +- switch to pydantic2 +- updated requirements +- updated docs + + ## [0.2.0] -- 2023-10-17 - remove all graphql - remove local static hosting of UI diff --git a/docs/bedhost/dev-guide.md b/docs/bedhost/dev-guide.md new file mode 100644 index 0000000..1c8899c --- /dev/null +++ b/docs/bedhost/dev-guide.md @@ -0,0 +1,35 @@ +# Developer Guide + +## Introduction + +### Data types + +BEDbase stores two types of data, which we call *records*. They are 1. BEDs, and 2. BEDsets. BEDsets are simply collections of BEDs. Each record in the database is either a BED or a BEDset. + +### Endpoint organization + +The endpoints are divided into 3 groups: + +1. `/bed` endpoints are used to interact with metadata for BED records. +2. `/bedset` endpoints are used to interact with metadata for BEDset records. +3. `/objects` endpoints are used to download metadata and get URLs to retrieve the underlying data itself. 
These endpoints implement the [GA4GH DRS standard](https://ga4gh.github.io/data-repository-service-schemas/).
+
+Therefore, to get information and statistics about BED or BEDset records, or what is contained in the database, look through the `/bed` and `/bedset` endpoints. But if you need to write a tool that gets the actual underlying files, then you'll need to use the `/objects` endpoints. The type of identifier used in each case differs.
+
+## Record identifiers vs. object identifiers
+
+Each record has an identifier. For example, `eaf9ee97241f300f1c7e76e1f945141f` is a BED identifier. You can use this identifier for the metadata endpoints. To download files, you'll need something slightly different -- you need an *object identifier*. This is because each BED record includes multiple files, such as the original BED file, the BigBed file, analysis plots, and so on. To download a file, you will construct what we call the `object_id`, which identifies the specific file.
+
+## How to construct object identifiers
+
+Object IDs take the form `<record_type>.<record_identifier>.<object_identifier>`. An example of an object_id for a BED file is `bed.eaf9ee97241f300f1c7e76e1f945141f.bedfile`.
+
+So, you can get information about this object like this:
+
+`GET` [/objects/bed.eaf9ee97241f300f1c7e76e1f945141f.bedfile](/objects/bed.eaf9ee97241f300f1c7e76e1f945141f.bedfile)
+
+Or, you can get a URL to download the actual file with:
+
+`GET` [/objects/bed.eaf9ee97241f300f1c7e76e1f945141f.bedfile/access/http](/objects/bed.eaf9ee97241f300f1c7e76e1f945141f.bedfile/access/http)
+
+
diff --git a/docs/geniml/README.md b/docs/geniml/README.md
index d38773c..d2eac23 100644
--- a/docs/geniml/README.md
+++ b/docs/geniml/README.md
@@ -1,7 +1,10 @@
-#
+

+ +

+

- +

diff --git a/mkdocs.yml b/mkdocs.yml index 621f1b1..3b33496 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -47,6 +47,7 @@ nav: - API guides: - BEDhost API guide: - BEDhost: bedhost/README.md + - Developer Guide: bedhost/dev-guide.md - Changelog: bedhost/changelog.md - BBConf: - BBConf: bbconf/README.md @@ -63,8 +64,8 @@ nav: - BEDboss: - BEDBoss: bedboss/README.md - Tutorial: - - BEDboss-all pipeline: bedboss/tutorials/tutorial_all.md - BEDboss insert: bedboss/tutorials/tutorial_insert.md + - BEDboss-all pipeline: bedboss/tutorials/tutorial_all.md - BEDmaker tutorial: bedboss/tutorials/bedmaker_tutorial.md - BEDqc tutorial: bedboss/tutorials/bedqc_tutorial.md - BEDstat tutorial: bedboss/tutorials/bedstat_tutorial.md