BrAPI Wittenberg Hackathon 2024 notes #372

gabrielkg opened this issue Apr 12, 2024 · 7 comments


Notes for work done as part of the BrAPI Hackathon held April 15-19, 2024 in Wittenberg, Germany. Pretzel team attending remotely.

Aim
Expand support for BrAPI within Pretzel. As a starting point, support the /allelematrix endpoint in the backend and serve its results to the frontend.

gabrielkg changed the title from "BrAPI Hackathon 2024 notes" to "BrAPI Wittenberg Hackathon 2024 notes" on Apr 12, 2024
@Don-Isdale

Don-Isdale commented Apr 15, 2024

2024Apr12 :

https://github.com/solgenomics/BrAPI-js

Some notes for the BrAPI client side (Pretzel frontend)

https://github.com/solgenomics/BrAPI-js#available-brapi-methods
Available BrAPI Methods

| BrAPI.js Method | BrAPI Call (<= v1.3) | BrAPI Call (v2.0) | Default HTTP Method |
| --- | --- | --- | --- |
| node.allelematrices_search(params,...) | /allelematrices-search (>=v1.2) or /allelematrix-search (<v1.2) | | POST |
| node.allelematrices(params,...) | /allelematrices | | GET |

The current BrAPI interface is via these sources :
germinate.js # call the API
germinate-genotype.js # process the data
which are in 2 locations - frontend and backend :
pretzel/frontend/app/utils/
pretzel/lb4app/lb3app/common/utilities/

The allelematrix / allelematrices functions may be added in those files, or possibly separate files depending on how related they are.

I've sketched out some server side functions also. They would utilise these functions to process the data, and use @solgenomics/BrAPI-js to package into BrAPI responses :
pretzel/lb4app/lb3app/common/utilities/vcf-genotype.js

links
germinate.js
germinate-genotype.js
vcf-genotype.js


2024Apr15 :

Overview of API requests and data flows for genotype / germinate / BrAPI

The Genotype Table is implemented by
manage-genotype.js
with subcomponent
matrix-view.js
It sends API requests using the library functions in vcf-feature.js
which in turn use auth.js
These requests are received by block.js
which uses
vcf-genotype.js
to fulfill the requests, via childProcess() defined in the library
child-process.js
and a script which wraps bcftools commands :
vcfGenotypeLookup.bash

Germinate is added as a Data Source in the frontend; this is parallel to the Pretzel server data source which is the primary server for Pretzel requests.
Requests to Germinate are sent using
germinate.js
and
germinate-genotype.js
which use the
@solgenomics/brapijs package

(This Germinate BrAPI connection has been used both on the frontend and the backend - the first implementation connected from the Pretzel server to Germinate to implement the genotype requests from the Pretzel frontend client).

The additions for the hackathon are : use @solgenomics/BrAPI-js between the Pretzel frontend and the Pretzel backend, to

  • send BrAPI /allelematrix requests from the frontend to the Pretzel backend
  • process the VCF genotype data in the server into the /allelematrix reply format and return that to the client
  • in the frontend this will share the data path which the Germinate requests use.
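As a rough illustration of the second point, here is a minimal sketch of reshaping genotype rows into an allelematrix-style reply. It assumes the VCF data has already been reduced to tab-separated rows (CHROM, POS, ID, REF, ALT, then one GT per sample). The function name `vcfRowsToAlleleMatrix` is hypothetical, and the result field names follow my reading of the BrAPI v2.1 /allelematrix response, so they should be checked against the spec.

```javascript
// Sketch only: the function name is hypothetical and the result field names
// are assumptions based on the BrAPI v2.1 /allelematrix response.

/**
 * Convert tab-separated genotype rows into an allelematrix-style result.
 * Each row is assumed to be: CHROM, POS, ID, REF, ALT, then one GT per sample.
 */
function vcfRowsToAlleleMatrix(text, sampleNames) {
  const rows = text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => line.split('\t'));

  const variantDbIds = [];
  const dataMatrix = [];
  // REF and ALT (columns 4 and 5) are skipped in this sketch.
  for (const [chrom, pos, id, , , ...genotypes] of rows) {
    if (genotypes.length !== sampleNames.length) {
      throw new Error(`expected ${sampleNames.length} genotypes at ${chrom}:${pos}`);
    }
    // VCF uses '.' for a missing ID; fall back to a position-based name.
    variantDbIds.push(id && id !== '.' ? id : `${chrom}:${pos}`);
    // One row per variant, one column per sample (callset).
    dataMatrix.push(genotypes);
  }

  return {
    result: {
      callSetDbIds: sampleNames,
      variantDbIds,
      sepPhased: '|',
      sepUnphased: '/',
      unknownString: '.',
      dataMatrices: [{
        dataMatrixAbbreviation: 'GT',
        dataMatrixName: 'Genotype',
        dataType: 'string',
        dataMatrix,
      }],
    },
  };
}
```

The matrix is kept row-per-variant and column-per-sample, so `callSetDbIds` and `variantDbIds` index the columns and rows respectively.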

Details

Pretzel frontend GUI client

vcf-feature.js :
vcfGenotypeLookup() ->
auth.js :
genotypeSamples()
vcfGenotypeLookup()

components/panel/ manage-genotype.js :
vcfGenotypeSamplesDataset() -> auth.genotypeSamples()

API endpoints :

    const featuresCountEndpoints = [
      'Blocks/blockFeaturesCount',
      'Blocks/blockFeatureLimits',
    ];
    const vcfGenotypeEndpoints = [
      'Blocks/genotypeSamples',
      'Blocks/vcfGenotypeLookup',
    ];

Pretzel Server :

common/models/block.js :

Block. :

  blockFeaturesCounts() -> vcfGenotypeFeaturesCounts()
  germinateGenotypeSamples() -> germinateGenotypeSamples()
  vcfGenotypeLookup() -> vcfGenotypeLookup() or germinateGenotypeLookup()
  vcfGenotypeSamples() -> childProcess( 'vcfGenotypeLookup.bash' )

vcf-genotype.js

vcfGenotypeLookup() -> childProcess( 'vcfGenotypeLookup.bash' )
vcfGenotypeFeaturesCounts() -> vcfGenotypeLookup()
vcfGenotypeFeaturesCountsStatus() ->   childProcess( 'vcfGenotypeLookup.bash' )

Pretzel frontend GUI client (can also be used in Pretzel Server) :

germinate.js

login(username_, password_) {
    ...
    body = {username, password},
    method = 'POST',
    endpoint = 'token';
    ...
    this.fetchEndpoint(endpoint, method, body)

maps() {
  ...
  promise = this.fetchEndpoint(brapi_v + '/maps');

linkagegroups(mapDbId) {
    ...
    this.fetchEndpoint(brapi_v + '/maps/' + mapDbId + '/linkagegroups');

samples(dataset) {
  const samplesP =
  this.fetchEndpoint(brapi_v + '/' + 'callsets/dataset' + '/' + dataset);

callsetsCalls(mapid, callSetDbId, linkageGroupName, start, end, limit_result) {
   endpoint = brapi_v + '/' + 'callsets'  + '/' + callSetDbId + '/calls'
    + '/mapid/' + mapid + intervalParams,
  callsP = this.fetchEndpoint(endpoint);

frontend/app/utils/data/ germinate-genotype.js

function callSetCacheForBlockP(datasetId, scope) {
    germinateGenotypeSamplesP(datasetId, scope)
    .then(samples => callSetCacheForBlock(datasetId, scope));

function germinateGenotypeSamples(datasetId, scope, cb) {
        const name2Id = callSetCacheForBlock(datasetId, scope);

function germinateGenotypeLookup(datasetId, scope, preArgs, nLines, undefined, cb) {
  name2IdP = callSetCacheForBlockP(datasetId, scope);
  ...
        germinate.callsetsCalls(mapid, callSetDbId, linkageGroupName, start, end, nLines)

@Don-Isdale

Don-Isdale commented Apr 18, 2024

Following on from Mahdi's post : we can add those endpoints to the express server.

This is a good exercise for people to get familiarity with the server side of pretzel, including npm install / build / run.
There are 3 endpoints, so a good outcome would be if the Hackathon team could add 1 each.
Don't stress if you hit road-blocks - that's normal with these complex build hierarchies, and it's more valuable to get a feel for the landscape rather than needing to complete the endpoint.

The express test server is @plantinformatics/vcf-genotype-brapi/test/server/
The basic process is :

  • install node and npm (node version 16 or later is OK; using nvm to manage versions is desirable if you are building in various contexts, e.g. install nvm, then nvm install 16 )
  • npm start (can build separately, e.g. as noted in this README.md)
  • send an allelematrix request to the server
    e.g. test1.json
    you can send the json with Postman, or this bash script which uses curl.

e.g. test server with :

  cd  vcf-genotype-brapi/test/clients/bash/
  source allelematrix.bash
  serverUrl=localhost:3000
  sendRequest allelematrix test1.json

The server has port 3000 hard-wired; another useful task would be to add configuration parameters for port and vcfDir, perhaps others, using command-line arg parsing or environment variables,
e.g. $API_PORT_GT_BRAPI, $vcfDir
via e.g. process.env.API_PORT_GT_BRAPI || 3000
related : API_PORT_EXT in pretzel / :
server.js
config.local.js
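A minimal sketch of the environment-variable option, assuming the variable names suggested above ($API_PORT_GT_BRAPI, $vcfDir). The helper name `readServerConfig` and the default directory './vcf' are placeholders, not taken from the test server :

```javascript
// Hypothetical helper: read server settings from environment variables,
// falling back to defaults. The './vcf' default is an assumed placeholder.
function readServerConfig(env = process.env) {
  const rawPort = env.API_PORT_GT_BRAPI;
  // Keep the currently hard-wired port 3000 as the default.
  const port = rawPort === undefined || rawPort === '' ? 3000 : Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid API_PORT_GT_BRAPI: ${rawPort}`);
  }
  const vcfDir = env.vcfDir || './vcf';
  return { port, vcfDir };
}
```

e.g. `const { port, vcfDir } = readServerConfig();` at server start-up, then pass `port` to the server's listen call.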

At this stage we just want to receive and trace the request parameters, and send back a static json.
We'll use vcf-feature.js for requesting and parsing the VCF, to serve the /allelematrix request.
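A sketch of that first stage : receive the request, trace the parameters, and reply with a static json. To keep it dependency-free this uses node's built-in http module rather than express, so it is not the test server's actual code. The names `handleAllelematrix`, `createServer` and `STATIC_REPLY` are hypothetical, and the shape of the static reply is an assumption.

```javascript
const http = require('http');

// Placeholder reply: an empty allelematrix-style result (field names assumed).
const STATIC_REPLY = {
  metadata: {},
  result: { callSetDbIds: [], variantDbIds: [], dataMatrices: [] },
};

/** Trace the request parameters and return the static reply. */
function handleAllelematrix(params, log = console.log) {
  log('allelematrix request', JSON.stringify(params));
  return STATIC_REPLY;
}

/** Build (but do not start) an HTTP server which exposes the handler. */
function createServer() {
  return http.createServer((req, res) => {
    if (req.method !== 'POST' || !req.url.endsWith('/allelematrix')) {
      res.writeHead(404);
      res.end();
      return;
    }
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      let params;
      try {
        params = body ? JSON.parse(body) : {};
      } catch (err) {
        res.writeHead(400, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ error: 'invalid JSON' }));
        return;
      }
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(handleAllelematrix(params)));
    });
  });
}
```

e.g. `createServer().listen(3000)`, then send test1.json with the bash client above. Keeping `handleAllelematrix()` separate from the HTTP plumbing means the same function could be mounted as an express route.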

@Don-Isdale

Don-Isdale commented Apr 18, 2024

Also the data models for BrAPI and VCF are fairly different, and we need to think about how we will map the request parameters to the VCF filename + chr name + sample names.

These parameters I think have simple interpretations :

| BrAPI parameter | VCF parameter |
| --- | --- |
| variantDbIds | SNP name |
| positionRanges | Chr name + reference position [start, end] |
| markerDbId | Marker name |

Maybe someone can think about how these relate :

| BrAPI parameter | VCF parameter |
| --- | --- |
| markerprofileDbId | |
| callSetDbIds | Sample name |
| variantSetDbId | |

The goal is to be able to translate the BrAPI /allelematrix request parameters to bcftools request parameters.
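A possible starting point for that translation, covering only the parameters with a simple interpretation. This is a sketch under assumptions : the function name `allelematrixToBcftoolsArgs` is hypothetical, positionRanges are assumed to arrive as 'chr:start-end' strings and are passed through without any coordinate adjustment, and the VCF path is passed in separately because the mapping from variantSetDbId to a file is still open.

```javascript
// Hypothetical mapping from a BrAPI /allelematrix request body to the
// argument list for `bcftools query`. Identifiers are validated because
// they end up inside a bcftools filter expression.
const SAFE_ID = /^[\w.:-]+$/;
const SAFE_RANGE = /^[\w.]+:\d+-\d+$/;
const QUERY_FORMAT = '%CHROM\\t%POS\\t%ID\\t%REF\\t%ALT[\\t%GT]\\n';

function allelematrixToBcftoolsArgs(request, vcfPath) {
  const { positionRanges = [], callSetDbIds = [], variantDbIds = [] } = request;
  for (const id of [...callSetDbIds, ...variantDbIds]) {
    if (!SAFE_ID.test(id)) throw new Error(`unsupported identifier: ${id}`);
  }
  for (const range of positionRanges) {
    if (!SAFE_RANGE.test(range)) throw new Error(`unsupported range: ${range}`);
  }

  const args = ['query'];
  // positionRanges -> Chr name + reference position [start, end]
  if (positionRanges.length) args.push('--regions', positionRanges.join(','));
  // callSetDbIds -> Sample names
  if (callSetDbIds.length) args.push('--samples', callSetDbIds.join(','));
  // variantDbIds -> SNP names, matched against the VCF ID column
  if (variantDbIds.length) {
    args.push('--include', variantDbIds.map((id) => `ID="${id}"`).join(' || '));
  }
  args.push('--format', QUERY_FORMAT, vcfPath);
  return args;
}
```

The result is an argument array rather than a command string, so it can be handed to something like `child_process.execFile('bcftools', args)` without shell quoting. Note that --regions needs an indexed VCF.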

@Don-Isdale

Don-Isdale commented Apr 18, 2024

Design diagram showing the Genotype web API requests : Custom API and BrAPI

Custom API

sequenceDiagram
  participant C as Pretzel Frontend GUI
  participant S as Pretzel Server

  C->>S: Blocks/vcfGenotypeLookup
  create participant B as bcftools
  S->>B: bcftools query or view
  destroy B
  B->>S: VCF result
  S->>C: VCF result

BrAPI

sequenceDiagram

  participant C as Pretzel Frontend GUI
  participant S as Pretzel Server

  C->>S: brapi/v2/search/allelematrix
  create participant B as bcftools
  S->>B: bcftools query or view 
  destroy B
  B->>S: VCF result
  S->>C: allelematrix result


For editing those mermaid diagrams: doc
and a post re. including those in github markdown comments.

@Don-Isdale

Status :

  • factored out, to a separate npm package repository, the library functions which request genotype data from bcftools and process the VCF data.
    This enables them to be used either in the Pretzel Server or in a separate micro-server

  • Using BrAPI from the frontend GUI enables collation of datasets from additional data sources.

  • created the framework of a BrAPI server (node.js / express) for /allelematrix,
    with test data for VCFs and client requests, and a simple test client.

  • The commits for this are :
    plantinformatics/vcf-genotype-brapi@feature/upgradeFrontend...develop

  • The hackathon broadened the team's capability, with team members working across multiple parts of the development context and tools

  • focusing on how to use BrAPI in this use case has yielded a clear documented understanding of :

    • variations between versions v1 and v2 and endpoints
    • returned data types - v2 dataMatrix is more compact than v1 result
    • which tools support those version variations, in particular : Gigwa and BrAPI.js
    • the sequence of steps in interacting with the API, and the required parameter data values

Using BrAPI from the frontend GUI enables collation of datasets from additional data sources :

classDiagram

    PretzelFrontendGUI <|-- PretzelServer
    PretzelFrontendGUI <|-- Gigwa
    PretzelFrontendGUI <|-- local_VCF_Genotype_Brapi_micro_server
    PretzelFrontendGUI <|-- DivBrowse
    PretzelFrontendGUI : multiple BrAPI Data Sources
    class Gigwa{
    }


@Don-Isdale

Don-Isdale commented May 8, 2024

Achievements to date:

  • Ability to connect Pretzel web app GUI to a BrAPI server, a self-hosted Gigwa instance, as an additional data source.
  • request server information : /serverinfo
  • request the list of datasets : /search/variantsets
  • request the list of chromosomes of the dataset selected by the user : /search/references
  • request the list of samples of a selected dataset : /search/samples
  • request genotype values of a selected dataset and chromosome, for selected samples, and in a selected interval : /allelematrix
  • request the positions of variants received via /allelematrix : /search/variants

  • 2024Apr23

Ability to connect Pretzel web app GUI to a BrAPI server, a self-hosted Gigwa instance, as an additional data source.
Screen-shot shows the 'New Datasource' dialog, with type 'BrAPI' selected :

Screenshot from 2024-04-23 10-05-55

Display after connecting to a BrAPI datasource :

Screenshot from 2024-04-23 11-23-02

API requests which are sent, based on existing connection type 'Germinate'.
API interaction : /token, /maps, /allelematrices :

Screenshot from 2024-04-23 14-34-00


  • 2024Apr30

Result of requesting datasets from BrAPI server, and after the user selects one dataset, requesting the list of chromosomes of the dataset :

Screenshot from 2024-04-30 09-21-38


  • 2024May02

Result of requesting samples of a selected dataset from BrAPI server :

Screenshot from 2024-05-02 04-33-42

API interaction : request samples of a selected dataset :

Screenshot from 2024-05-02 09-41-00


  • 2024May06

  • screen1 : API interaction : request /allelematrix of a selected dataset and chromosome, for selected samples, and in a selected interval.
    The dataMatrices : dataMatrix shows the genotype values, in numeric form.

  • screen2 : Display of genotype values from /allelematrix result

Screenshot from 2024-05-06 09-20-44


  • 2024May07

Display of SNP / Variant additionalInfo from /allelematrix result :

Screenshot from 2024-05-07 12-15-07

API interaction : /allelematrix result, showing SNP / Variant additionalInfo in result.data additionalInfo and alternateBases and referenceBases :

Screenshot from 2024-05-07 12-10-28


Next steps:

  • Test connectivity to other BrAPI genotype servers

Source repository branch

The work for the BrAPI Hackathon 2024 was done in :

Pretzel branch : feature/useBrAPI
Library package repositories :


@Don-Isdale

Further configuration of package dependencies for library factored out to @plantinformatics/vcf-genotype-brapi

ac6b894 add plantinformatics dependencies, update package version to 1.0.4
73f37f6 update package version to 1.0.3
