add server endpoint for requesting samples filtered by a given haplotype #484

Open
Don-Isdale opened this issue Feb 21, 2025 · 1 comment
@Don-Isdale
Collaborator

Requirements

This endpoint will extend the existing genotypeSamples endpoint, which takes datasetId and scope as parameters and returns a list of sample names.
It will take an additional parameter: a list of SNPs, each identified by position, with a flag for each SNP indicating whether the sample's genotype value should match Ref or Alt. The returned list of samples will be filtered to those samples whose genotype value at each SNP matches the required value.
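
The filtering rule can be sketched in JavaScript as follows. This is only an illustration of the requirement, not the server implementation (which delegates to bcftools). The in-memory `genotypes` table, the function names, and the assumption that a sample matches only when every allele of its VCF-style GT string equals the wanted allele (`0` for Ref, `1` for Alt) are all hypothetical.

```javascript
// Hypothetical data shape: genotypes[sampleName][position] is a VCF-style
// GT string such as '0/0', '1|1' or '0/1'.

// Assumption: a genotype "matches Ref" when every allele is '0' and
// "matches Alt" when every allele is '1'.
function genotypeMatches(gt, matchRef) {
  // Split on the phased '|' or unphased '/' separator.
  const alleles = gt.split(/[|/]/);
  const wanted = matchRef ? '0' : '1';
  return alleles.every((allele) => allele === wanted);
}

// Keep only the samples whose genotype at every requested SNP matches.
// features: [{position: number, matchRef: boolean}, ...]
function filterSamplesByHaplotype(genotypes, features) {
  return Object.keys(genotypes).filter((sample) =>
    features.every(({position, matchRef}) => {
      const gt = genotypes[sample][position];
      return gt !== undefined && genotypeMatches(gt, matchRef);
    })
  );
}
```

A sample with no call recorded at one of the requested positions is excluded in this sketch; whether that is the desired behaviour is a design choice the requirement leaves open.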

@Don-Isdale Don-Isdale self-assigned this Feb 21, 2025
Don-Isdale added a commit that referenced this issue Feb 24, 2025
closes #484, #485, part of #482.

manage-genotype.hbs : add
 input checkbox filterSamplesByHaplotype
 span.badge .snpsInBrushedDomain.length, with tooltip .snpsInBrushedDomain value{_0,s.{ref,alt}}, .featureFiltersCount, .matchRefNumeric.
 use .matchRefNumeric in Sample Filters : Feature / SNP

manage-genotype.js :
 add filterSamplesByHaplotype in .args.userSettings, to enable filtering of available samples by selected SNPs
 add selectedSNPsInBrushedDomain(), snpsInBrushedDomain() to display SNP count badge.
 vcfGenotypeSamplesDataset() : add filterByHaplotype; in this case update .selectedSamples and .selectedSamplesText, otherwise don't update sampleCache.sampleNames and datasetStoreSampleNames().
  vcfGenotypeSamples(): don't throttle if .filterSamplesByHaplotype - it is valid for the user to repeat the request after changing the selected SNPs.
  ensureSamplesForDatasetTabEffect() : if .filterSamplesByHaplotype then request samples, regardless of .vcfGenotypeSamplesText being already defined.

feature.js : add matchRefNumeric().

auth.js : genotypeSamples() : add param filter, replacing options which is not required.
block.js :
  genotypeSamples() : add param filter.
  vcfGenotypeSamples() : add param filter, use vcfGenotypeSamplesFiltered() when filter.

vcfGenotypeLookup.bash :
  argVal : recognise command-line parameter GT=gtMatch, not added to preArgs.
   bcftoolsCommand() : add filter_samples command, implemented as bcftools query | grep gtMatch.
@Don-Isdale
Collaborator Author

Changes Implemented

branch feature/vcfDownload

e4c0dbe Find Samples from selected haplotype

d353ac2 drop exports. in child-process

7c87f66 add vcfGenotypeSamplesFiltered
84400c4 update dependency and version

46c97c3 update dependency version and package-lock in lb4app

The commit e4c0dbe also includes the frontend GUI changes identified in the sibling issue #485.

Test results

After these changes, a genotypeSamples request with a filter parameter returned a reduced list of sample names:

http://localhost:3000/api/Blocks/genotypeSamples?id=63184b7d4a9ec986e3530ed7&datasetId=..._samples_XT_exomeIDs&scope=1A&filter%5Bfeatures%5D%5B0%5D%5Bposition%5D=3904729&filter%5Bfeatures%5D%5B0%5D%5BmatchRef%5D=false&filter%5Bfeatures%5D%5B1%5D%5Bposition%5D=1337345&filter%5Bfeatures%5D%5B1%5D%5BmatchRef%5D=true

id=63184b7d4a9ec986e3530ed7
datasetId=..._samples_XT_exomeIDs
scope=1A
filter%5Bfeatures%5D%5B0%5D%5Bposition%5D=3904729
filter%5Bfeatures%5D%5B0%5D%5BmatchRef%5D=false
filter%5Bfeatures%5D%5B1%5D%5Bposition%5D=1337345
filter%5Bfeatures%5D%5B1%5D%5BmatchRef%5D=true
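
For reference, a query string in the same bracket notation as the parameters listed above can be produced from a nested filter object. The sketch below uses the standard `URLSearchParams` class; the helper names `toBracketParams` and `genotypeSamplesQuery` are hypothetical, and this is not necessarily how the frontend builds the request.

```javascript
// Flatten a nested object / array into [name, value] pairs using bracket
// notation, e.g. {features: [{position: 1}]} with prefix 'filter' gives
// ['filter[features][0][position]', '1'].
function toBracketParams(value, prefix, out = []) {
  if (value !== null && typeof value === 'object') {
    for (const [key, child] of Object.entries(value)) {
      toBracketParams(child, `${prefix}[${key}]`, out);
    }
  } else {
    out.push([prefix, String(value)]);
  }
  return out;
}

// Build the query string for a genotypeSamples request.
// URLSearchParams percent-encodes '[' and ']' as %5B and %5D.
function genotypeSamplesQuery({id, datasetId, scope, filter}) {
  const params = new URLSearchParams({id, datasetId, scope});
  for (const [name, value] of toBracketParams(filter, 'filter')) {
    params.append(name, value);
  }
  return params.toString();
}
```

Calling `genotypeSamplesQuery()` with a `filter` of `{features: [{position: 3904729, matchRef: false}, {position: 1337345, matchRef: true}]}` yields the four `filter%5Bfeatures%5D...` parameters shown above, after the `id`, `datasetId` and `scope` parameters.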

{"text":{"text":"ExomeCapture-...3928\n"}}

This is an early test result; in later versions the response was changed to use a single level of {text : } wrapper.
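
A client that needs the sample names as an array could unwrap the response along these lines. This is a minimal sketch, assuming the payload is newline-separated sample names as in the result above; the function name `sampleNamesFromResponse` is hypothetical, and it accepts both the early doubly-wrapped form and the later single wrapper.

```javascript
// Convert a genotypeSamples response into an array of sample names.
// Accepts {text: 'a\nb\n'} as well as the early {text: {text: 'a\nb\n'}}.
function sampleNamesFromResponse(response) {
  let text = response.text;
  // Early versions nested the payload one level deeper.
  if (text !== null && typeof text === 'object') {
    text = text.text;
  }
  // Split on newlines and drop the empty entry left by the trailing '\n'.
  return String(text ?? '')
    .split('\n')
    .filter((name) => name !== '');
}
```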

Screenshot after above changes

Image

Don-Isdale added a commit that referenced this issue Feb 25, 2025
part of #484.
manage-genotype.js :
  samples() : if .filterSamplesByHaplotype, return this.sampleCache.filteredByGenotype[this.lookupDatasetId] i.e. the dataset samples filtered by the most recent selected SNPs + genotype, i.e. allele / haplotype.  Add dependencies for that value.
  vcfGenotypeSamplesDataset() : if ! urlOptions.filterSelectedSamples, don't filter the selected samples.   Update .sampleCache.filteredByGenotype{,Count}.

vcf-genotype.js : (sampleCache) : add filteredByGenotype{,Count} : caches the result of vcfGenotypeSamplesDataset() when filterByHaplotype.  -Count is an efficient update dependency.
vcfGenotypeLookup.bash : bcftoolsCommand() : grep returns status 1 if there are no matches.  Ignore that and return 0 (true).  Also match end-of-line; matching either eol or tab is sufficient.
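
The grep exit-status convention referred to above (0 when lines matched, 1 when none matched, greater than 1 on error) can also be illustrated from the caller's side. The commit handles this inside the bash script; the JavaScript below is only a sketch of the same rule, and `linesFromGrep` is a hypothetical helper, not part of the codebase.

```javascript
// Interpret the exit status and stdout of a grep process.
// Status 1 means "no lines matched", which is a valid empty result here,
// not a failure; only a status above 1 indicates a real error.
function linesFromGrep(exitStatus, stdout) {
  if (exitStatus > 1) {
    throw new Error(`grep failed with status ${exitStatus}`);
  }
  if (exitStatus === 1) {
    return [];
  }
  // Drop the empty entry left by the trailing newline.
  return stdout.split('\n').filter((line) => line !== '');
}
```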