Introduction
A recurrent work-flow in User Stories 2 and 3 is to select a number of samples and then click on SNPs to sort the displayed samples by their genotype values at those SNPs. This enables the user to identify a haplotype using the Genotype values, i.e. Alt / Ref, at those SNPs, and to find samples which have the chosen haplotype.
This operation sorts only the samples which are loaded into the frontend GUI, based on the Genotype values which are loaded in the frontend.
For a dataset with 300 samples this is fine, but for AGG datasets with 30000 samples it becomes necessary to do this operation on the server ("backend"), so that it can report how many samples in the whole collection have the identified haplotype and retrieve their genotype values.
This issue discusses a proposed feature enabling this work-flow :
- after identifying the desired haplotype by selecting SNPs, and identifying Alt/Ref at each SNP, the user will be able to view the samples which have that haplotype
- the Genotype values can then be requested for these samples
GUI design
A simple way to trial this functionality is to add a button in the Genotype table control dialog which will narrow the list of available samples to just those which have the identified haplotype.
The user can then select some or all samples from this list, and proceed with the VCF Lookup request as in existing work-flows.
Design of lookup functionality on the server
The method of using bcftools to implement this will be based on specifying which SNPs have Alt and Ref, using the bcftools --include / --exclude filters together with --regions (e.g. `bcftools view -e 'GT="0/0"' -r chr1:123456,chr2:7891011`, where -e is the short form of --exclude), as in this LLM answer.
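As a rough sketch of how the server could assemble such a request, the helper below builds an argument list for `bcftools query` restricted to the selected SNPs. The function name `haplotypeQueryArgs`, the `{ chrom, pos }` SNP shape, the file name and the output format string are assumptions for illustration, not the project's implementation; only the `--regions` and `--format` options are taken from bcftools itself.

```javascript
// Hypothetical helper: build argv for `bcftools query` restricted to the selected SNPs.
// snps: [{ chrom: 'chr1', pos: 123456 }, ...]  (assumed shape)
function haplotypeQueryArgs(vcfFile, snps) {
  if (!snps.length) {
    throw new Error('at least one SNP is required');
  }
  // --regions accepts a comma-separated list of chr:pos entries.
  const regions = snps.map(({ chrom, pos }) => `${chrom}:${pos}`).join(',');
  // One output line per sample per site: "<chrom>:<pos> <sample> <GT>".
  // The backslash is escaped so that bcftools receives the two characters \n.
  const format = '[%CHROM:%POS %SAMPLE %GT\\n]';
  return ['query', '--regions', regions, '--format', format, vcfFile];
}

const args = haplotypeQueryArgs('samples.vcf.gz', [
  { chrom: 'chr1', pos: 123456 },
  { chrom: 'chr2', pos: 7891011 },
]);
console.log(['bcftools', ...args].join(' '));
```

Passing the arguments as an array (for example to a process-spawning call) avoids shell quoting problems with the format string. Note that --regions requires the VCF file to be indexed.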
Task Structure
The work breakdown is as follows :
- testing of bcftools commands on VCF files with a large number of samples, to gauge the performance of this function
- add the server endpoint which implements this function
- add the button which gets the list of samples and displays it in the available samples list
Each of these will be described and tracked by a sub-issue which will be a part of this issue.
closes #484, #485, part of #482.
manage-genotype.hbs : add
input checkbox filterSamplesByHaplotype
span.badge .snpsInBrushedDomain.length, with tooltip .snpsInBrushedDomain value{_0,s.{ref,alt}}, .featureFiltersCount, .matchRefNumeric.
use .matchRefNumeric in Sample Filters : Feature / SNP
manage-genotype.js :
add filterSamplesByHaplotype in .args.userSettings, to enable filtering of available samples by selected SNPs
add selectedSNPsInBrushedDomain(), snpsInBrushedDomain() to display SNP count badge.
vcfGenotypeSamplesDataset() : add filterByHaplotype; in this case update .selectedSamples and .selectedSamplesText, otherwise don't update sampleCache.sampleNames and datasetStoreSampleNames().
vcfGenotypeSamples() : don't throttle if .filterSamplesByHaplotype - it's valid for the user to repeat the request after changing the selected SNPs.
ensureSamplesForDatasetTabEffect() : if .filterSamplesByHaplotype then request samples, regardless of .vcfGenotypeSamplesText being already defined.
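The interaction between throttling and the haplotype filter described for vcfGenotypeSamples() can be sketched as below. This is a simplified stand-alone illustration: `shouldRequestSamples`, `minIntervalMs` and the timestamp arguments are hypothetical names, and the component's actual throttling mechanism may differ.

```javascript
// Hypothetical decision helper, not the component's actual code.
// Returns true if a samples request should be sent now.
function shouldRequestSamples({ filterSamplesByHaplotype, lastRequestMs, nowMs, minIntervalMs }) {
  // With the haplotype filter enabled, the result depends on the currently
  // selected SNPs, so a repeated request is valid and is never throttled.
  if (filterSamplesByHaplotype) {
    return true;
  }
  // Otherwise suppress requests which repeat within the throttle interval.
  return lastRequestMs === undefined || nowMs - lastRequestMs >= minIntervalMs;
}

const recent = { lastRequestMs: 1000, nowMs: 1200, minIntervalMs: 500 };
console.log(shouldRequestSamples({ ...recent, filterSamplesByHaplotype: false })); // false
console.log(shouldRequestSamples({ ...recent, filterSamplesByHaplotype: true })); // true
```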
feature.js : add matchRefNumeric().
auth.js : genotypeSamples() : add param filter, replacing options which is not required.
block.js :
genotypeSamples() : add param filter.
vcfGenotypeSamples() : add param filter, use vcfGenotypeSamplesFiltered() when filter.
vcfGenotypeLookup.bash :
argVal : recognise command-line parameter GT=gtMatch, not added to preArgs.
bcftoolsCommand() : add filter_samples command, implemented as bcftools query | grep gtMatch.
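The `bcftools query | grep gtMatch` step can be pictured with the following sketch, which applies a comparable per-line match to query output that has already been captured as text. The line format (`<chrom>:<pos> <sample> <GT>`), the function names, the assumption of one record per site, and the rule that a sample must match at every selected SNP are all made up for illustration; the script's actual format string and grep pattern are not shown in this change description.

```javascript
// Classify a GT string such as "0/0", "1|1" or "./.".
function classifyGenotype(gt) {
  const alleles = gt.split(/[\/|]/);
  if (alleles.includes('.')) return 'missing';
  if (alleles.every((a) => a === '0')) return 'ref';
  if (alleles.every((a) => a !== '0')) return 'alt';
  return 'het';
}

// haplotype: Map of "<chrom>:<pos>" -> 'ref' | 'alt'  (hypothetical shape)
// Returns the names of samples whose genotype matches at every listed SNP.
function samplesMatchingHaplotype(queryOutput, haplotype) {
  const matches = new Map(); // sample name -> number of SNPs matched
  for (const line of queryOutput.split('\n')) {
    const fields = line.trim().split(/\s+/);
    if (fields.length !== 3) continue; // skip blank or malformed lines
    const [site, sample, gt] = fields;
    if (haplotype.get(site) === classifyGenotype(gt)) {
      matches.set(sample, (matches.get(sample) ?? 0) + 1);
    }
  }
  return [...matches]
    .filter(([, count]) => count === haplotype.size)
    .map(([sample]) => sample);
}

const output = [
  'chr1:123456 S1 1/1',
  'chr1:123456 S2 0/0',
  'chr2:7891011 S1 0/0',
  'chr2:7891011 S2 0/0',
].join('\n');
const haplotype = new Map([['chr1:123456', 'alt'], ['chr2:7891011', 'ref']]);
console.log(samplesMatchingHaplotype(output, haplotype)); // [ 'S1' ]
```

In this sketch heterozygous and missing calls match neither Ref nor Alt, so such samples are left out of the result.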