Releases · moka-guys/automate_demultiplex

22 Nov 15:45

natashapinto

v45.2.0

1121968

V45.2.0 Latest

This update incorporates the following changes:

Update mokapipe workflow ID and pipeline ID
Update ED VCP3 panel of normals Rdata file
Test emails redirected to mokaguys
Run the docker images as the mokaguys user instead of root. Files created by the containers are therefore owned by mokaguys too, making them easier to delete with the workstation cleaner, which is currently failing because of user permission issues
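
As an illustration of the non-root change above, the sketch below builds a `docker run` argument list that passes the standard `--user UID:GID` option, so files written by the container are owned by the calling user. The function name, image name and IDs are hypothetical and are not taken from the repository.

```python
import os
from typing import List, Optional


def build_docker_run_cmd(
    image: str,
    args: List[str],
    uid: Optional[int] = None,
    gid: Optional[int] = None,
) -> List[str]:
    """Build a `docker run` argument list that runs the container as a non-root user.

    Defaults to the UID/GID of the current process (Unix only).
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return ["docker", "run", "--rm", "--user", f"{uid}:{gid}", image, *args]


# Hypothetical image name and IDs, for illustration only
cmd = build_docker_run_cmd("example/bcl2fastq:latest", ["--help"], uid=1001, gid=1001)
print(" ".join(cmd))
```

Returning a list rather than a shell string means the command can be passed to `subprocess.run` without shell quoting concerns.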

11 Jul 16:34

RachelDuffin

v45.1.0

04b00fb

v45.1.0

This update incorporates the following changes:

Update upload agent version to v1.5.33 in line with the version used for archer archiving, as per DNAnexus recommendation
Add functionality to process OncoDEEP runs (using the live OncoDEEP pan number), including oncodeep_upload v1.0.0 app
Address jinja2 dependabot security alert – update jinja2 to v3.1.4
Update naming of ‘master’ branch to ‘main’ - ‘main’ is the current widely accepted nomenclature for the primary branch
Update oncology ops email used as the recipient for the pipeline started emails
Move copying of the samplesheet from setoff_workflows module to demultiplexing module
Add auth string variable assignment to top of all commands files – simplifies creation / writing of commands to file
Remove TSO low throughput pan number (Pan4969)
Update duty csv version to v1.5.0
Update samplesheet validator to v1.3.0
Fix white space addition in SQL emails (Mail clients don’t always respect <p> html tags)
Update pipeline IDs (Custom Panels – ID in use was for a previous version of the pipeline, Archer – MultiQC was not listed in previous ID, TSO – previous ID had no information listed about the app versions, OncoDEEP – new pipeline)
Set logger mode to 'w' so logging overwrites old logs
Add functionality to process development runs without UMIs (demultiplex, then upload the runfolders including BCLs, and set off fastqc / multiqc)
Add new development pan number for development runs with UMIs (Pan5227), and functionality to handle these (creates flag files to prevent demultiplexing and runfolder upload)
Add UMI dev runfolder to config
Addition to toolbox of RunfolderSamples class (derived from CollectRunfolderClass class in setoff_workflows) containing runfolder properties derived from the samplesheet, and SampleObject class originally in setoff_workflows
Alter behaviour to prevent firing of md5sum absence warnings (as there is a period of time after sequencing finishes before the file appears)
Fix so that SampleSheet is checked prior to the run finishing sequencing
Fix issue where setoff_workflows falls over for TSO runs as the samplesheet has not been copied over to the runfolder from the samplesheets directory
Fix issue where setoff workflows tries to automatically upload runs with the development pan number
Make setoff_workflows more modular: (Creation of classes per run type for collating commands generation, to make the logic easier to follow, Move dx run command functions to new build_dx_commands script, Move PipelineEmails to pipeline_emails script)
Remove Upload MultiQC as a dependency for duty_csv – addresses issue where duty_csv fails due to the cyber attack meaning that MultiQC reports cannot be uploaded to the genomics server i.e. upload multiqc fails
Sort fastqs before validation so it is easier to see how far through we are
Remove requirement to specify command line auth token for upload runfolder script – the script will only ever be run from the workstation, where the auth key file resides in the same location, so this makes it more user friendly
Improve md5sum checking behaviour. Processing continues only when all of the following hold: (1) the runfolder either requires no checksum checking (i.e. is a MiSeq) or requires it and the checksum file exists; (2) sequencing is complete; (3) the sequencer either requires no checksum checking (is a MiSeq) or the integrity check passes (the checksum file exists and either has been checked before and contains the checksum success message, or has not been checked before, in which case the check is carried out)
Comment out addition of upload_multiqc command to dx run commands script
Update requirements file to ensure it captures all dependencies
Change missing fastqs logger message to warning
Incorporate workstation cleaner, with the addition of checking whether the runfolder is a development runfolder to prevent any deletion of dev runs
Fix LRPCR run not being identified as custom panels runtype by duty_csv, by adding the sample_prefix to the project name suffix
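
The md5sum checking behaviour described in this release can be sketched as a single decision function. This is a minimal illustration of the stated rules only; the `RunfolderState` fields and the `run_check` callback are hypothetical names, not the repository's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunfolderState:
    requires_checksum: bool      # False for MiSeq runfolders
    checksum_file_exists: bool
    sequencing_complete: bool
    previously_checked: bool     # integrity check already carried out
    has_success_message: bool    # checksum file contains the success message


def ready_for_processing(state: RunfolderState, run_check: Callable[[], bool]) -> bool:
    """Return True when the runfolder satisfies the checksum criteria."""
    # A sequencer that needs checksums cannot proceed until the file appears
    if state.requires_checksum and not state.checksum_file_exists:
        return False
    if not state.sequencing_complete:
        return False
    # MiSeq runfolders need no integrity check
    if not state.requires_checksum:
        return True
    # Reuse the earlier result if the file was checked before
    if state.previously_checked:
        return state.has_success_message
    # Otherwise carry out the check now
    return run_check()
```

Passing the integrity check in as a callback keeps the decision logic free of file access, so each branch can be tested in isolation.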

12 Jun 15:31

RachelDuffin

v45.0.0

caefc5b

v45.0.0

Major overhaul:

Refactor
Prevent git_tag() masking derivative commit
Remove email sending redundancy
Reduce complexity of AdLogger and add logging to all modules
Move scripts into own modules
Improve naming of variables
Apps and workflows specified by ID instead of strings – strings obtained by dx describe where necessary
Remove obsolete scripts (/scripts subdir)
Update for compatibility with Python 3
Add license
Add Pytest test suite and test data for tests
Add GitHub actions testing for Pytest test suite and flake8 formatting
Remove obsolete WES trio pan number - Pan3174
Addition of SensitiveFormatter to logging
Addition of more extensive logging
Change paths and naming of logfiles
Addition of a toolbox module containing functions that are used across multiple modules (includes new RunfolderObject() class which stores all runfolder attributes)
Improved documentation – docstrings, addition of readmes to each module
Improve readability of configuration file – use of dictionaries and per-module config classes
Move panel config to a separate file and improved layout
Incorporation of an email template and CSS style
Addition of typing
Incorporation of seglh-naming library via samplesheet_validator library
Moving cluster density calculation from setoff workflows to demultiplex script
Addition of script-level logfile that records decisions for which runs to process, and runfolder-level log files that record runfolder-level logs
Remove obsolete function excluding MiSeq created fastqs
Addition of backup runfolder script to the repository and ability to run as either a module import or on the command line
Addition of job name string to allow specifying names for test folders
Test folders are named 003_ in DNAnexus and shared with all binfx users with admin access
Log messages run in test mode contain a TEST_MODE flag, and in Pytest mode contain a PYTEST_TESTS flag
Standardise flags used in logging so that ERROR is the only thing that we are looking for to pick up
Incorporate the correct dockerised bcl2fastq build
Update duty_csv to v1.3.0, add qiagen_upload v1.0.0 app for TSO runs
Split config up and move log messages into a log config file
Removal of congenica_upload script to simplify generation of these commands
Setoff_workflows script now checks that the expected fastqs are present against the samples in the SampleSheet, and that the expected samples in the SampleSheet are present in the BaseCalls dir, and that the undetermined fastqs are present. If expected sample fastqs are missing, it logs an error and excludes those fastqs from the run processing, sending out an error alert
Move upload_runfolder logs to logs directory from DNAnexus_upload_started.txt file
Add support for dev runs to demultiplexing.py: the SampleSheet validator identifies a dev run by the presence of the dev pan number in the SampleSheet; the bcl2fastq log file is added once the run is finished to prevent further processing by the scripts; and a warning message is sent out, which is picked up by rapid7 to alert that the run needs manual processing
Add command line support for dev runs – if runfolder name is provided on the command line and is a dev run, SampleSheet checks are bypassed
Remove bcl2fastq log checking function – this is not required as the success or failure of bcl2fastq can be assessed by the script using the returncode
Update runfolder name to append runtype to end of runfolder name for custom panels and WES runs (makes it easier to see what run it is e.g. in case of LRPCR)
Grant seglh_read org access to all uploaded projects
Add class and package diagrams
Specify v2 instances for PIPE workflow for BWA, Picard, GATK, filter_vcf, Sambamba
Increase FH GATK instance type to mem3_ssd1_v2_x16
Enable setoff workflows script to handle missing fastqs
Validate fastqs after demultiplexing using gzip --test. If any invalid fastqs exist, the bcl2fastq log file is removed so that demultiplexing is re-run
Add sample names being processed to samples being processed email
Introduce bash variables to store project name and ID in dx run scripts
Add tagging to uploaded files to allow for correct counting, and error message when expected number of uploaded files does not match actual number of uploaded files
Addition of a sleep command to each Qiagen upload command
Remove bcl2fastq log upon demultiplex fastq validation fail to allow for re-attempt at demultiplexing
Update CNV calling inputs for R134 (additional genes), R79 and R90 (fix single exon issue). Update readcount bed files and panel of normal files for VCP1 and VCP3
Add settings.json
Add demultiplex success and fail messages
Addition of a samplesheet check flag file that prevents re-checking a samplesheet that has already been checked by the script but failed the checks
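
The fastq validation step in this release (test each fastq, and remove the bcl2fastq log on failure so demultiplexing is re-attempted) might look roughly like the sketch below. It substitutes Python's `gzip` module for the `gzip --test` command line call named in the notes, and all function and file names are hypothetical.

```python
import gzip
import tempfile
import zlib
from pathlib import Path
from typing import Iterable, List


def fastq_is_valid(path: Path) -> bool:
    """Return True if the whole file decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(1024 * 1024):  # read in 1 MiB chunks
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Not gzip, truncated, or corrupt compressed data
        return False


def find_invalid_fastqs(paths: Iterable[Path]) -> List[Path]:
    """Return the fastqs that fail the integrity test, in sorted order."""
    return [path for path in sorted(paths) if not fastq_is_valid(path)]


def validate_or_reset(fastqs: Iterable[Path], bcl2fastq_log: Path) -> bool:
    """Remove the log file when validation fails so demultiplexing can re-run."""
    if find_invalid_fastqs(fastqs):
        bcl2fastq_log.unlink(missing_ok=True)
        return False
    return True


# Small demonstration with one valid and one invalid file
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    good = tmp_dir / "sample_R1.fastq.gz"
    bad = tmp_dir / "sample_R2.fastq.gz"
    log = tmp_dir / "bcl2fastq2_output.log"  # hypothetical log file name
    with gzip.open(good, "wb") as handle:
        handle.write(b"@read1\nACGT\n+\nFFFF\n")
    bad.write_bytes(b"not a gzip file")
    log.write_text("demultiplexing finished\n")
    demo_passed = validate_or_reset([good, bad], log)
    demo_log_exists = log.exists()

print(demo_passed, demo_log_exists)
```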

19 Dec 16:25

natashapinto

v44.8.2

2d9b74a

v.44.8.2

Updated ExomeDepth apps and normal_readcount files
VCP3 ExomeDepth changes
VCP3 CNV calling BED files
Fixed small error in upload script

30 Oct 13:40

rebeccahaines1

v44.8.1

86c6e99

v44.8.1

Bug fix: add new line when creating congenica run commands file

23 Oct 10:26

rebeccahaines1

v44.8.0

c78c5eb

v44.8.0

v44.8.0 incorporates the following:

The script now splits the TSO samplesheets and runs the pipeline multiple times, once per resulting split samplesheet
Updated TSO app (v1.6.0) (AUD1352), which contains changes to support the app being run multiple times for the same run, and to output files in a useful way for downstream processes
TSO post-run processing commands are now written to a separate bash script. This is because the --wait flag cannot be used to delay the running of the commands for the downstream apps when running multiple instances of the TSO app to process a single run
Updated duty CSV app (v1.2.0) (AUD1349) that has been updated to function with the added exome depth PDF output and the altered TSO output format
Add new pan numbers: Pan5186 and Pan5185 - APC Associated Polyposis, Pan5180 - development run (stops warning messages)
Amend scripts so that samplesheet checks do not run for runs containing samples with the development pan number
Incorporate new exome depth app which performs CNV calling using ExomeDepth - currently only running for VCP1 and VCP2 samples (not VCP3)
Remove no longer required pan numbers: Pan4127 (VCP2 Viapath R209 colorectal) and Pan4818 (VCP2 STG R209 colorectal), as R209 has been removed from the test directory; and Pan4044 (STG VCP1), Pan4042 (STG VCP2 BRCA), Pan4049 (STG VCP2 CrCa) and Pan4043 (STG VCP3), which were generic pan numbers previously used for StG but have since been separated into individual pan numbers

09 Aug 13:42

RachelDuffin

v44.7.0

becc65c

v44.7.0

v44.7.0 incorporates the following changes:

Update TSO500 coverage BED file to Pan5130
Fix TSO coverage report output folders to output to a folder per Pan number
Incorporate new MultiQC dnanexus app v1.18.0 to all pipelines (contains new MultiQC plugin for coverage that adds a coverage table for all samples with sambamba chanjo gene_level coverage files)
Update TSO500 dependency so that MultiQC depends on sambamba chanjo jobs (except NTC samples)
Switch to a dockerised version of bcl2fastq2
Fix email function - add correct email server username back into the script
Update incorrect RPKM VCP3 pan number (Pan3974 should be Pan4362)
Remove obsolete MokaCAN pipeline
Add new Pan numbers for R444.1 and R444.2

24 Apr 13:33

RachelDuffin

v44.6.0

af09c1e

v44.6.0

This update incorporates the following changes:

Re-add peddy to multiqc depends list
Fix order of app dependency for TSO and Custom Panels pipelines (including addition of an extra dependency list for custom panels to stop multiqc depending on RPKM)
Add updated duty_csv app version
Remove non-required panel argument to fastqc command creation function
Exclude NTC sambamba job from depends list for TSO samples
Only add to depends_list if JOBID exists from the command for TSO fastqc sompy and sambamba
Add extra log command for writing dx run commands to file
Add support for R430 test indication (prostate panel) on VCP2
Update VCP2 variant calling, coverage and RPKM bed files
Add --priority flag to dx run commands that previously didn't have it
Specify dnanexus v2 instance types for peddy, multiqc, upload multiqc, RPKM and congenica upload commands
Increase timeout time on R134 runs from 6 to 12 hours
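
The depends list fix in this release (only add a job to the list if the command actually returned a job ID) can be illustrated with the short sketch below. It assumes dependencies are passed to `dx run` with its `-d` / `--depends-on` option; the function name and the job IDs are hypothetical, and the repository's generated scripts may express this differently.

```python
from typing import List, Optional


def add_dependency(depends_list: List[str], job_id: Optional[str]) -> List[str]:
    """Return depends_list extended with `-d <job_id>` only when job_id is non-empty."""
    if job_id and job_id.strip():
        return depends_list + ["-d", job_id.strip()]
    # A missing or blank ID (e.g. the upstream command failed) is skipped
    return depends_list


depends: List[str] = []
for job_id in ["job-AAAA", "", None, " job-BBBB\n"]:  # hypothetical IDs
    depends = add_dependency(depends, job_id)

print(" ".join(depends))
```

Skipping empty IDs avoids emitting a dangling `-d` with no value, which would otherwise make the downstream command malformed.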

29 Mar 14:21

RachelDuffin

v44.5.0

a784447

v44.5.0

The addition of the duty_csv app to the end of all workflows. This consists of the addition of a new function to create the dx run command in the same way as for multiqc, and addition of config variables containing the inputs
Alteration to the way the TSO500 pipeline is set off so that it no longer requires use of the output parser app. This facilitates easier updating of the pipeline, and brings it closer to the set up of the other pipelines, with set off using the dx run command bash script which contains all run commands, as opposed to the commands being split between the workstation and DNAnexus
Addition of the --wait flag to the tso docker run command, to delay downstream tasks until all output files have been created
Update the version of fastqc used for ArcherDX and TSO samples
Update ADX and TSO pipeline IDs

27 Feb 10:57

RachelDuffin

v44.4.0

f8ee493

v44.4.0 - minor release

This release includes the following changes:

Update MokaPIPE workflow from version 2.17 to 2.18. This includes update of FastQC v1.3 → v1.4 (update fastqc version from v0.11.3 to v0.11.9 and dockerise), update of Picard v1.1 → v1.2 (updated versions of samtools and picard, and made removal of chr in interval file optional), update of Filter_vcf_with_bedfile v1.0 → v1.1 (add skip flag), and update of polyedge v1.0.0 → v1.1.0 (now outputs pdf, html and csv; remove MSH2 variant hard coding)
Update TSO500 app to v1.5.1
Update Multiqc app to v1.17.0
Add new Pan numbers for TSO500 and ArcherDx to support dry lab work
Increased instance size for the MultiQC app
