v45.0.0
Major overhaul:
- Refactor
- Prevent git_tag() masking derivative commit
- Remove email sending redundancy
- Reduce complexity of AdLogger and add logging to all modules
- Move scripts into own modules
- Improve naming of variables
- Apps and workflows specified by ID instead of strings – strings obtained by dx describe where necessary
- Remove obsolete scripts (/scripts subdir)
- Update for compatibility with Python 3
- Add license
- Add Pytest test suite and test data for tests
- Add GitHub Actions testing for the Pytest test suite and flake8 linting
- Remove obsolete WES trio pan number - Pan3174
- Addition of SensitiveFormatter to logging
- Addition of more extensive logging
- Change paths and naming of logfiles
- Addition of a toolbox module containing functions that are used across multiple modules (includes new RunfolderObject() class which stores all runfolder attributes)
- Improved documentation – docstrings, addition of readmes to each module
- Improve readability of configuration file – use of dictionaries and per-module config classes
- Move panel config to a separate file and improved layout
- Incorporation of an email template and CSS style
- Addition of typing
- Incorporation of seglh-naming library via samplesheet_validator library
- Moving cluster density calculation from setoff workflows to demultiplex script
- Addition of a script-level logfile that records decisions about which runs to process, and runfolder-level logfiles that record the logs for each runfolder
- Remove obsolete function that excluded MiSeq-created fastqs
- Addition of backup runfolder script to the repository and ability to run as either a module import or on the command line
- Addition of job name string to allow specifying names for test folders
- Test folders are named 003_ in DNAnexus and shared with all binfx users with admin access
- Log messages generated in test mode contain a TEST_MODE flag, and those generated in Pytest mode contain a PYTEST_TESTS flag
- Standardise the flags used in logging so that ERROR is the only flag that needs to be monitored
- Incorporate the correct dockerised bcl2fastq build
- Update duty_csv to v1.3.0, add qiagen_upload v1.0.0 app for TSO runs
- Split config up and move log messages into a log config file
- Removal of congenica_upload script to simplify generation of these commands
- Setoff_workflows script now checks the fastqs present against the samples in the SampleSheet, checks that the samples expected from the SampleSheet are present in the BaseCalls dir, and checks that the undetermined fastqs are present. If expected sample fastqs are missing, it logs an error, excludes those fastqs from the run processing, and sends out an error alert
- Move upload_runfolder logs from the DNAnexus_upload_started.txt file to the logs directory
- Add support for dev runs in demultiplexing.py: the SampleSheet validator identifies a dev run by the presence of the dev pan number in the SampleSheet, the bcl2fastqlog file is added once the run is finished to prevent further processing by the scripts, and a warning message is sent out, which is picked up by rapid7 to alert that the run needs manual processing
- Add command line support for dev runs – if runfolder name is provided on the command line and is a dev run, SampleSheet checks are bypassed
- Remove bcl2fastq log checking function – this is not required as the success or failure of bcl2fastq can be assessed by the script using the returncode
- Append runtype to the end of the runfolder name for custom panels and WES runs (makes it easier to see what kind of run it is, e.g. in the case of LRPCR)
- Grant seglh_read org access to all uploaded projects
- Add class and package diagrams
- Specify v2 instances for PIPE workflow for BWA, Picard, GATK, filter_vcf, Sambamba
- Increase FH GATK instance type to mem3_ssd1_v2_x16
- Enable setoff workflows script to handle missing fastqs
- Validate fastqs after demultiplexing using gzip --test. If any fastqs are invalid, the bcl2fastq log file is removed so that demultiplexing is re-run
- Add the names of the samples being processed to the "samples being processed" email
- Introduce bash variables to store project name and ID in dx run scripts
- Add tagging to uploaded files to allow for correct counting, and add an error message for when the expected number of uploaded files does not match the actual number
- Addition of a sleep command to each Qiagen upload command
- Remove bcl2fastq log when fastq validation fails after demultiplexing, to allow demultiplexing to be re-attempted
- Update CNV calling inputs for R134 (additional genes), R79 and R90 (fix single exon issue). Update readcount bed files and panel of normal files for VCP1 and VCP3
- Add settings.json
- Add demultiplex success and fail messages
- Addition of a samplesheet check flag file that prevents re-checking a samplesheet that has already been checked by the script but failed the checks
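
Illustrative sketch for the fastq validation entry above (validating fastqs after demultiplexing): this is a minimal example under stated assumptions, not the code in this repository. The function names fastq_is_valid and find_invalid_fastqs are hypothetical, and instead of shelling out to gzip --test it uses Python's built-in gzip module to decompress each file to the end, which detects truncated or non-gzip files in a similar way.

```python
import gzip
import zlib
from pathlib import Path
from typing import Iterable, List


def fastq_is_valid(path: Path) -> bool:
    """Return True if the gzip file decompresses cleanly to the end.

    Hypothetical helper: stands in for a `gzip --test` call.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read in 1 MiB chunks so large fastqs are never held in memory
            while handle.read(1024 * 1024):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Non-gzip input raises gzip.BadGzipFile (a subclass of OSError);
        # a truncated stream raises EOFError
        return False


def find_invalid_fastqs(paths: Iterable[Path]) -> List[Path]:
    """Return the fastq paths that fail the integrity check, in input order."""
    return [path for path in paths if not fastq_is_valid(path)]
```

A caller could treat a non-empty list from find_invalid_fastqs as the signal to remove the bcl2fastq log file so that demultiplexing is attempted again, as the entry describes.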