plant-food-research-open/genepal: Changelog

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

v0.6.0 - [20-Dec-2024]

`Added`

Added cDNA and CDS outputs to <OUTPUT_DIR>/annotations/ directory #118
Added parameter add_attrs_to_proteins_cds_fastas
Added parameter filter_genes_by_aa_length with default set to 24 which allows removal of genes with ORFs shorter than 24 #125
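The effect of filter_genes_by_aa_length can be illustrated with a minimal sketch. This is not the pipeline's implementation: the function name, the dictionary input, and the choice to ignore a trailing stop symbol are assumptions made for illustration only.

```python
def filter_by_aa_length(proteins, min_aa_length=24):
    """Keep entries whose protein sequence has at least min_aa_length residues.

    proteins: dict mapping a gene ID to its translated ORF (hypothetical input shape).
    """
    kept = {}
    for gene_id, seq in proteins.items():
        # Assumption: a trailing '*' (stop symbol) is not counted as a residue.
        if len(seq.rstrip("*")) >= min_aa_length:
            kept[gene_id] = seq
    return kept


if __name__ == "__main__":
    example = {"gene1": "M" + "A" * 30 + "*", "gene2": "MAAA*"}
    print(sorted(filter_by_aa_length(example)))
```

With the default threshold of 24, only gene1 survives in this example because gene2 encodes four residues.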

`Fixed`

Fixed an issue where TSEBRA failed because LIFTOFF lifted non-protein coding genes #121
Switched branch name from master to main in the GHA CIs
Fixed an issue in genepal_report.Rmd which caused the pangene matrix plot to fail when the number of clusters exceeded 65536 #124
Fixed an issue where GENEPALREPORT process failed due to OOM kill signal from SLURM #123
Fixed an issue where GFF merge after liftoff failed when one of the GFF files did not contain any genes
Fixed an issue where gxf_fasta_agat_spaddintrons_spextractsequences crashed due to short introns #89

`Dependencies`

Nextflow!>=24.04.2
nf-schema@2.1.1

`Deprecated`

Removed parameter add_attrs_to_proteins_fasta

v0.5.0 - [21-Nov-2024]

`Added`

Added MultiQC #65
Updated nf-core template to 3.0.2 #66
Integrated nf-test into pipeline CI #68
Updated the flowchart #87
Added a large test dataset for the test_full profile #90
Now .gff.gz and .gff3.gz inputs are also allowed for the benchmark column in --input
Now removing liftoff genes with any intron shorter than 10 bp #89
Now also removing rRNA and tRNA after liftoff as the downstream logic in the pipeline cannot correctly handle these
Now skipping FastQC by default #98
Added an HTML report #44
Added content type as text/html for the MultiQC and genepal reports
Added sra-tools for RNASeq data download #102

`Fixed`

Now using ${meta.id}_trim as prefix for FASTQC files
Updated citations to include DOIs
Fixed a bug where FASTQ versions were not correctly captured
Now using the correct out channel from STAR_ALIGN. This bug was introduced by a module update during the development of this version #74
Fixed OrthoFinder results copy failure on AWS #108

`Dependencies`

Nextflow!>=24.04.2
nf-schema@2.1.1

`Deprecated`

Resource parameters have been removed: max_memory, max_cpus, max_time
Removed a number of unnecessary parameters: monochromeLogs, config_profile_contact, config_profile_url, validationFailUnrecognisedParams, validationLenientMode, validationSchemaIgnoreParams, validationShowHiddenParams, validate_params
Removed extra_fastp_args and replaced it with fastp_extra_args
Removed and replaced skip_fastp and skip_fastqc with fastp_skip and fastqc_skip #82

v0.4.0 - [04-Oct-2024]

`Added`

Added orthofinder_annotations param
Added FASTA_GFF_ORTHOFINDER sub-workflow
Added evaluation by BUSCO #41
Included common tax ids for eggnog mapper #27
Implemented hierarchical naming scheme: geneI.tJ, geneI.tJ.exonK, geneI.tJ.cdsK #19, #34
Now sorting list of bam and list of fastq before cat to avoid resume cache misses
Allowed BAM files for RNA evidence #3
Added GXF_FASTA_AGAT_SPADDINTRONS_SPEXTRACTSEQUENCES sub-workflow for splice type statistics #11
Changed orthofinder_annotations from FASTA/GFF to protein FASTA #43
Added param enforce_full_intron_support to turn on/off strict model purging by TSEBRA #21
Added param filter_liftoff_by_hints to evaluate liftoff models with TSEBRA to make sure they have the same level of evidence as BRAKER #28
Added a script to automatically check module version updates
Reduced BRAKER3 threads to 8 #55
Now the final annotations are stored in the annotations folder #53
Now a single fasta file can be directly specified for protein_evidence
eggnogmapper_db_dir is not a required parameter anymore
eggnogmapper_tax_scope is now set to 1 (root div) by default
Added a test profile based on public data
Added parameter add_attrs_to_proteins_fasta to enable/disable addition of decoded gff attributes to proteins fasta #58
Added a check for input assemblies. If an assembly is smaller than 1 MB (or 300KB in zipped format), the pipeline errors out before starting the downstream processes #47
Now REPEATMASKER GFF output is saved via CUSTOM_RMOUTTOGFF3 #54
Added benchmark column to the input sheet and used GFFCOMPARE to perform benchmarking #63
Added SEQKIT_RMDUP to detect duplicate sequences and wrap the fasta to 80 characters
Updated parameter section labels for annotation and post-annotation filtering #64
Updated modules and sub-workflows
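The hierarchical naming scheme listed above (geneI.tJ, geneI.tJ.exonK, geneI.tJ.cdsK) can be sketched as follows. This is an illustration only, not the pipeline's code: the function name, the input shape, and the 1-based indices are assumptions.

```python
def hierarchical_ids(gene_index, transcripts):
    """Build IDs of the form geneI.tJ, geneI.tJ.exonK and geneI.tJ.cdsK.

    transcripts: list of (n_exons, n_cds) tuples, one per transcript
    (hypothetical input shape). Indices are assumed to start at 1.
    """
    ids = []
    for j, (n_exons, n_cds) in enumerate(transcripts, start=1):
        transcript_id = f"gene{gene_index}.t{j}"
        ids.append(transcript_id)
        # Child features inherit the transcript ID as a prefix.
        ids.extend(f"{transcript_id}.exon{k}" for k in range(1, n_exons + 1))
        ids.extend(f"{transcript_id}.cds{k}" for k in range(1, n_cds + 1))
    return ids


if __name__ == "__main__":
    for feature_id in hierarchical_ids(1, [(2, 1)]):
        print(feature_id)
```

Because each child ID embeds its parent's ID, the parent of any feature can be recovered by dropping the last dot-separated component.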

`Fixed`

Fixed BRAKER spellings #36
Fixed liftoff failure when lifting off from a single reference #40
Added versions from GFF_STORE sub-workflows #33

`Dependencies`

NextFlow!>=23.04.4
nf-validation=1.1.3

`Deprecated`

Renamed external_protein_fastas param to protein_evidence
Renamed fastq param to rna_evidence
Renamed braker_allow_isoforms param to allow_isoforms
Moved liftoffID from gene level to mRNA/transcript level
Moved version_check.sh to .github/version_checks.sh
Removed dependency on https://github.com/kherronism/nf-modules.git for BRAKER3 and REPEATMASKER modules which are now installed from https://github.com/GallVp/nxf-components.git
Removed dependency on https://github.com/PlantandFoodResearch/nxf-modules.git
Now the final annotations are not stored in the final folder
Now BRAKER3 outputs are not saved by default #53 and saved under etc folder when enabled
Removed local profile. Local executor is the default when no executor is specified. Therefore, the local profile was not needed.
Removed CUSTOM_DUMPSOFTWAREVERSIONS
pipeline_info/software_versions.yml has been replaced with pipeline_info/genepal_software_mqc_versions.yml

v0.3.3 - [18-Jun-2024]

`Added`

Added a stub test to evaluate the case where an assembly is soft masked but has no annotations

`Fixed`

Fixed a bug where is_masked was ignored by the pipeline
Fixed a bug in param validation which allowed specification of braker_hints without braker_gff3

`Dependencies`

NextFlow!>=23.04.4
nf-validation=1.1.3

`Deprecated`

v0.3.2 - [13-May-2024]

`Added`

`Fixed`

Increased time limit for REPEATMODELER_REPEATMODELER to 5 days
Now removing comments from fasta file before feeding it to BRAKER; added tests for the Perl one-liner
Fixed CHANGELOG version check failure in version_check.sh
Increased the SLURM job time limit to 14 days
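The FASTA comment removal mentioned above is done in the pipeline with a Perl one-liner that is not reproduced here. As a rough Python sketch of the same idea, assuming that "comments" means the description text following the sequence ID on each header line (the function name is hypothetical):

```python
def strip_fasta_comments(lines):
    """Yield FASTA lines with header text after the first whitespace removed.

    Assumption: a 'comment' is anything after the sequence ID on a '>' line.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # split() without a separator splits on any run of whitespace.
            parts = line.split(maxsplit=1)
            yield parts[0] if parts else line
        else:
            # Sequence lines are passed through unchanged.
            yield line


if __name__ == "__main__":
    records = [">chr1 some description\n", "ACGT\n"]
    print(list(strip_fasta_comments(records)))
```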

`Dependencies`

NextFlow!>=23.04.4
nf-validation=1.1.3

`Deprecated`

v0.3.1 - [10-May-2024]

`Added`

`Fixed`

Increased time limit for REPEATMODELER_REPEATMODELER to 3 days, REPEATMASKER to 2 days, EDTA_EDTA to 7 days, BRAKER3 to 7 days and EGGNOGMAPPER to 1 day

`Dependencies`

NextFlow!>=23.04.4
nf-validation=1.1.3

`Deprecated`

v0.3.0 - [30-April-2024]

`Added`

Added changelog and semantic versioning
Changed license to MIT
Updated .editorconfig
Moved .literature to test/ branch
Renamed genepal_local to local_genepal
Renamed genepal_pfr to pfr_genepal
Added version checking
Updated github workflow to use pre-commit instead of prettier and editorconfig check
Added central singularity cache dir for pfr config
Added SORTMERNA_INDEX before SORTMERNA
Fixed sample contamination bug introduced by file.simpleName
Now using empty files for stub testing in CI
Now BRAKER can be skipped by including BRAKER outputs from previous runs in the target_assemblies param
Added gffcompare to merge liftoff annotations
Renamed samplesheet param to fastq
Now using assemblysheet in combination with nf-validation for assembly input
Added nextflow_schema.json
Now using nf-validation to validate fastqsheet provided by params.fastq
Moved manifest.config and reporting_defaults.config content to nextflow.config
Now using a txt file for params.external_protein_fastas
Now using nf-validation for params.liftoff_annotations
Now using nf-validation for all the parameters
Added PURGE_BRAKER_MODELS sub-workflow
Added GFF_EGGNOGMAPPER sub-workflow
Now using a custom version of GFFREAD which supports meta and fasta
Now using TSEBRA to purge models which do not have full intron support from BRAKER hints
Added params eggnogmapper_evalue and eggnogmapper_pident
Added PURGE_NOHIT_BRAKER_MODELS sub-workflow
Now merging BRAKER and liftoff models before running eggnogmapper
Added GFF_MERGE_CLEANUP sub-workflow
Now using description field to store notes and textual annotations in the gff files
Now using mRNA in place of transcript in gff files
Now eggnogmapper_purge_nohits is set to false by default
Added GFF_STORE sub-workflow
external_protein_fastas and eggnogmapper_db_dir are not mandatory parameters
Added contributors
Added a document for the pipeline parameters
Updated pfr_genepal and pfr/profile.config
Now using local tests/stub files for GitHub CI
Now removing isoforms left by TSEBRA using AGAT_SPFILTERFEATUREFROMKILLLIST
Added pyproject.toml
Now using PFAMs from eggnog if description is '-'

`Fixed`

Removed liftoff models with valid_ORF=False
Updated license text to include 'Copyright (c) 2024 The New Zealand Institute for Plant and Food Research Limited'

`Dependencies`

NextFlow!>=23.04.4
nf-validation=1.1.3
