nf-core · luisas · Oct 1, 2024 · Sep 25, 2024
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -24,6 +24,10 @@ If you'd like to write some code for nf-core/multiplesequencealign, the standard
 
 If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/).
 
+:::note
+There is extended documentation on adding specific module types to this pipeline in [extending](../docs/extending.md).
+:::
+
 ## Tests
 
 You have the option to test your changes locally by running the pipeline. For receiving warnings about process selectors and other `debug` information, it is recommended to use the debug profile. Execute all the tests with the following command:

diff --git a/.nf-core.yml b/.nf-core.yml
@@ -2,4 +2,7 @@ repository_type: pipeline
 nf_core_version: "2.14.1"
 lint:
   multiqc_config: False
-  files_exist: conf/igenomes.config
+  files_exist:
+    - conf/igenomes.config
+  files_unchanged:
+    - .github/CONTRIBUTING.md
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -3,7 +3,7 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## 1.0.0 - Somorrostro
+## [1.0.0 - Somorrostro](https://github.com/nf-core/multiplesequencealign/releases/tag/1.0.0)
 
 Somorrostro is a beach in Barcelona.
 

diff --git a/CITATIONS.md b/CITATIONS.md
@@ -40,6 +40,10 @@
 
   > Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002 Jul 15;30(14):3059-66. doi: 10.1093/nar/gkf436. PMID: 12136088; PMCID: PMC135756.
 
+- [MAGUS](https://pubmed.ncbi.nlm.nih.gov/33252662/)
+
+  > Smirnov V, Warnow T. MAGUS: Multiple sequence Alignment using Graph clUStering. Bioinformatics. 2021 Jul 19;37(12):1666-1672. doi: 10.1093/bioinformatics/btaa992. PMID: 33252662; PMCID: PMC8289385.
+
 - [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
 
   > Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.

diff --git a/README.md b/README.md
@@ -21,9 +21,13 @@
 
 **nf-core/multiplesequencealign** is a pipeline to deploy and systematically evaluate Multiple Sequence Alignment (MSA) methods.
 
+The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!
+
+On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the [nf-core website](https://nf-co.re/multiplesequencealign/results).
+
 ![Alt text](docs/images/nf-core-msa_metro_map.png?raw=true "nf-core-msa metro map")
 
-In a nutshell, the pipeline performs the following steps:
+The pipeline performs the following steps:
 
 1. **Input files summary**: (Optional) computation of summary statistics on the input files, such as the average sequence similarity across the input sequences, their length, plddt extraction if available.
 
@@ -34,8 +38,9 @@ In a nutshell, the pipeline performs the following steps:
 
 ## Usage
 
-> [!NOTE]
-> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
+:::note
+If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set-up Nextflow. Make sure to [test your setup](https://nf-co.re/docs/usage/introduction#how-to-run-a-pipeline) with `-profile test` before running the workflow on actual data.
+:::
 
 #### 1. SAMPLESHEET
 
@@ -52,8 +57,9 @@ toxin,toxin.fa,toxin-ref.fa,toxin_structures,toxin_template.txt
 
 Each row represents a set of sequences (in this case the seatoxin and toxin protein families) to be aligned and the associated (if available) reference alignments and dependency files (this can be anything from protein structure or any other information you would want to use in your favourite MSA tool).
 
-> [!NOTE]
-> The only required input is the id column and either fasta or dependencies.
+:::note
+The only required input is the id column and either fasta or dependencies.
+:::
 
 #### 2. TOOLSHEET
 
@@ -72,8 +78,9 @@ FAMSA, -gt upgma -medoidtree, FAMSA,
 FAMSA,,REGRESSIVE,
 ```
 
-> [!NOTE]
-> The only required input is aligner.
+:::note
+The only required input is aligner.
+:::
 
 #### 3. RUN THE PIPELINE
 
@@ -87,9 +94,10 @@ nextflow run nf-core/multiplesequencealign \
    --outdir outdir
 ```
 
-> [!WARNING]
-> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
-> see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
+:::warning
+Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;
+see [docs](https://nf-co.re/usage/configuration#custom-configuration-files).
+:::
 
 For more details and further functionality, please refer to the [usage documentation](https://nf-co.re/multiplesequencealign/usage) and the [parameter documentation](https://nf-co.re/multiplesequencealign/parameters).
 

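The samplesheet note in the README diff above (only `id` plus either `fasta` or `dependencies` is required) can be sketched as a small pre-flight check. This is a minimal illustration, not the pipeline's own validation: the function name `missing_requirements` and the sample rows are hypothetical, and only the three columns named in the note are modelled.

```python
import csv
import io


def missing_requirements(row):
    """Return a list of problems for one samplesheet row (empty list means OK)."""
    problems = []
    if not (row.get("id") or "").strip():
        problems.append("id is required")
    has_fasta = bool((row.get("fasta") or "").strip())
    has_deps = bool((row.get("dependencies") or "").strip())
    if not (has_fasta or has_deps):
        problems.append("provide fasta or dependencies")
    return problems


# Hypothetical samplesheet content, reduced to the columns the rule mentions.
SAMPLESHEET = """id,fasta,dependencies
seatoxin,seatoxin.fa,
toxin,,toxin_structures
broken,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))
report = {row["id"]: missing_requirements(row) for row in rows}
for sample_id, problems in report.items():
    print(sample_id, "OK" if not problems else "; ".join(problems))
```

Running a check like this before launching the pipeline surfaces incomplete rows early; the pipeline's schema validation remains the authoritative check.
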
diff --git a/assets/adaptivecard.json b/assets/adaptivecard.json
@@ -17,7 +17,7 @@
                         "size": "Large",
                         "weight": "Bolder",
                         "color": "<% if (success) { %>Good<% } else { %>Attention<%} %>",
-                        "text": "nf-core/multiplesequencealign v${version} - ${runName}",
+                        "text": "nf-core/multiplesequencealign ${version} - ${runName}",
                         "wrap": true
                     },
                     {

diff --git a/assets/multiqc_config.yml b/assets/multiqc_config.yml
@@ -1,7 +1,7 @@
 report_comment: >
   This report has been generated by the <a href="https://github.com/nf-core/multiplesequencealign/releases/tag/1.0.0" target="_blank">nf-core/multiplesequencealign</a>
   analysis pipeline. For information about how to interpret these results, please see the
-  <a href="https://nf-co.re/multiplesequencealign/0.1.0dev/docs/output" target="_blank">documentation</a>.
+  <a href="https://nf-co.re/multiplesequencealign/1.0.0/docs/output" target="_blank">documentation</a>.
 report_section_order:
   "nf-core-multiplesequencealign-methods-description":
     order: -1000

diff --git a/assets/schema_tools.json b/assets/schema_tools.json
@@ -9,6 +9,7 @@
         "properties": {
             "tree": {
                 "type": "string",
+                "pattern": "^\\S+$",
                 "errorMessage": "tree name cannot contain spaces",
                 "meta": ["tree"]
             },
@@ -19,6 +20,7 @@
             "aligner": {
                 "type": "string",
                 "meta": ["aligner"],
+                "pattern": "^\\S+$",
                 "errorMessage": "align name must be provided and cannot contain spaces"
             },
             "args_aligner": {

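The `"pattern": "^\\S+$"` entries added to `assets/schema_tools.json` above require a value made of one or more non-whitespace characters. The sketch below reproduces that regular expression in Python to show which names it accepts; the helper `is_valid_name` is hypothetical, and the actual check is performed by the schema validator, whose regex dialect may differ from Python's in edge cases such as a trailing newline.

```python
import re

# Same expression as the schema pattern, once the JSON escaping is decoded.
NO_WHITESPACE = re.compile(r"^\S+$")


def is_valid_name(name):
    """True if the name is non-empty and contains no whitespace."""
    return NO_WHITESPACE.search(name) is not None


for candidate in ["FAMSA", "FAM SA", " FAMSA", ""]:
    print(repr(candidate), is_valid_name(candidate))
```

Because the pattern is anchored at both ends, a space anywhere in the value (including leading or trailing) causes a mismatch, as does an empty string.
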
diff --git a/docs/extending.md b/docs/extending.md
@@ -9,6 +9,8 @@ This pipeline is extensible, allowing the incorporation of new methods for assem
 - The [nf-test documentation](https://code.askimed.com/nf-test/docs/getting-started/)
 - The [nf-core slack](https://nf-co.re/join), particularly the [multiplesequencealign channel](https://nfcore.slack.com/archives/C05LZ7EAYGK). Feel free to reach out!
 
+Please also check the [contribution guidelines](../.github/CONTRIBUTING.md).
+
 ## Adding an aligner
 
 These steps will guide you to include a new MSA tool into the pipeline. Once done, this will allow you to systematically deploy and benchmark your tool against all others included in the pipeline. You are also welcome to contribute back to the pipeline if you wish.

diff --git a/docs/images/nf-core-msa_metro_map.png b/docs/images/nf-core-msa_metro_map.png
diff --git a/docs/usage.md b/docs/usage.md
@@ -90,8 +90,9 @@ The provided structures (see samplesheet) are used to evaluate the quality of th
 Finally, a summary table with all the computed statistics and evaluations is reported in MultiQC (skip by using `--skip_multiqc`).
 Moreover, a Shiny app is generated with interactive summary plots (skip with `--skip_shiny`).
 
-> [!WARNING]
-> You will need to have [Shiny](https://shiny.posit.co/py/) installed to run it! See [output documentation](https://nf-co.re/multiplesequencealign/output) for more info.
+:::warning
+You will need to have [Shiny](https://shiny.posit.co/py/) installed to run it! See [output documentation](https://nf-co.re/multiplesequencealign/output) for more info.
+:::
 
 ## Samplesheet input
 
@@ -114,8 +115,9 @@ Each row represents a set of sequences (in this case the seatoxin and toxin prot
 | `dependencies` | Required (At least one of fasta or dependencies must be provided). Full path to the folder that contains the dependency files (e.g. protein structures) for the sequences to be aligned. Currently, it is used for structural aligners and structure-based evaluation steps. It can be left empty.                                                                                                                                                                                                                                                                                                                                                                                                 |
 | `template`     | Optional. Files that define the mapping between the input sequence and the dependency files (e.g. protein structures) to be used. Used by 3D-Coffee. If not specified, they will be automatically generated assuming that the sequence name provided in the fasta is the same as the file name of the corresponding PDB file. E.g. if you set (default) the parameter templates_suffix to .pdb, then: ">MyProteinName" in the fasta file and "MyProteinName.pdb" for the corresponding protein structure. For more information on how to generate a template file manually, please look at the T-Coffee [documentation](https://tcoffee.readthedocs.io/en/latest/tcoffee_main_documentation.html). |
 
-> [!NOTE]
-> You can have some samples with dependencies and/or references and some without. The pipeline will run the modules requiring dependencies/references only on the samples for which you have provided the required information and the others will be just skipped.
+:::note
+You can have some samples with dependencies and/or references and some without. The pipeline will run the modules requiring dependencies/references only on the samples for which you have provided the required information and the others will be just skipped.
+:::
 
 ## Toolsheet input
 
@@ -132,11 +134,13 @@ FAMSA, -gt upgma -medoidtree, FAMSA,
 FAMSA,,REGRESSIVE,
 ```
 
-> [!NOTE]
-> Each of the trees and aligners are available as standalones. You can leave `args_tree` and `args_aligner` empty if you are cool with the default settings of each method. Alternatively, you can leave `args_tree` empty to use the default guide tree with each aligner.
+:::note
+Each of the trees and aligners is available as a standalone. You can leave `args_tree` and `args_aligner` empty if you are happy with the default settings of each method. Alternatively, you can leave `args_tree` empty to use the default guide tree with each aligner.
+:::
 
-> [!NOTE]
-> use the exact spelling as listed above in [align](#3-align) and [guide trees](#2-guide-trees)!
+:::note
+Use the exact spelling as listed above in [align](#3-align) and [guide trees](#2-guide-trees)!
+:::
 
 `tree` is the tool used to build the tree (optional).
 
@@ -176,8 +180,9 @@ If you wish to repeatedly use the same parameters for multiple runs, rather than
 
 Pipeline settings can be provided in a `yaml` or `json` file via `-params-file <file>`.
 
-> [!WARNING]
-> Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process >resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or >module arguments (args).
+:::warning
+Do not use `-c <file>` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args).
+:::
 
 The above pipeline run specified with a params file in yaml format:
 
@@ -214,22 +219,25 @@ This version number will be logged in reports when you run the pipeline, so that
 
 To further assist in reproducibility, you can share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter.
 
-> [!TIP]
-> If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, >nor institutional specific profiles.
+:::tip
+If you wish to share such a profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster-specific paths to files, nor institution-specific profiles.
+:::
 
 ## Core Nextflow arguments
 
-> [!NOTE]
-> These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
+:::tip
+These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen).
+:::
 
 ### `-profile`
 
 Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments.
 
 Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below.
 
-> [!INFO]
-> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
+:::info
+We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported.
+:::
 
 The pipeline also dynamically loads configurations from [https://github.com/nf-core/configs](https://github.com/nf-core/configs) when it runs, making multiple config profiles for various institutional clusters available at run time. For more information and to see if your system is available in these configs please see the [nf-core/configs documentation](https://github.com/nf-core/configs#documentation).
 

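The usage docs above state that pipeline parameters belong in a `yaml` or `json` file passed with `-params-file`, not in a `-c` config. A minimal sketch of that workflow follows, assuming a hypothetical file name `params.json` and using only parameter names that appear in these docs (`outdir`, `skip_multiqc`, `skip_shiny`); a real run would also need its input parameters.

```python
import json
import shlex
from pathlib import Path

# Pipeline parameters (double-hyphen options on the CLI) go into the params file.
params = {"outdir": "outdir", "skip_multiqc": False, "skip_shiny": True}

params_path = Path("params.json")  # hypothetical file name
params_path.write_text(json.dumps(params, indent=2) + "\n")

# Core Nextflow options use a single hyphen and stay on the command line.
command = [
    "nextflow", "run", "nf-core/multiplesequencealign",
    "-profile", "test",
    "-params-file", str(params_path),
]
print(shlex.join(command))
```

Keeping parameters in one file makes a run easy to repeat and share, while `-c` stays reserved for resource tuning and other infrastructure settings.
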
diff --git a/modules/nf-core/csvtk/join/csvtk-join.diff b/modules/nf-core/csvtk/join/csvtk-join.diff
diff --git a/modules/nf-core/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py b/modules/nf-core/custom/dumpsoftwareversions/templates/dumpsoftwareversions.py
diff --git a/nextflow.config b/nextflow.config
@@ -6,10 +6,6 @@
 ----------------------------------------------------------------------------------------
 */
 
-plugins {
-    id 'nf-validation@0.3.1'
-}
-
 // Global default params, used in configs
 params {
 
@@ -217,7 +213,7 @@ singularity.registry = 'quay.io'
 
 // Nextflow plugins
 plugins {
-    id 'nf-validation@1.1.3' // Validation of pipeline parameters and creation of an input channel from a sample sheet
+    id 'nf-validation@1.1.4' // Validation of pipeline parameters and creation of an input channel from a sample sheet
 }