Merge branch 'main' into pr/stevekm/31
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
bentsherman committed Feb 6, 2025
2 parents d4940f6 + f2d0240 commit d3705d4
Showing 13 changed files with 1,365 additions and 80 deletions.
33 changes: 16 additions & 17 deletions README.md
@@ -4,8 +4,6 @@ Nextflow plugin to render provenance reports for pipeline runs. Now supporting [

## Getting Started

The `nf-prov` plugin requires Nextflow version `23.04.0` or later.

To enable and configure `nf-prov`, add the following snippet to your Nextflow config and update it as needed.

```groovy
@@ -24,7 +22,7 @@ prov {
}
```

Finally, run your Nextflow pipeline. You do not need to modify your pipeline script in order to use the `nf-prov` plugin. The plugin will automatically generate a JSON file with provenance information.
Finally, run your Nextflow pipeline. You do not need to modify your pipeline script in order to use the `nf-prov` plugin. The plugin will automatically produce the specified provenance reports at the end of the workflow run.

## Configuration

@@ -38,18 +36,18 @@ Create the provenance report (default: `true` if plugin is loaded).

`prov.formats`

*New in version 1.2.0*

Configuration scope for the desired output formats. The following formats are available:

- `bco`: Render a [BioCompute Object](https://biocomputeobject.org/). Supports the `file` and `overwrite` options.

Visit the [BCO User Guide](https://docs.biocomputeobject.org/user_guide/) to learn more about this format and how to extend it with information that isn't available to Nextflow.
- `bco`: Render a [BioCompute Object](https://biocomputeobject.org/). Supports the `file` and `overwrite` options. See [BCO.md](docs/BCO.md) for more information about the additional config options for BCO.

- `dag`: Render the task graph as a Mermaid diagram embedded in an HTML document. Supports the `file` and `overwrite` options.

- `legacy`: Render the legacy format originally defined in this plugin (default). Supports the `file` and `overwrite` options.

*New in version 1.4.0*

- `wrroc`: Render a [Workflow Run RO-Crate](https://www.researchobject.org/workflow-run-crate/). Includes all three profiles (Process, Workflow, and Provenance). See [WRROC.md](docs/WRROC.md) for more information about the additional config options for WRROC.

Any number of formats can be specified, for example:

```groovy
@@ -67,6 +65,8 @@ prov {
}
```

See [nextflow.config](./nextflow.config) for a full example of each provenance format.

`prov.patterns`

List of file patterns to include in the provenance report, from the set of published files. By default, all published files are included.
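
As a hedged illustration of this option, the fragment below restricts the report to two kinds of published files. The patterns themselves are hypothetical, and glob-style syntax is an assumption that this page does not confirm.

```groovy
prov {
    // Hypothetical glob-style patterns (assumed syntax); adjust them to
    // match the files your pipeline actually publishes.
    patterns = ['**/*.bam', '**/*.vcf.gz']
}
```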
@@ -123,16 +123,15 @@ Follow these steps to package, upload, and publish the plugin:

2. Update the `Plugin-Version` field in the following file with the release version:

```bash
plugins/nf-prov/src/resources/META-INF/MANIFEST.MF
```
```bash
plugins/nf-prov/src/resources/META-INF/MANIFEST.MF
```

3. Run the following command to package and upload the plugin in the GitHub project releases page:

```bash
./gradlew :plugins:nf-prov:upload
```

4. Create a pull request against the [nextflow-io/plugins](https://github.com/nextflow-io/plugins/blob/main/plugins.json)
project to make the plugin publicly accessible to the Nextflow app.
```bash
./gradlew :plugins:nf-prov:upload
```

4. Create a pull request against the [nextflow-io/plugins](https://github.com/nextflow-io/plugins/blob/main/plugins.json)
project to make the plugin publicly accessible to the Nextflow app.
180 changes: 180 additions & 0 deletions docs/BCO.md
@@ -0,0 +1,180 @@
# Additional BCO configuration

*New in version 1.3.0*

The `bco` format supports additional "pass-through" options for certain BCO fields. These fields cannot be inferred automatically from a pipeline or run, and so must be entered through the config. External systems can use these config options to inject fields automatically.

The following config options are supported:

- `prov.formats.bco.provenance_domain.review`
- `prov.formats.bco.provenance_domain.derived_from`
- `prov.formats.bco.provenance_domain.obsolete_after`
- `prov.formats.bco.provenance_domain.embargo`
- `prov.formats.bco.usability_domain`
- `prov.formats.bco.description_domain.keywords`
- `prov.formats.bco.description_domain.xref`
- `prov.formats.bco.execution_domain.external_data_endpoints`
- `prov.formats.bco.execution_domain.environment_variables`

These options correspond exactly to fields in the BCO JSON schema. Refer to the [BCO User Guide](https://docs.biocomputeobject.org/user_guide/) for more information about these fields.

*NOTE: The `environment_variables` setting differs from the BCO standard in that it only specifies the variable names. Only the variables specified in this list will be populated in the BCO, if they are present in the execution environment.*
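
The behaviour described in the note can be sketched as follows. This is a minimal illustration, not the plugin's actual implementation: the `selectEnv` helper and the sample environment map are hypothetical.

```groovy
// Hypothetical helper: keep only the requested variable names that are
// actually present in the given environment, paired with their values.
Map<String, String> selectEnv(List<String> names, Map<String, String> env) {
    names.findAll { env.containsKey(it) }
         .collectEntries { [(it): env[it]] }
}

// Sample environment (an assumption for illustration); in a real run the
// values would come from the execution environment, e.g. System.getenv().
def sampleEnv = [HOSTTYPE: 'x86_64', HOME: '/home/user']

// EDITOR is requested but absent, so only HOSTTYPE is kept.
def selected = selectEnv(['HOSTTYPE', 'EDITOR'], sampleEnv)
println selected   // prints [HOSTTYPE:x86_64]
```

In this sketch, a listed name that is missing from the environment is dropped silently rather than recorded with an empty value.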

Here is an example config based on the BCO User Guide:

```groovy
prov {
formats {
bco {
provenance_domain {
review = [
[
"status": "approved",
"reviewer_comment": "Approved by GW staff. Waiting for approval from FDA Reviewer",
"date": "2017-11-12T12:30:48-0400",
"reviewer": [
"name": "Charles Hadley King",
"affiliation": "George Washington University",
"email": "hadley_king@gwu.edu",
"contribution": "curatedBy",
"orcid": "https://orcid.org/0000-0003-1409-4549"
]
],
[
"status": "approved",
"reviewer_comment": "The revised BCO looks fine",
"date": "2017-12-12T12:30:48-0400",
"reviewer": [
"name": "Eric Donaldson",
"affiliation": "FDA",
"email": "Eric.Donaldson@fda.hhs.gov",
"contribution": "curatedBy"
]
]
]
derived_from = 'https://example.com/BCO_948701/1.0'
obsolete_after = '2118-09-26T14:43:43-0400'
embargo = [
"start_time": "2000-09-26T14:43:43-0400",
"end_time": "2000-09-26T14:43:45-0400"
]
}
usability_domain = [
"Identify baseline single nucleotide polymorphisms (SNPs)[SO:0000694], (insertions)[SO:0000667], and (deletions)[SO:0000045] that correlate with reduced (ledipasvir)[pubchem.compound:67505836] antiviral drug efficacy in (Hepatitis C virus subtype 1)[taxonomy:31646]",
"Identify treatment emergent amino acid (substitutions)[SO:1000002] that correlate with antiviral drug treatment failure",
"Determine whether the treatment emergent amino acid (substitutions)[SO:1000002] identified correlate with treatment failure involving other drugs against the same virus",
"GitHub CWL example: https://github.com/mr-c/hive-cwl-examples/blob/master/workflow/hive-viral-mutation-detection.cwl#L20"
]
description_domain {
keywords = [
"HCV1a",
"Ledipasvir",
"antiviral resistance",
"SNP",
"amino acid substitutions"
]
xref = [
[
"namespace": "pubchem.compound",
"name": "PubChem-compound",
"ids": ["67505836"],
"access_time": "2018-13-02T10:15-05:00"
],
[
"namespace": "pubmed",
"name": "PubMed",
"ids": ["26508693"],
"access_time": "2018-13-02T10:15-05:00"
],
[
"namespace": "so",
"name": "Sequence Ontology",
"ids": ["SO:000002", "SO:0000694", "SO:0000667", "SO:0000045"],
"access_time": "2018-13-02T10:15-05:00"
],
[
"namespace": "taxonomy",
"name": "Taxonomy",
"ids": ["31646"],
"access_time": "2018-13-02T10:15-05:00"
]
]
}
execution_domain {
external_data_endpoints = [
[
"url": "protocol://domain:port/application/path",
"name": "generic name"
],
[
"url": "ftp://data.example.com:21/",
"name": "access to ftp server"
],
[
"url": "http://eutils.ncbi.nlm.nih.gov/entrez/eutils",
"name": "access to e-utils web service"
]
]
environment_variables = ["HOSTTYPE", "EDITOR"]
}
}
}
}
```

Alternatively, you can map each option to a pipeline param, which makes it easier for an external system to supply the values:

```groovy
prov {
formats {
bco {
provenance_domain {
review = params.bco_provenance_domain_review
derived_from = params.bco_provenance_domain_derived_from
obsolete_after = params.bco_provenance_domain_obsolete_after
embargo = params.bco_provenance_domain_embargo
}
usability_domain = params.bco_usability_domain
description_domain {
keywords = params.bco_description_domain_keywords
xref = params.bco_description_domain_xref
}
execution_domain {
external_data_endpoints = params.bco_execution_domain_external_data_endpoints
environment_variables = params.bco_execution_domain_environment_variables
}
}
}
}
```

This way, the pass-through options can be provided as JSON in a [params file](https://nextflow.io/docs/latest/reference/cli.html#run):

```jsonc
{
"bco_provenance_domain_review": [
// ...
],
"bco_provenance_domain_derived_from": "...",
"bco_provenance_domain_obsolete_after": "...",
"bco_provenance_domain_embargo": {
"start_time": "...",
"end_time": "..."
},
"bco_usability_domain": [
// ...
],
"bco_description_domain_keywords": [
// ...
],
"bco_description_domain_xref": [
// ...
],
"bco_execution_domain_external_data_endpoints": [
// ...
],
"bco_execution_domain_environment_variables": [
// ...
]
}
```
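
An external system could generate such a params file programmatically. The following is a minimal sketch using Groovy's built-in `groovy.json.JsonOutput`; the output file name `params.json` and the subset of keys shown are assumptions chosen for illustration, with key names matching the `params.*` references in the config above.

```groovy
import groovy.json.JsonOutput

// Keys must match the param names referenced in the config.
// The values are placeholders reused from the example config.
def bcoParams = [
    bco_provenance_domain_derived_from        : 'https://example.com/BCO_948701/1.0',
    bco_description_domain_keywords           : ['HCV1a', 'Ledipasvir'],
    bco_execution_domain_environment_variables: ['HOSTTYPE', 'EDITOR'],
]

// Serialize to pretty-printed JSON and write it to a hypothetical file name.
def json = JsonOutput.prettyPrint(JsonOutput.toJson(bcoParams))
new File('params.json').text = json
```

The resulting file can then be passed to `nextflow run` with the `-params-file` option.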
47 changes: 47 additions & 0 deletions docs/WRROC.md
@@ -0,0 +1,47 @@
# Additional WRROC configuration

*New in version 1.4.0*

The `wrroc` format supports additional options to configure certain aspects of the Workflow Run RO-Crate. These fields cannot be inferred automatically from the pipeline or the run, and so must be entered through the config.

The following config options are supported:

- `prov.formats.wrroc.agent.contactType`
- `prov.formats.wrroc.agent.email`
- `prov.formats.wrroc.agent.name`
- `prov.formats.wrroc.agent.orcid`
- `prov.formats.wrroc.agent.phone`
- `prov.formats.wrroc.agent.ror`
- `prov.formats.wrroc.organization.contactType`
- `prov.formats.wrroc.organization.email`
- `prov.formats.wrroc.organization.name`
- `prov.formats.wrroc.organization.phone`
- `prov.formats.wrroc.organization.ror`
- `prov.formats.wrroc.license`
- `prov.formats.wrroc.publisher`

Refer to the [WRROC User Guide](https://www.researchobject.org/workflow-run-crate/) for more information about the associated RO-Crate entities.

Here is an example config:

```groovy
prov {
formats {
wrroc {
agent {
name = "John Doe"
orcid = "https://orcid.org/0000-0000-0000-0000"
email = "john.doe@example.org"
phone = "(0)89-99998 000"
contactType = "Researcher"
}
organization {
name = "University of XYZ"
ror = "https://ror.org/000000000"
}
license = "https://spdx.org/licenses/MIT"
publisher = "https://ror.org/000000000"
}
}
}
```
18 changes: 7 additions & 11 deletions plugins/nf-prov/build.gradle
@@ -56,21 +56,17 @@ sourceSets {

dependencies {
// This dependency is exported to consumers, that is to say found on their compile classpath.
compileOnly 'io.nextflow:nextflow:23.04.0'
compileOnly 'io.nextflow:nextflow:24.10.0'
compileOnly 'org.slf4j:slf4j-api:1.7.10'
compileOnly 'org.pf4j:pf4j:3.4.1'
// add here plugins depepencies
compileOnly 'org.pf4j:pf4j:3.12.0'

// test configuration
testImplementation "org.codehaus.groovy:groovy:3.0.8"
testImplementation "org.codehaus.groovy:groovy-nio:3.0.8"
testImplementation 'io.nextflow:nextflow:23.04.0'
testImplementation ("org.codehaus.groovy:groovy-test:3.0.8") { exclude group: 'org.codehaus.groovy' }
testImplementation 'io.nextflow:nextflow:24.10.0'
testImplementation ("cglib:cglib-nodep:3.3.0")
testImplementation ("org.objenesis:objenesis:3.1")
testImplementation ("org.spockframework:spock-core:2.0-M3-groovy-3.0") { exclude group: 'org.codehaus.groovy'; exclude group: 'net.bytebuddy' }
testImplementation ('org.spockframework:spock-junit4:2.0-M3-groovy-3.0') { exclude group: 'org.codehaus.groovy'; exclude group: 'net.bytebuddy' }
testImplementation ('com.google.jimfs:jimfs:1.1')
testImplementation ("org.objenesis:objenesis:3.2")
testImplementation ("org.spockframework:spock-core:2.3-groovy-4.0") { exclude group: 'org.codehaus.groovy'; exclude group: 'net.bytebuddy' }
testImplementation ('org.spockframework:spock-junit4:2.3-groovy-4.0') { exclude group: 'org.codehaus.groovy'; exclude group: 'net.bytebuddy' }
testImplementation ('com.google.jimfs:jimfs:1.2')

// see https://docs.gradle.org/4.1/userguide/dependency_management.html#sec:module_replacement
modules {