Writing nf-core modules and subworkflows

If you decide to upload a module to nf-core/modules, it will become available to all nf-core pipelines and to everyone within the Nextflow community! See modules/ for examples.
Writing a new module
Before you start

Please check that the module you wish to add isn't already on nf-core/modules:

- Use the `nf-core modules list` command
- Check open pull requests
- Search open issues
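For the first check, the `nf-core modules list` command can be used roughly as follows (a sketch of typical usage; the keyword filter is optional):

```bash
# list all modules currently available on nf-core/modules
nf-core modules list remote

# narrow the listing down to a keyword, e.g. fastqc
nf-core modules list remote fastqc
```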
If the module doesn't exist on nf-core/modules:

- Please create a new issue before adding it
- Set an appropriate subject for the issue, e.g. `new module: fastqc`
- Add yourself to the `Assignees` so we can track who is working on the module
New module workflow

We have implemented a number of commands in the nf-core/tools package to make it incredibly easy for you to create and contribute your own modules to nf-core/modules.
1. Install any of Docker, Singularity or Conda

   If you use the conda package manager, you can set up a new environment and install all dependencies for the new module workflow in one step, and then proceed with Step 5.
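Such a one-step conda setup might look like the following (the environment name and exact package list are assumptions; adjust to the versions required in the steps below):

```bash
# create one environment containing Nextflow, nf-core/tools, nf-test and pre-commit
conda create -n nf-core-modules -c bioconda -c conda-forge nextflow nf-core nf-test pre-commit
conda activate nf-core-modules
```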
2. Install Nextflow (`>=21.04.0`)

3. Install the latest version of nf-core/tools (`>=2.7`)

4. Install nf-test

5. Set up pre-commit (it comes packaged with nf-core/tools; watch the pre-commit bytesize talk if you want to know more about it) to ensure that your code is linted and formatted correctly before you commit it to the repository
6. Set up git on your computer by adding a new git remote of the main nf-core git repo called `upstream`, then make a new branch for your module and check it out
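Assuming you have already forked and cloned nf-core/modules, this step might look like the following (the branch name is an example):

```bash
# add the main nf-core repository as a remote called "upstream"
git remote add upstream https://github.com/nf-core/modules.git

# create and check out a branch for the new module
git checkout -b new-module/fastqc
```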
7. Create a module using the nf-core DSL2 module template:
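Run from the root of your clone of nf-core/modules, module creation looks like this (using fastqc as an example tool name; the command prompts for details such as author and process label):

```bash
nf-core modules create fastqc
```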
All of the files required to add the module to nf-core/modules will be created/edited in the appropriate places. There are at most 3 files to modify:

- `./modules/nf-core/fastqc/main.nf`

  This is the main script containing the `process` definition for the module. You will see an extensive number of `TODO` statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for module submissions.

- `./modules/nf-core/fastqc/meta.yml`

  This file will be used to store general information about the module and author details - the majority of which will already be auto-filled. However, you will need to add a brief description of the files defined in the `input` and `output` sections of the main script, since these will be unique to each module. Its formatting and validity are checked against a JSON schema during linting (and in the pre-commit hook).

- `./modules/nf-core/fastqc/tests/main.nf.test`

  Every module MUST have a test workflow. This file will define one or more Nextflow `workflow` definitions that will be used to unit test the output files created by the module. By default, one `workflow` definition will be added, but please feel free to add as many as possible so we can ensure that the module works on different data types / parameters, e.g. separate `workflow`s for single-end and paired-end data.

  Minimal test data required for your module may already exist within the nf-core/modules repository, in which case you may just have to change a couple of paths in this file - see the Test data section for more info and guidelines for adding new standardised data if required.

  Refer to the section on writing nf-test tests for more information on how to write nf-tests.
8. Create a snapshot of the tests

   Note: See the nf-test docs if you would like to run the tests manually.
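Snapshots are generated by running the module's nf-test suite, e.g. (the exact command shape may vary between nf-core/tools versions):

```bash
# run the module tests and record the output snapshot
nf-core modules test fastqc
```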
9. Check that the new module you've added follows the module specifications
10. Lint the module locally to check that it adheres to nf-core guidelines before submission
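Local linting is done with the `nf-core modules lint` command, e.g.:

```bash
nf-core modules lint fastqc
```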
11. Once ready, the code can be pushed and a pull request (PR) created

On a regular basis you can pull upstream changes into this branch, and it is recommended to do so before pushing and creating a pull request. Rather than merging changes directly from upstream, the rebase strategy is recommended so that your changes are applied on top of the latest master branch from the nf-core repo. This can be performed as follows:
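A typical rebase against upstream looks like this (assuming your remote is named `upstream` as set up earlier):

```bash
git fetch upstream
git rebase upstream/master
```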
Once you are ready, you can push the code and create a PR. Once the PR has been accepted, you should delete the branch and check out master again.
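Pushing, and the cleanup after the PR has been merged, might look like this (the branch name is an example):

```bash
# push your branch to your fork and open a PR on GitHub
git push -u origin new-module/fastqc

# after the PR has been accepted:
git checkout master
git pull upstream master
git branch -d new-module/fastqc
```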
New subworkflow workflow

1. Set up git on your computer by adding a new git remote of the main nf-core git repo called `upstream`, then make a new branch for your subworkflow and check it out
2. Create a subworkflow using the nf-core DSL2 subworkflow template in the root of your clone of the nf-core/modules repository:
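Subworkflow creation mirrors module creation, e.g. (using bam_sort_stats_samtools as an example name):

```bash
nf-core subworkflows create bam_sort_stats_samtools
```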
All of the files required to add the subworkflow to nf-core/modules will be created/edited in the appropriate places. There are at most 3 files to modify:

- `./subworkflows/nf-core/bam_sort_stats_samtools/main.nf`

  This is the main script containing the `workflow` definition for the subworkflow. You will see an extensive number of `TODO` statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for subworkflow submissions.

- `./subworkflows/nf-core/bam_sort_stats_samtools/meta.yml`

  This file will be used to store general information about the subworkflow and author details. You will need to add a brief description of the files defined in the `input` and `output` sections of the main script, since these will be unique to each subworkflow.

- `./subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test`

  Every subworkflow MUST have a test workflow. This file will define one or more Nextflow `workflow` definitions that will be used to unit test the output files created by the subworkflow. By default, one `workflow` definition will be added, but please feel free to add as many as possible so we can ensure that the subworkflow works on different data types / parameters, e.g. separate `workflow`s for single-end and paired-end data.

  Minimal test data required for your subworkflow may already exist within the nf-core/modules repository, in which case you may just have to change a couple of paths in this file - see the Test data section for more info and guidelines for adding new standardised data if required.

  Refer to the section on writing nf-test tests for more information on how to write nf-tests.
3. Create a snapshot of the tests

   Note: See the nf-test docs if you would like to run the tests manually.
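As for modules, the snapshot is created by running the nf-test suite, e.g. (the exact command shape may vary between nf-core/tools versions):

```bash
nf-core subworkflows test bam_sort_stats_samtools
```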
4. Check that the new subworkflow you've added follows the subworkflow specifications
5. Lint the subworkflow locally to check that it adheres to nf-core guidelines before submission
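Local linting uses the `nf-core subworkflows lint` command, e.g.:

```bash
nf-core subworkflows lint bam_sort_stats_samtools
```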
6. Once ready, the code can be pushed and a pull request (PR) created

On a regular basis you can pull upstream changes into this branch, and it is recommended to do so before pushing and creating a pull request. Rather than merging changes directly from upstream, the rebase strategy is recommended so that your changes are applied on top of the latest master branch from the nf-core repo. This can be performed in the same way as described for the module workflow above.

Once you are ready, you can push the code and create a PR. Once the PR has been accepted, you should delete the branch and check out master again.
Test data

In order to test that each component added to nf-core/modules is actually working, and to be able to track any changes to results files between component updates, we have set up a number of GitHub Actions CI tests that run each module on a minimal test dataset using Docker, Singularity and Conda.

Please adhere to the test-data specifications when adding new test data.
If a new test dataset is added to `tests/config/test_data.config`, check that the config name of the added file(s) follows the scheme of the entire file name, with dots replaced by underscores. For example: the nf-core/test-datasets file `genomics/sarscov2/genome/genome.fasta` is labelled `genome_fasta`, and `genomics/sarscov2/genome/genome.fasta.fai` is labelled `genome_fasta_fai`.
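A sketch of how such entries might look inside `tests/config/test_data.config` (the surrounding structure and base-path variable are assumptions; the point is only the naming scheme):

```groovy
params {
    test_data {
        // config names mirror the file names, with dots replaced by underscores
        genome_fasta     = "${params.test_data_base}/genomics/sarscov2/genome/genome.fasta"
        genome_fasta_fai = "${params.test_data_base}/genomics/sarscov2/genome/genome.fasta.fai"
    }
}
```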
Using a stub test when required test data is too big

If the module absolutely cannot run using tiny test data, it is possible to add a stub-run to the test.yml. In this case it is required to test the module using larger-scale data and to document how this is done. In addition, an extra script block labelled `stub:` must be added, and this block must create dummy versions of all expected output files as well as the `versions.yml`. An example can be found in the ascat module.

In the `test.yml`, the `-stub-run` argument is given, as well as the md5sums for each of the files that are created in the stub block. This causes the stub code block to be activated when the unit test is run (see the ascat module for an example).
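A minimal sketch of what a `stub:` block can look like inside a process definition (`MYTOOL` and its outputs are hypothetical; the real ascat stub is more involved):

```nextflow
process MYTOOL {
    // ... inputs, outputs and the real script: block go here ...

    stub:
    """
    # create empty dummy versions of every expected output file
    touch ${prefix}.result.txt

    # the stub must also produce versions.yml
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        mytool: 1.0.0
    END_VERSIONS
    """
}
```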
Using a stub test when required test data is too big

If the subworkflow absolutely cannot run using tiny test data, it is possible to add a stub-run to the test.yml. In this case it is required to test the subworkflow using larger-scale data and to document how this is done. In addition, an extra script block labelled `stub:` must be added, and this block must create dummy versions of all expected output files as well as the `versions.yml`. An example can be found in the bam_sort_stats_samtools subworkflow.

In the `test.yml`, the `-stub-run` argument is given, as well as the md5sums for each of the files that are created in the stub block. This causes the stub code block to be activated when the unit test is run (see the bam_sort_stats_samtools subworkflow for an example).
Uploading to nf-core/modules

When you are happy with your pull request, please select the `Ready for Review` label on the GitHub PR tab, and provided that everything adheres to the nf-core guidelines, we will endeavour to approve your pull request as soon as possible. We also recommend requesting a review from the nf-core/modules-team so a core team of volunteers can try to review your PR as fast as possible.

Once you are familiar with the module submission process, please consider joining the reviewing team by asking on the #modules Slack channel.
Writing tests
nf-core components are tested using nf-test. See the page on writing nf-test tests for more information and examples.
Publishing results

Results are published using Nextflow's native `publishDir` directive, defined in the `modules.config` of a workflow (see here for an example).
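A sketch of such a `modules.config` entry (the selector, output path and mode are illustrative):

```nextflow
process {
    withName: 'FASTQC' {
        publishDir = [
            path: { "${params.outdir}/fastqc" },
            mode: 'copy',
        ]
    }
}
```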
Help

For further information or help, don't hesitate to get in touch on the Slack #modules channel (you can join with this invite).