Project structure

The templates directory

The templates directory in the Nextflow project root can be used to store template files.

├── templates
│   └── sayhello.sh
└── main.nf

Template files can be invoked like regular scripts from any process in your pipeline using the template function. Variables prefixed with the dollar character ($) are interpreted as Nextflow variables when the template file is executed by Nextflow.

See Template files for more information about using template files.
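Template files are not limited to shell scripts. As a minimal sketch, the following is a hypothetical Python template, saved as templates/sayhello.py (a file name assumed here for illustration, not shown in the tree above). It assumes the calling process declares an input named x and references the file with template 'sayhello.py' in its script block.

```python
#!/usr/bin/env python
# templates/sayhello.py (hypothetical template file)

# Nextflow replaces ${x} with the value of the process input `x`
# before the script runs. Run directly with Python, the placeholder
# is left as literal text.
name = "${x}"

greeting = f"Hello {name}!"
print(greeting)
```

Only the dollar-prefixed placeholder is touched by Nextflow; the braces in the Python f-string carry no dollar character and are left for Python to evaluate.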

The bin directory

The bin directory in the Nextflow project root can be used to store executable scripts.

├── bin
│   └── sayhello.py
└── main.nf

The bin directory allows binary scripts to be invoked like regular commands from any process in your pipeline without using an absolute path or modifying the PATH environment variable. Each script should include a shebang to specify the interpreter, and inputs should be supplied as arguments to the executable. For example:

#!/usr/bin/env python

import argparse

def main():
    parser = argparse.ArgumentParser(description="A simple argparse example.")
    parser.add_argument("name", type=str, help="Person to greet.")
    
    args = parser.parse_args()
    print(f"Hello {args.name}!")

if __name__ == "__main__":
    main()

Tip

Use env to resolve the interpreter’s location instead of hard-coding the interpreter path.

Binary scripts placed in the bin directory must have executable permissions. Use chmod to grant the required permissions. For example:

chmod a+x bin/sayhello.py
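Both requirements (a shebang line and the executable bit) are easy to overlook. As a minimal sketch, a pre-flight check along the following lines could flag scripts that would fail at run time. The helper name find_bin_problems is hypothetical and not part of Nextflow, and the check assumes a POSIX file system.

```python
import os
from pathlib import Path


def find_bin_problems(bin_dir="bin"):
    """Return (file name, problem) pairs for scripts that are unlikely to run.

    Hypothetical helper for illustration; not part of Nextflow.
    """
    problems = []
    for path in sorted(Path(bin_dir).iterdir()):
        if not path.is_file():
            continue
        # `chmod a+x` sets the execute bits; os.access reports whether
        # the current user is allowed to execute the file.
        if not os.access(path, os.X_OK):
            problems.append((path.name, "not executable"))
        # A shebang is the two bytes "#!" at the very start of the file.
        with path.open("rb") as handle:
            if handle.read(2) != b"#!":
                problems.append((path.name, "missing shebang"))
    return problems
```

Called from the project root as find_bin_problems("bin"), it returns an empty list when every script has a shebang and is executable.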

Binary scripts in the bin directory can then be invoked like regular commands.

process sayHello {
    
    input:
    val x

    output:
    stdout

    script:
    """
    sayhello.py $x
    """
}

workflow {
    Channel.of("Foo") | sayHello | view
}

Like modifying a process script, modifying the binary script will cause the task to be re-executed on a resumed run.
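The reasoning is that a resumed run can only reuse a cached task when everything that defines the task is unchanged. The sketch below illustrates the idea with a toy fingerprint. It is a deliberate simplification under stated assumptions, not Nextflow's actual task hash, and task_fingerprint is a hypothetical name.

```python
import hashlib


def task_fingerprint(process_script, bin_script_contents):
    """Toy cache key built from the process script and the bin scripts it calls.

    Illustrative only: Nextflow's real task hash covers more inputs and
    uses its own algorithm.
    """
    digest = hashlib.sha256()
    digest.update(process_script.encode())
    for content in bin_script_contents:
        digest.update(b"\0")  # separator so adjacent parts cannot blur together
        digest.update(content.encode())
    return digest.hexdigest()


process_script = "sayhello.py <args>"  # placeholder text for the process script
before = task_fingerprint(process_script, ['print("Hello!")'])
after = task_fingerprint(process_script, ['print("Hi!")'])

# Editing the bin script changes the fingerprint, so the cached result
# no longer matches and the task has to run again.
print(before != after)  # True
```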

Note

Binary scripts require either a local or shared file system for the pipeline work directory, or Wave containers when using cloud-based executors.

Warning

When using containers and the Wave service, Nextflow will send the project-level bin directory to the Wave service for inclusion as a layer in the container. Any change to a script in the bin directory changes the MD5 checksum of that layer and therefore the hash of the final container. Because the container identity is a component of the task hash, such a change forces every task in the workflow to be re-executed.

When using the Wave service, use module-specific bin directories instead. See Module binaries for more information.

The lib directory

The lib directory can be used to add utility code or external libraries without cluttering the pipeline scripts. The lib directory in the Nextflow project root is added to the classpath by default.

├── lib
│   └── DNASequence.groovy
└── main.nf

Classes or packages defined in the lib directory will be available in the execution context. Scripts or functions defined outside of classes will not be available in the execution context.

For example, lib/DNASequence.groovy defines the DNASequence class:

// lib/DNASequence.groovy
class DNASequence {
    String sequence

    // Constructor
    DNASequence(String sequence) {
        this.sequence = sequence.toUpperCase() // Ensure sequence is in uppercase for consistency
    }

    // Method to calculate melting temperature using the Wallace rule
    double getMeltingTemperature() {
        int g_count = sequence.count('G')
        int c_count = sequence.count('C')
        int a_count = sequence.count('A')
        int t_count = sequence.count('T')

        // Wallace rule calculation
        double tm = 4 * (g_count + c_count) + 2 * (a_count + t_count)
        return tm
    }

    String toString() {
        return "DNA[$sequence]"
    }
}

The DNASequence class is available in the execution context:

// main.nf
workflow {
    Channel.of('ACGTTGCAATGCCGTA', 'GCGTACGGTACGTTAC')
        .map { seq -> new DNASequence(seq) }
        .view { dna ->
            def meltTemp = dna.getMeltingTemperature()
            "Found sequence '$dna' with melting temperature ${meltTemp}°C"
        }
}

It returns:

Found sequence 'DNA[ACGTTGCAATGCCGTA]' with melting temperature 48.0°C
Found sequence 'DNA[GCGTACGGTACGTTAC]' with melting temperature 50.0°C
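
These figures can be verified by hand. Both sequences are 16 bases long: the first has 8 G/C and 8 A/T bases (4 × 8 + 2 × 8 = 48), and the second has 9 G/C and 7 A/T bases (4 × 9 + 2 × 7 = 50). As a cross-check outside Nextflow, the following sketch re-implements the same rule in Python. The function name wallace_tm is hypothetical and used only for this illustration.

```python
def wallace_tm(sequence):
    """Melting temperature in °C using the Wallace rule: 4 * (G + C) + 2 * (A + T)."""
    sequence = sequence.upper()  # mirror the normalization in DNASequence
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    return float(4 * gc + 2 * at)


for seq in ("ACGTTGCAATGCCGTA", "GCGTACGGTACGTTAC"):
    print(f"{seq}: {wallace_tm(seq)}")
# ACGTTGCAATGCCGTA: 48.0
# GCGTACGGTACGTTAC: 50.0
```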