Nextflow Scripting

This is part 4 of 14 of an Introduction to NextFlow.


mkdir scripting
cd scripting/

Language Basics

Below are a few basics of the Groovy syntax.

Printing values

Create a new file groovy.nf and add the following:

//groovy.nf
println("Hello, World!")

Then run it:

nextflow run groovy.nf

Output

N E X T F L O W  ~  version 21.04.3
Launching `groovy.nf` [dreamy_jepsen] - revision: 98447b37f7
Hello, World!

Comments

// This is a single line comment. Everything after the // is ignored.

/*
   Comments can also
   span multiple
   lines.
 */

Variables

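Groovy supports various types of data. Four common ones are:

  • int: whole numbers
  • float: floating-point numbers
  • Boolean: a value that is either true or false
  • String: text, represented as a sequence of characters

A more complete list can be found in the Groovy documentation.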


Add the following example data types to groovy.nf and run nextflow run groovy.nf:

//int − This is used to represent whole numbers.
my_int = 1

//float − This is used to represent floating-point numbers.
my_float = 3.1499392

//Boolean − This represents a Boolean value which can either be true or false.
my_bool = false

//String - These are text literals which are represented in the form of a chain of characters.
my_string = "chr1"

// A block of text that spans multiple lines can be defined by delimiting it with triple single `'''` or double quotes `"""`:
my_text = """
          This is a multi-line
          text using triple quotes.
          """

// To display the value of a variable to the screen in Groovy, we can use the `println` method passing the variable name as a parameter.
println(my_int)
println(my_float)
println(my_bool)
println(my_string)
println(my_text)

// String Interpolation
// To use a variable inside a single or multi-line double-quoted string "", prefix the variable name with a $ to show it should be interpolated.
println("processing chromosome $my_int")
println("value of pi is $my_float")

Output

N E X T F L O W  ~  version 21.04.3
Launching `groovy.nf` [angry_cajal] - revision: 6ebb8cff41
Hello, World!
1
3.1499392
false
chr1

         This is a multi-line
         text using triple quotes.

processing chromosome 1
value of pi is 3.1499392

Single-quoted strings do not support string interpolation: a variable name inside one is printed literally rather than replaced by its value.
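To make the difference concrete, here is a minimal sketch comparing the two quoting styles. The variable names chrom, double_quoted, and single_quoted are chosen for illustration and are not part of groovy.nf.

```groovy
// Hypothetical variable names, chosen for illustration only
chrom = "chr1"

// Double quotes: $chrom is replaced by the variable's value
double_quoted = "processing $chrom"

// Single quotes: the text is kept exactly as written
single_quoted = 'processing $chrom'

println(double_quoted)
println(single_quoted)
```

The first println shows processing chr1, while the second shows the literal text processing $chrom.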


def

A variable declared with the def keyword is local to the scope in which it is defined. Add the following lines to groovy.nf and run nextflow run groovy.nf:

def x = 'local_variable_def'
println(x)

Output

N E X T F L O W  ~  version 21.04.3
Launching `groovy.nf` [compassionate_marconi] - revision: 932bc80461
Hello, World!
1
3.1499392
false
chr1

         This is a multi-line
         text using triple quotes.

processing chromosome 1
value of pi is 3.1499392
local_variable_def
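The effect of def is easiest to see inside a closure. The sketch below assumes plain Groovy script behaviour; the names make_label and tag are hypothetical and do not appear elsewhere in this lesson.

```groovy
// Hypothetical closure and variable names, for illustration only
make_label = { name ->
    def tag = 'sample_'   // tag exists only inside this closure
    return tag + name
}

println(make_label('A'))

// Outside the closure, tag was never defined
try {
    println(tag)
} catch (MissingPropertyException e) {
    println('tag is not visible here')
}
```

The call make_label('A') returns sample_A. In a plain Groovy script the second println is expected to fail with a MissingPropertyException, which the catch block reports, because tag was declared with def inside the closure.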

Lists

Create a new lists.nf file; add the following and run nextflow run lists.nf:

kmers = [11,21,27,31]
// You can access a given item in the list with square-bracket notation []. These positions are numbered starting at 0, so the first element has an index of 0.
println(kmers[0])
// Lists can also be indexed with negative indexes, which count back from the end of the list
println(kmers[-1])
// The first three elements of the list can be retrieved using a range.
println(kmers[0..2])
// String interpolation for lists - To use an expression like `kmers[0..2]` inside a double-quoted String `""`, we use the `${expression}` syntax, similar to Bash/shell scripts
println("The first three elements in the Lists are: ${kmers[0..2]}")

Output

N E X T F L O W  ~  version 21.04.3
Launching `lists.nf` [intergalactic_minsky] - revision: 068f0e0a7c
11
31
[11, 21, 27]
The first three elements in the Lists are: [11, 21, 27]
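Negative indexes and ranges can also be combined. The following sketch reuses the kmers list from above; the variable names second_to_last, last_two, and all_but_first are illustrative only.

```groovy
kmers = [11, 21, 27, 31]

// Hypothetical variable names, for illustration only
second_to_last = kmers[-2]      // counts back from the end of the list
last_two = kmers[-2..-1]        // a range may use negative indexes too
all_but_first = kmers[1..-1]    // from index 1 up to the last element

println(second_to_last)
println(last_two)
println("all but the first: ${all_but_first}")
```

This prints 27, then [27, 31], then all but the first: [21, 27, 31].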

List Methods

Add the following to lists.nf and run nextflow run lists.nf:

// To get the length of the list
println(kmers.size())

// Inside a string, we need to use the ${} syntax
println("list size is:  ${kmers.size()}")

// To retrieve the item at a given index, use the get method
println(kmers.get(1))

Output

N E X T F L O W  ~  version 21.04.3
Launching `lists.nf` [fervent_meninsky] - revision: bf70a3d08a
11
31
[11, 21, 27]
The first three elements in the Lists are: [11, 21, 27]
4
list size is:  4
21

More common list methods
```
mylist = [1,2,3]
println mylist
println mylist + [1]
println mylist - [1]
println mylist * 2
println mylist.reverse()
println mylist.collect{ it+3 }
println mylist.unique().size()
println mylist.count(1)
println mylist.min()
println mylist.max()
println mylist.sum()
println mylist.sort()
println mylist.find{it%2 == 0}
println mylist.findAll{it%2 == 0}
```
Output
```
[1, 2, 3]
[1, 2, 3, 1]
[2, 3]
[1, 2, 3, 1, 2, 3]
[3, 2, 1]
[4, 5, 6]
3
1
1
3
6
[1, 2, 3]
2
[2]
```

Maps

Create a new maps.nf file; add the following and run nextflow run maps.nf:

roi = [ chromosome : "chr17", start: 7640755, end: 7718054, genes: ['ATP1B2','TP53','WRAP53']]

// Map values can be accessed with square-bracket syntax or with dot notation, as if the key were a property of the map. Note: with square brackets, the key is enclosed in quotes.
println(roi['chromosome'])

// Use a dot notation
println(roi.start)

// Use the get method
println(roi.get('genes'))

Output

N E X T F L O W  ~  version 21.04.3
Launching `maps.nf` [pedantic_hamilton] - revision: c289105bc9
chr17
7640755
[ATP1B2, TP53, WRAP53]
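Values retrieved from a map behave like any other value, so they can be used in arithmetic, indexed, or interpolated into strings. The sketch below reuses the roi map; region_size and second_gene are hypothetical names, and containsKey and keySet are standard map methods that were not used above.

```groovy
roi = [ chromosome : "chr17", start: 7640755, end: 7718054, genes: ['ATP1B2','TP53','WRAP53']]

// Hypothetical variable names, for illustration only
region_size = roi.end - roi.start   // arithmetic on two map values
second_gene = roi.genes[1]          // index into the list stored under 'genes'

println("${roi.chromosome}: ${region_size} bp, second gene ${second_gene}")

// Check whether a key exists, and list all keys
println(roi.containsKey('genome'))
println(roi.keySet())
```

Here region_size evaluates to 77299 and second_gene to TP53. The containsKey call returns false because no genome key has been added to this map.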

Adding or modifying map entries

Add the following to `maps.nf` and run `nextflow run maps.nf`:
```
// Use the square brackets
roi['chromosome'] = 'chr19'

// or

// Use a dot notation        
roi.chromosome = 'chr19'  

// Use the put method              
roi.put('genome', 'hg38')

println("Adding or modifying list results:")
println(roi['chromosome'])
println(roi['genome'])
```
Output
```groovy
N E X T F L O W  ~  version 21.04.3
Launching `maps.nf` [elegant_lumiere] - revision: c3c0dc5ad4
chr17
7640755
[ATP1B2, TP53, WRAP53]
Adding or modifying list results:
chr19
hg38
```

Closures

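A closure is a block of code encased in {} that can be assigned to a variable and passed as an argument to a method. Inside a closure, the implicit variable it refers to the single argument the closure receives:

square = { it * it }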

Create a new closures.nf file; add the following and run nextflow run closures.nf:

square = { it * it }
x = [ 1, 2, 3, 4 ]
println(x)
println(x.collect(square))

Output

N E X T F L O W  ~  version 21.04.3
Launching `closures.nf` [furious_mandelbrot] - revision: 73a27ee028
[1, 2, 3, 4]
[1, 4, 9, 16]

Closure parameters

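Instead of relying on the implicit it, a closure parameter can be given an explicit name using the -> syntax:

square = { num -> num * num }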
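As a quick check that the implicit and named forms behave the same, the sketch below defines both and calls them directly. The names square_implicit and square_named are illustrative only.

```groovy
// Hypothetical closure names, for illustration only
square_implicit = { it * it }          // uses the implicit parameter it
square_named = { num -> num * num }    // names the parameter explicitly

// A closure can be called like a function
println(square_implicit(4))
println(square_named(4))

// or passed to a method such as collect
squares = [1, 2, 3].collect(square_named)
println(squares)
```

Both direct calls print 16, and the collect call prints [1, 4, 9].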

Let’s define a closure to add the prefix chr to each element of the list in a Nextflow script.

Add the following to closures.nf and execute it using nextflow run closures.nf:

prefix = {"chr${it}"}
x = x.collect(prefix)
println x

Output

N E X T F L O W  ~  version 21.04.3
Launching `closures.nf` [exotic_swirles] - revision: 016a18e635
[1, 2, 3, 4]
[1, 4, 9, 16]
[chr1, chr2, chr3, chr4]

Multiple map parameters

A closure can also take several named parameters, separated by commas before the ->. Add the following to closures.nf and execute it using nextflow run closures.nf:

tp53 = [chromosome: "chr17", start: 7661779, end: 7687538, genome: 'GRCh38', gene: "TP53"]
// Perform subtraction of end and start coordinates
region_length = {start, end -> end - start }
// Add the region length to the map tp53
tp53.length = region_length(tp53.start, tp53.end)
println(tp53)

Output

N E X T F L O W  ~  version 21.04.3
Launching `closures.nf` [desperate_kalam] - revision: 9b078bd0b9
[1, 2, 3, 4]
[1, 4, 9, 16]
[chr1, chr2, chr3, chr4]
[chromosome:chr17, start:7661779, end:7687538, genome:GRCh38, gene:TP53, length:25759]

This shows that a closure can take several named parameters and be called like a function, with its result stored back in the map.
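The same pattern extends to more than two parameters. The sketch below uses a hypothetical closure named format_region, which is not part of closures.nf, to build a region string from three map values.

```groovy
tp53 = [chromosome: "chr17", start: 7661779, end: 7687538]

// Hypothetical closure with three named parameters, for illustration only
format_region = { chrom, start, end -> "${chrom}:${start}-${end}" }

region = format_region(tp53.chromosome, tp53.start, tp53.end)
println(region)
```

This prints chr17:7661779-7687538.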


Another example

In Nextflow, the method `each()` when applied to a map can take a closure with two arguments, to which it passes the key-value pair for each entry in the map object.

Add the following to `closures.nf` and execute it using `nextflow run closures.nf`:

```groovy
// Closure with two parameters
printMap = { a, b -> println "$a with value $b" }

// Map object
my_map = [ chromosome : "chr17", start : 1, end : 83257441 ]

// Each iterates through each element
my_map.each(printMap)
```

Output

```groovy
N E X T F L O W  ~  version 21.04.3
Launching `closures.nf` [cheesy_khorana] - revision: 68b4c2dbbb
[1, 2, 3, 4]
[1, 4, 9, 16]
[chr1, chr2, chr3, chr4]
[chromosome:chr17, start:7661779, end:7687538, genome:GRCh38, gene:TP53, length:25759]
chromosome with value chr17
start with value 1
end with value 83257441
```


Learn more about closures in the Groovy documentation.


Conditional Execution

If statement

if( < boolean expression > ) {
   // true branch
}
else {
   // false branch
}

The else branch is optional. Also, curly brackets are optional when the branch defines just a single statement.

Create a new file cond.nf; add the following and execute it using nextflow run cond.nf:

x = 12
if( x > 10 )
    println "$x is greater than 10"

// Null, empty strings, and empty collections are evaluated to false in Groovy

list = [1, 2, 3]

if( list )
  println list
else
  println 'The list is empty'

Output

N E X T F L O W  ~  version 21.04.3
Launching `cond.nf` [amazing_jennings] - revision: 24d52ebd9f
12 is greater than 10
[1, 2, 3]
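The rule that null, empty strings, and empty collections evaluate to false can be checked directly. The sketch below wraps an if statement in a hypothetical closure named is_truthy, which is not part of the script above.

```groovy
// Hypothetical helper closure, for illustration only
is_truthy = { value ->
    if( value )
        return true
    else
        return false
}

// Values that evaluate to false
println(is_truthy(null))
println(is_truthy(''))
println(is_truthy([]))

// Values that evaluate to true
println(is_truthy('chr1'))
println(is_truthy([1, 2, 3]))
```

The first three calls print false and the last two print true.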

Quick Recap

  • Nextflow is a Domain Specific Language (DSL) implemented on top of the Groovy programming language.
  • To define a variable, simply assign a value to it, e.g. a = 1.
  • Comments use the same syntax as in the C-family programming languages: // or multiline /* */.
  • Multiple values can be stored in lists [value1, value2, value3, …] or maps [chromosome: 1, start: 1].
  • Lists are indexed and sliced with square brackets (e.g., list[0] and list[2..9]).
  • A closure is an expression (block of code) encased in {} e.g. {it * it}.
