Nextflow Reporting
Logging and Provenance
The nextflow log command provides valuable information about a pipeline execution. This information can be used to track the provenance of a workflow result.
To list the run names and session IDs of previous executions, use the nextflow log command:
nextflow log
This command prints a summary of the execution log and runtime information for all executed pipelines. The summary includes the date and time of the run, duration, run name, run status, revision ID, session ID, and the command executed on the command line.
To get more detailed information about an individual run, append the run name or session ID to the log command:
nextflow log <run_name_or_session_ID>
For example:
nextflow log tiny_fermat
This will list the working directory for each process involved in the run.
Task ID
Each task in Nextflow is assigned a unique ID, generated as a 128-bit hash number. This hash is derived from a combination of the task’s:
- Input values
- Input files
- Command line string
- Container ID
- Conda environment
- Environment modules
- Any executed scripts in the bin directory
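To make the idea concrete, the sketch below hashes a few task attributes and maps the result to a work directory path. It is an illustration only: Nextflow computes its hash internally, and md5sum is used here merely as a stand-in that also produces 128 bits. The function names task_hash and hash_to_workdir and the sample attribute strings are hypothetical. The two-character/thirty-character split mirrors the work directory paths that nextflow log prints.

```bash
# Illustration only: md5sum stands in for Nextflow's internal hash function.
# Both produce 128 bits, i.e. 32 hexadecimal characters.

# Hash any number of task attributes (hypothetical stand-ins for the command
# line, input file names, container ID, ...).
task_hash() {
    printf '%s\n' "$@" | md5sum | cut -d ' ' -f 1
}

# Split a 32-character hash into the work/<2 chars>/<30 chars> layout.
hash_to_workdir() {
    prefix=$(printf '%s' "$1" | cut -c 1-2)
    rest=$(printf '%s' "$1" | cut -c 3-)
    printf 'work/%s/%s\n' "$prefix" "$rest"
}

# Changing any attribute (here the input file) yields a different hash,
# and therefore a different work directory.
h1=$(task_hash 'fastqc sample_1.fq' 'sample_1.fq' 'fastqc-container')
h2=$(task_hash 'fastqc sample_2.fq' 'sample_2.fq' 'fastqc-container')
hash_to_workdir "$h1"
hash_to_workdir "$h2"
```

Because the hash depends on every attribute in the list above, two tasks share a hash only when all of those attributes are identical.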
Additional Metadata with the Fields Option
The log command can also retrieve additional metadata. Using the -f (or -fields) option followed by a comma-separated list of fields, we can output specific details about each task in the pipeline.
For example:
nextflow log tiny_fermat -f 'process,exit,hash,duration'
This command prints the process name, exit status, hash, and duration of each task in the tiny_fermat run to the terminal.
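Because the selected fields are printed as plain columns, the output can be piped into standard text tools. The sketch below keeps only tasks with a non-zero exit status. It assumes the columns are tab-separated and appear in the order process, exit, hash, duration. The function name failed_tasks is hypothetical, and the sample rows are made up for demonstration, not real Nextflow output.

```bash
# Keep only rows whose second column (exit status) is not 0.
# Assumes tab-separated input with the columns process, exit, hash, duration.
failed_tasks() {
    awk -F '\t' '$2 != 0'
}

# Intended use, assuming a run named tiny_fermat exists:
#   nextflow log tiny_fermat -f 'process,exit,hash,duration' | failed_tasks

# Made-up rows, for demonstration only (not real Nextflow output):
printf 'fastqc\t0\tab/123456\t1s\nmultiqc\t1\tcd/789abc\t2s\n' | failed_tasks
```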
The complete list of available fields can be retrieved with the command:
nextflow log -l
attempt complete container cpus disk duration env error_action exit hash inv_ctxt log memory module name native_id pcpu peak_rss peak_vmem pmem process queue rchar read_bytes realtime rss scratch script start status stderr stdout submit syscr syscw tag task_id time vmem vol_ctxt wchar workdir write_bytes
Script
If we want a log of all the commands executed in the pipeline, we can use the script field. Note that the resulting output cannot be used to re-run the pipeline steps.
Filtering
The output from the log command can be very long. We can subset the output using the -F (filter) option to specify filtering criteria. This prints only those tasks matching a pattern, using the syntax =~ /pattern/.
For example, to filter for tasks whose process name matches fastqc, we would run:
nextflow log tiny_fermat -F 'process =~ /fastqc/'
/data/.../work/c1/56a36d8f498c99ac6cba31e85b3e0c
/data/.../work/f7/659c65ef60582d9713252bcfbcc310
This can be useful for locating the work directories of specific tasks.
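Once the work directories are known, the files Nextflow leaves in them can be inspected directly. Each task work directory holds a .command.sh file with the script that task ran. The helper below is a minimal sketch that reads directory paths, one per line, and prints each script; the function name show_task_scripts is hypothetical.

```bash
# Read work directory paths (one per line) on standard input and print the
# script that each task ran. Directories without a .command.sh are skipped.
show_task_scripts() {
    while IFS= read -r dir; do
        [ -f "$dir/.command.sh" ] || continue
        printf '== %s\n' "$dir"
        cat "$dir/.command.sh"
    done
}

# Intended use, assuming a run named tiny_fermat exists:
#   nextflow log tiny_fermat -F 'process =~ /fastqc/' | show_task_scripts
```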
Exercise
View run log
Use the Nextflow log command, specifying a run name and the fields name, hash, process and status.
Solution
nextflow log elegant_descartes -f name,hash,process,status
Exercise
Filter pipeline run log
Use the -F option and a regular expression to filter for a specific process.
Solution
nextflow log elegant_descartes -f name,hash,process,status -F 'process =~ /multiqc/'
Templates
The -t option allows a template (string or file) to be specified. This makes it possible to create a custom report in any text-based format.
For example, you could save this markdown snippet to a file named my-template.md:
## $name
script:
$script
exit status: $exit
task status: $status
task folder: $folder
Then, the following log command will output a markdown file containing the script, exit status and folder of all executed tasks:
nextflow log elegant_descartes -t my-template.md > execution-report.md
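If you create the template from the shell rather than an editor, take care that the shell does not expand $name, $script and the other placeholders itself. One way, sketched below, is a heredoc with a quoted delimiter, which writes the text verbatim so the placeholders survive for Nextflow to fill in. The file name my-template.md matches the command above.

```bash
# The quoted 'EOF' delimiter disables shell expansion inside the heredoc,
# so $name, $script, etc. are written to the file literally.
cat > my-template.md <<'EOF'
## $name
script:
$script
exit status: $exit
task status: $status
task folder: $folder
EOF

# Then, assuming a run named elegant_descartes exists:
#   nextflow log elegant_descartes -t my-template.md > execution-report.md
```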
Alternatively, the template file can be written in HTML.
For example:
<div>
<h2>${name}</h2>
<div>
Script:
<pre>${script}</pre>
</div>
<ul>
<li>Exit: ${exit}</li>
<li>Status: ${status}</li>
<li>Work dir: ${workdir}</li>
<li>Container: ${container}</li>
</ul>
</div>
By saving the above snippet in a file named template.html, you can run the following command:
nextflow log elegant_descartes -t template.html > provenance.html
To view the report, open it in a browser.
Exercise
Generate an HTML run report
Generate an HTML report for a run using the -t option and the template.html file.
Solution
nextflow log elegant_descartes -t template.html > provenance.html
Key Points
- Nextflow can produce a custom execution report with run information using the log command.
- You can generate a report using the -t option specifying a template file.