nf-core/ampliseq
Amplicon sequencing analysis workflow using DADA2 and QIIME2
Version: 2.7.0. The latest stable release is 2.14.0.
Path to tab-separated sample sheet
string
Path to ASV/OTU fasta file
string
Path to folder containing zipped FastQ files
string
Forward primer sequence
string
Reverse primer sequence
string
Path to metadata sheet. When missing, most downstream analyses are skipped (barplots, PCoA plots, …).
string
The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
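The pattern above can be tried out before launching a run. The following is a minimal Python sketch, not part of the pipeline; the function name `is_valid_email` is hypothetical, and only the regular expression itself is taken from this page.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address matches the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    for candidate in ("user@example.org", "user+tag@example.org", "user@example"):
        print(candidate, is_valid_email(candidate))
```

Note that the pattern is stricter than general email syntax: it rejects plus-addressing and top-level domains longer than five letters.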
If data has binned quality scores such as Illumina NovaSeq
boolean
If data is single-ended PacBio reads instead of Illumina
boolean
If data is single-ended IonTorrent reads instead of Illumina
boolean
If data is single-ended Illumina reads instead of paired-end
boolean
If analysing ITS amplicons or any other region with large length variability with Illumina paired end reads
boolean
If samples were sequenced in multiple sequencing runs
boolean
Naming of sequencing files
string
/*_R{1,2}_001.fastq.gz
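To see which file names the default naming pattern covers, the brace group can be expanded by hand. This is an illustrative Python sketch that assumes `{1,2}` means "either 1 or 2", as in common glob syntax; the helper names are hypothetical and the pipeline itself does not use this code.

```python
from fnmatch import fnmatchcase

# Default value shown above; the leading slash separates it from the folder path.
DEFAULT_NAMING = "/*_R{1,2}_001.fastq.gz"


def expand_braces(pattern: str) -> list[str]:
    """Expand a single {a,b,...} group into one pattern per option."""
    start = pattern.find("{")
    end = pattern.find("}", start)
    if start == -1 or end == -1:
        return [pattern]
    head, tail = pattern[:start], pattern[end + 1:]
    return [head + option + tail for option in pattern[start + 1:end].split(",")]


def matches_default_naming(file_name: str) -> bool:
    """Check a bare file name (no directory) against the default pattern."""
    patterns = expand_braces(DEFAULT_NAMING.lstrip("/"))
    return any(fnmatchcase(file_name, p) for p in patterns)


if __name__ == "__main__":
    for name in ("sampleA_R1_001.fastq.gz", "sampleA_1.fastq.gz"):
        print(name, matches_default_naming(name))
```

Files that do not follow this scheme, such as `sampleA_1.fastq.gz`, would need a different naming pattern.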
Set read count threshold for failed samples.
integer
1
Ignore input files with too few reads.
boolean
Cutadapt will retain untrimmed reads. Choose this only if input reads are not expected to contain primer sequences.
boolean
Sets the minimum overlap for valid matches of primer sequences with reads for cutadapt (-O).
integer
3
Sets the maximum error rate for valid matches of primer sequences with reads for cutadapt (-e).
number
0.1
Cutadapt will be run twice to ensure removal of potential double primers
boolean
Ignore files with too few reads after trimming.
boolean
DADA2 read truncation value for forward strand, set this to 0 for no truncation
integer
DADA2 read truncation value for reverse strand, set this to 0 for no truncation
integer
If --trunclenf and --trunclenr are not set, these values will be automatically determined using this median quality score
integer
25
Ensures that truncation values chosen with --trunc_qmin retain at least this fraction of reads.
number
0.75
DADA2 read filtering option
integer
2
DADA2 read filtering option
integer
50
DADA2 read filtering option
integer
Ignore files with too few reads after quality filtering.
boolean
Mode of sample inference: “independent”, “pooled” or “pseudo”
string
Not recommended: When paired end reads are not sufficiently overlapping for merging.
boolean
Name of supported database, and optionally also version number
string
Path to a custom DADA2 reference taxonomy database
string
Path to a custom DADA2 reference taxonomy database for species assignment
string
Comma separated list of taxonomic levels used in DADA2’s assignTaxonomy function
string
If the expected amplified sequences are extracted from the DADA2 reference taxonomy database
boolean
Newick file with reference phylogenetic tree. Also requires --pplace_aln and --pplace_model.
string
File with reference sequences. Also requires --pplace_tree and --pplace_model.
string
Phylogenetic model to use in placement, e.g. ‘LG+F’ or ‘GTR+I+F’. Also requires --pplace_tree and --pplace_aln.
string
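The three placement parameters --pplace_tree, --pplace_aln and --pplace_model each require the other two. A small pre-flight check along these lines can catch an incomplete set before a run is started. This is a hedged Python sketch; the function name and the dictionary layout are assumptions, and only the parameter names come from this page.

```python
# Parameters that must be supplied together, per the descriptions above.
PPLACE_GROUP = ("pplace_tree", "pplace_aln", "pplace_model")


def missing_pplace_params(params: dict) -> list[str]:
    """Return the placement parameters that are missing.

    An empty list means the group is either fully set or not used at all.
    """
    provided = [name for name in PPLACE_GROUP if params.get(name)]
    if not provided:
        return []  # phylogenetic placement not requested
    return [name for name in PPLACE_GROUP if not params.get(name)]


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(missing_pplace_params({"pplace_tree": "reference.newick"}))
```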
Method used for alignment, “hmmer” or “mafft”
string
Tab-separated file with taxonomy assignments of reference sequences.
string
A name for the run
string
Name of supported database, and optionally also version number
string
Path to QIIME2 trained classifier file (typically *-classifier.qza)
string
Name of supported database, and optionally also version number
string
Path to a custom Kraken2 reference taxonomy database (.tar.gz|.tgz archive or folder)
string
Comma separated list of taxonomic levels used in Kraken2. Will overwrite default values.
string
Confidence score threshold for taxonomic classification.
number
Name of supported database, and optionally also version number
string
If ASVs should be assigned to UNITE species hypotheses (SHs). Only relevant for ITS data.
boolean
Part of ITS region to use for taxonomy assignment: “full”, “its1”, or “its2”
string
Cutoff for partial ITS sequences. Only full sequences by default.
integer
Post-cluster ASVs with VSEARCH
boolean
Pairwise identity value used when post-clustering ASVs if the --vsearch_cluster option is used (default: 0.97).
number
0.97
Enable SSU filtering. Comma separated list of kingdoms (domains) in Barrnap, a combination (or one) of “bac”, “arc”, “mito”, and “euk”. ASVs whose lowest e-value falls in one of those kingdoms are kept.
string
Minimal ASV length
integer
Maximum ASV length
integer
Filter ASVs based on codon usage
boolean
Starting position of codon triplets
integer
1
Ending position of codon triplets
integer
Define stop codons
string
TAA,TAG
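The codon filtering options above (start position, end position and stop codon list) can be read as an in-frame scan for stop codons. The Python sketch below shows one plausible interpretation and is not the pipeline's implementation: the function name and the argument names `orf_start` and `orf_end` are hypothetical, positions are assumed to be 1-based, and a missing end position is assumed to mean "until the end of the sequence".

```python
def has_stop_codon(sequence, stop_codons="TAA,TAG", orf_start=1, orf_end=None):
    """Return True if an in-frame stop codon occurs in the scanned region.

    orf_start is 1-based (assumption); orf_end=None scans to the sequence end.
    """
    stops = {codon.strip().upper() for codon in stop_codons.split(",")}
    seq = sequence.upper()
    end = len(seq) if orf_end is None else min(orf_end, len(seq))
    begin = orf_start - 1  # convert the 1-based start to a 0-based index
    # Step through complete triplets only; a trailing partial codon is ignored.
    for i in range(begin, end - 2, 3):
        if seq[i:i + 3] in stops:
            return True
    return False


if __name__ == "__main__":
    print(has_stop_codon("ATGAAATAA"))            # frame 1 ends in TAA
    print(has_stop_codon("ATAAGG", orf_start=2))  # TAA only appears in frame 2
```

Under this reading, shifting the start position changes the reading frame, so the same sequence can pass in one frame and fail in another.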
Comma separated list of unwanted taxa, to skip taxa filtering use “none”
string
mitochondria,chloroplast
Abundance filtering
integer
1
Prevalence filtering
integer
1
Comma separated list of metadata column headers for statistics.
string
Comma separated list of metadata column headers for plotting average relative abundance barplots.
string
Formula for QIIME2 ADONIS metadata feature importance test for beta diversity distances
string
If the functional potential of the bacterial community is predicted.
boolean
If data should be exported in SBDI (Swedish biodiversity infrastructure) Excel format.
boolean
Minimum rarefaction depth for diversity analysis. Any sample below that threshold will be removed.
integer
500
Minimum sample counts to retain a sample for ANCOM analysis. Any sample below that threshold will be removed.
integer
1
Minimum taxonomy agglomeration level for taxonomic classifications
integer
2
Maximum taxonomy agglomeration level for taxonomic classifications
integer
6
Path to Markdown file (Rmd)
string
${projectDir}/assets/report_template.Rmd
Path to style file (css)
string
${projectDir}/assets/nf-core_style.css
Path to logo file (png)
string
${projectDir}/assets/nf-core-ampliseq_logo_light_long.png
String used as report title
string
Summary of analysis results
Path to Markdown file (md) that replaces the ‘Abstract’ section
string
Skip FastQC
boolean
Skip primer trimming with cutadapt. This is not recommended! Use only in case primer sequences were removed before and the data does not contain any primer sequences.
boolean
Skip quality check with DADA2. Can only be skipped when --trunclenf and --trunclenr are set.
boolean
Skip annotating SSU matches.
boolean
Skip all steps that are executed by QIIME2, including QIIME2 software download, taxonomy assignment by QIIME2, barplots, relative abundance tables, diversity analysis, differential abundance testing.
boolean
Skip taxonomic classification. Incompatible with --sbdiexport.
boolean
Skip taxonomic classification with DADA2
boolean
Skip species level when using DADA2 for taxonomic classification. This reduces the required memory dramatically under certain conditions. Incompatible with --sbdiexport.
boolean
Skip producing barplot
boolean
Skip producing any relative abundance tables
boolean
Skip alpha rarefaction
boolean
Skip alpha and beta diversity analysis
boolean
Skip differential abundance testing
boolean
Skip MultiQC reporting
boolean
Skip Markdown summary report
boolean
Less common options for the pipeline, typically set in a config file.
Specifies the random seed.
integer
100
Display help text.
boolean
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Whether to validate parameters against the schema at runtime.
boolean
true
Show all params when using --help
boolean
Validation of parameters fails when an unrecognised parameter is found.
boolean
Validation of parameters in lenient mode.
boolean
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Maximum amount of memory that can be requested for any single job.
string
128.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|d|day)\s*)+$
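The resource limits above combine an integer with two pattern-constrained strings. The following Python sketch checks candidate values against the patterns listed on this page. The function name and the returned list format are assumptions for illustration; the pipeline performs its own schema validation.

```python
import re

# Patterns copied verbatim from the memory and time limit descriptions above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")


def check_resource_limits(cpus, memory, time):
    """Return the names of the limits whose values do not fit the schema."""
    problems = []
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(cpus, int) or isinstance(cpus, bool):
        problems.append("cpus")
    if not isinstance(memory, str) or MEMORY_PATTERN.match(memory) is None:
        problems.append("memory")
    if not isinstance(time, str) or TIME_PATTERN.match(time) is None:
        problems.append("time")
    return problems


if __name__ == "__main__":
    print(check_resource_limits(16, "128.GB", "240.h"))  # the defaults listed above
    print(check_resource_limits(16, "128", "240.h"))     # memory lacks a unit
```

Both patterns are case-sensitive, so a value such as `128.gb` would not match.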
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string