nf-core/airrflow
B-cell and T-cell Adaptive Immune Receptor Repertoire (AIRR) sequencing analysis pipeline using the Immcantation framework
Define where the pipeline should find input data and save output data.
Path to tab-separated (TSV) file containing information about the samples in the experiment.
string
^\S+\.tsv$
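As a rough illustration of what the pattern above accepts, the following Python sketch checks a few candidate samplesheet paths against it. The example paths and the helper name are hypothetical, and the pipeline's own validation may apply checks beyond this regular expression.

```python
import re

# Pattern copied from the schema entry above: no whitespace, must end in ".tsv".
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.tsv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path satisfies the schema pattern (hypothetical helper)."""
    return SAMPLESHEET_PATTERN.match(path) is not None


# Hypothetical example paths: only the first one has no whitespace and a .tsv suffix.
for candidate in ["samplesheet.tsv", "data/my samples.tsv", "samplesheet.csv"]:
    print(candidate, is_valid_samplesheet_path(candidate))
```

Paths containing spaces are rejected by this pattern, so quoting the path on the command line would not be enough to make them pass.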
Specify the processing mode for the pipeline. Available options are “fastq” and “assembled”.
string
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
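The email pattern above is stricter than general email syntax. A minimal Python sketch, using hypothetical addresses and a hypothetical helper name, shows which forms it accepts:

```python
import re

# Pattern copied from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern (hypothetical helper)."""
    return EMAIL_PATTERN.match(address) is not None


# Hypothetical addresses. The second uses "+", which is not in the allowed
# character class; the third has a final domain label longer than 5 letters.
for address in ["user@example.org", "user+tag@example.org", "user@example.technology"]:
    print(address, is_valid_email(address))
```

Addresses with plus-tags or long top-level domains are therefore rejected by this pattern even though they can be valid email addresses.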
Path to MiAIRR-BioSample mapping
string
${projectDir}/assets/reveal/mapping_MiAIRR_BioSample_v1.3.1.tsv
Experimental protocol used to generate the data
Protocol used for the V(D)J amplicon sequencing library generation.
string
Path to fasta file containing the linker sequence, if no V-region primers were used but a linker sequence is present (e.g. 5’ RACE SMARTer TAKARA protocol).
string
Define the primer region start and how to deal with the primer alignment.
Path to a fasta file containing the V-region primer sequences.
string
Path to a fasta file containing the C-region primer sequences.
string
Start position of V region primers (without counting the UMI barcode).
integer
Start position of C region primers (without counting the UMI barcode).
integer
Indicate if C region primers are in the R1 or R2 reads.
string
Specify to match the tail-end of the sequence against the reverse complement of the primers. This also reverses the behavior of the --start argument, such that the start position is relative to the tail-end of the sequence. (default: False) Maximum scoring error for the pRESTO MaskPrimers process for identifying the C and/or V region primers.
boolean
Define how UMI barcodes should be treated.
Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.
string
UMI barcode length in nucleotides. Set to 0 if no UMIs present.
integer
-1
UMI barcode start position in the index read.
integer
Indicate if UMI indices are recorded in a separate index file.
boolean
Options for adapter trimming and read clipping
Whether to trim adapters in fastq reads with fastp.
boolean
true
Fasta file with adapter sequences to be trimmed.
string
Number of bases to clip 5’ in R1 reads.
integer
Number of bases to clip 5’ in R2 reads.
integer
Number of bases to clip 3’ in R1 reads.
integer
Number of bases to clip 3’ in R2 reads.
integer
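The four clipping options above each remove a fixed number of bases from one end of a read. The sketch below shows the idea on a single read in plain Python. It is not how fastp is implemented, and the function name and example read are hypothetical.

```python
from typing import Tuple


def clip_read(sequence: str, quality: str, clip_5p: int = 0, clip_3p: int = 0) -> Tuple[str, str]:
    """Remove clip_5p bases from the 5' end and clip_3p bases from the 3' end.

    The quality string is clipped identically so both stay the same length.
    """
    if clip_5p < 0 or clip_3p < 0:
        raise ValueError("clip lengths must be non-negative")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    end = max(len(sequence) - clip_3p, clip_5p)  # never let the end cross the start
    return sequence[clip_5p:end], quality[clip_5p:end]


# Hypothetical 8-base read: clip 2 bases at the 5' end and 3 at the 3' end.
seq, qual = clip_read("ACGTACGT", "IIIIHHHH", clip_5p=2, clip_3p=3)
print(seq, qual)  # prints GTA IIH
```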
Trim adapters specific for Nextseq sequencing
boolean
Option to save trimmed reads.
boolean
Options for the pRESTO sequence assembly processes
Quality threshold for pRESTO FilterSeq sequence filtering.
integer
20
Maximum error for building the primer consensus in the pRESTO BuildConsensus step.
number
0.6
Maximum error for building the sequence consensus in the pRESTO BuildConsensus step.
number
0.1
Maximum gap for building the sequence consensus in the pRESTO BuildConsensus step.
number
0.5
Cluster sequences by similarity regardless of any annotation with pRESTO ClusterSets and annotate the cluster ID in addition to the UMI barcode.
boolean
true
Maximum allowed error for R1 primer alignment.
number
0.2
Maximum allowed error for R2 primer alignment.
number
0.2
Align primers instead of scoring them. Used for protocols without fixed primer positions.
boolean
Maximum allowed primer length when aligning the primers.
integer
50
Masking mode for R1 primers.
string
Masking mode for R2 primers.
string
Use MaskPrimers align for a 5’ RACE protocol.
boolean
Use when primer sequences are unknown but their approximate positions are known.
boolean
R1 primer extract length when using --maskprimers_extract.
integer
R2 primer extract length when using --maskprimers_extract.
integer
Use AssemblePairs sequential instead of AssemblePairs align when assembling read pairs.
boolean
Align internal C-region for a more precise isotype characterization.
boolean
Provide internal C-region sequences for a more precise C-region characterization. Then also set the align_cregion flag.
string
Maximum allowed length when aligning the internal C-region.
integer
100
Maximum allowed error when aligning the internal C-region.
number
0.3
Mask mode for C-region alignment.
string
tag
Skip the post-alignment filter step. This filter ensures that the locus matches the v_call chain, that the sequence alignment has at least 200 informative positions (excluding N or gaps), and that at most 10% of the nucleotides in the alignment are N.
boolean
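The three criteria of the post-alignment filter described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline's implementation: it assumes the chain is the first three characters of the V gene call (for example "IGH" from "IGHV1-2*01"), that gaps are written as "." or "-", and that the N fraction is computed over the full alignment length. The function name and example values are hypothetical.

```python
GAP_CHARS = {".", "-"}  # assumption: gap characters used in the alignment


def passes_alignment_filter(
    locus: str,
    v_call: str,
    sequence_alignment: str,
    min_informative: int = 200,
    max_n_fraction: float = 0.10,
) -> bool:
    """Return True if a record meets all three criteria described above."""
    seq = sequence_alignment.upper()
    if not seq:
        return False
    # Criterion 1: locus matches the chain of the V call (assumed 3-letter prefix).
    chain_matches = v_call[:3].upper() == locus.upper()
    # Criterion 2: enough informative positions (neither N nor a gap).
    informative = sum(1 for base in seq if base != "N" and base not in GAP_CHARS)
    # Criterion 3: N fraction over the whole alignment (assumed denominator).
    n_fraction = seq.count("N") / len(seq)
    return chain_matches and informative >= min_informative and n_fraction <= max_n_fraction


# Hypothetical record: 240 informative bases, no N, heavy-chain locus and V call.
print(passes_alignment_filter("IGH", "IGHV1-2*01", "ACGT" * 60))
```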
Options specific for raw single cell input.
Path to the reference directory required by cellranger. Can either be directory or tar.gz.
string
Options specific for raw unselected RNA-seq input.
Specifies which read holds the barcodes
string
Indicate if UMI indices are recorded in the R1 (default) or R2 fastq file.
string
Specifies where in the read the barcodes and UMIs can be found.
string
Options for the VDJ annotation processes.
Whether to reassign genes if the input file is an AIRR formatted tabulated file.
boolean
true
Subset to productive sequences.
boolean
true
Save databases so you can use the cache in future runs.
boolean
true
Path to the germline reference fasta.
string
https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/imgtdb_base.zip
Path to the cached igblast database.
string
https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/database-cache/igblast_base.zip
Set this flag to fetch the IMGT reference data at runtime.
boolean
Options for bulk sequence filtering after VDJ assignment.
Name of the field used to collapse duplicated sequences.
string
sample_id
Whether to run the process to detect contamination.
boolean
Whether to apply the chimera removal filter.
boolean
Define how the B-cell clonal trees should be calculated.
Set the clustering threshold Hamming distance value. Default: ‘auto’
string,number
auto
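Because the threshold accepts either the string 'auto' or a number, code that consumes it has to handle both cases. A minimal Python sketch of that dispatch follows. The helper name is hypothetical and this is not the pipeline's own parsing logic.

```python
from typing import Union


def parse_clonal_threshold(value: Union[str, int, float]) -> Union[str, float]:
    """Normalise a threshold given as 'auto' or as a numeric value (hypothetical helper)."""
    if isinstance(value, str) and value.strip().lower() == "auto":
        return "auto"
    threshold = float(value)  # raises ValueError for non-numeric strings
    if threshold < 0:
        raise ValueError("a Hamming distance threshold cannot be negative")
    return threshold


print(parse_clonal_threshold("auto"))  # prints auto
print(parse_clonal_threshold("0.1"))   # prints 0.1
```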
Perform clonal lineage tree analysis.
boolean
Name of the field used to group data files to identify clones.
string
subject_id
Name of the field used to identify external groups used to identify a clonal threshold.
string
subject_id
Lineage tree software to use to build trees within Dowser. If you change the default, also set the lineage_tree_exec parameter.
string
Path to lineage tree building executable.
string
/usr/local/bin/raxml-ng
Name of the field used to determine if a sample is single cell sequencing or not.
string
single_cell
Skip report of EnchantR DefineClones for all samples together.
boolean
Skip report of EnchantR FindThreshold for all samples together.
boolean
Skip all clonal analysis processes.
boolean
Options to generate BCR and TCR embeddings with Amulety
Generate a sequence amino acid translation with IgBlast.
boolean
Generate sequence embeddings with amulety.
string
BCR or TCR chains to include for embedding.
string
H
Use GPU to generate embeddings.
boolean
Custom report Rmarkdown file.
string
${projectDir}/assets/repertoire_comparison.Rmd
Custom report style file in css format.
string
${projectDir}/assets/nf-core_style.css
Custom logo for the report.
string
${projectDir}/assets/nf-core-airrflow_logo_light.png
Custom logo for the EnchantR reports.
string
${projectDir}/assets/nf-core-airrflow_logo_reports.png
Skip repertoire analysis and report generation.
boolean
Skip multiqc report.
boolean
Options for the reference genome indices used to align reads.
Do not load the iGenomes reference config.
boolean
true
The base path to the igenomes reference files
string
s3://ngi-igenomes/igenomes/
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Less common options for the pipeline, typically set in a config file.
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
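To show which size strings the pattern above accepts, here is a short Python sketch. The helper name and the example values other than the default "25.MB" are hypothetical.

```python
import re

# Pattern copied from the schema entry above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if the value satisfies the schema pattern (hypothetical helper)."""
    return SIZE_PATTERN.match(value) is not None


# "25.MB" is the documented default. "25mb" fails because the unit letters are
# case-sensitive, and "25" fails because the trailing "B" is required.
for value in ["25.MB", "25 MB", "1.5GB", "10 B", "25mb", "25"]:
    print(repr(value), is_valid_size(value))
```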
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Whether to validate parameters against the schema at runtime.
boolean
true
Base URL or local path to location of pipeline test dataset files
string
https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
string