nf-core/scflow
Please consider using/contributing to https://github.com/nf-core/scdownstream
Define where the pipeline should find input data and save output data.
The .tsv file specifying sample matrix filepaths.
string
./refs/Manifest.txt
The .tsv file specifying sample metadata.
string
./refs/SampleSheet.tsv
Optional .tsv file containing mappings between ensembl_gene_id values and gene_name values.
string
https://raw.githubusercontent.com/nf-core/test-datasets/scflow/assets/ensembl_mappings.tsv
Cell-type annotations reference file path.
string
https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/28033407/ctd_v1.zip
Optional .tsv file specifying manual revisions of cell-type annotations.
string
./conf/celltype_mappings.tsv
Optional list of genes of interest, in YAML format, for plotting gene expression.
string
./conf/reddim_genes.yml
Input sample species.
string
human
Outputs directory.
string
./results
Parameters for quality-control and thresholding.
The sample sheet column name with unique sample identifiers.
string
manifest
The sample sheet variables to treat as factors.
string
seqdate
Minimum library size (counts) per cell.
integer
250
Maximum library size (counts) per cell.
string
adaptive
Minimum features (expressive genes) per cell.
integer
100
Maximum features (expressive genes) per cell.
string
adaptive
Minimum proportion of counts mapping to ribosomal genes.
number
Maximum proportion of counts mapping to ribosomal genes.
number
1
Maximum proportion of counts mapping to mitochondrial genes.
string
adaptive
Minimum counts for gene expressivity.
integer
2
Minimum cells for gene expressivity.
integer
2
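The two expressivity settings above can be read together: a gene counts as expressive when at least the minimum number of cells each carry at least the minimum number of counts. The sketch below illustrates that reading only; the function name and the dictionary layout are hypothetical and are not taken from the pipeline.

```python
def expressive_genes(counts, min_counts=2, min_cells=2):
    """Return genes seen with >= min_counts counts in >= min_cells cells.

    `counts` maps a gene name to its per-cell counts (hypothetical layout).
    """
    kept = []
    for gene, per_cell in counts.items():
        # Count the cells in which this gene reaches the count threshold.
        n_cells = sum(1 for c in per_cell if c >= min_counts)
        if n_cells >= min_cells:
            kept.append(gene)
    return kept


example = {"A": [0, 2, 3], "B": [1, 1, 5], "C": [0, 0, 0]}
print(expressive_genes(example))
```

With the defaults of 2 and 2, gene B is dropped in this example because only one cell reaches two counts.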
Option to drop unmapped genes.
string
True
Option to drop mitochondrial genes.
string
True
Option to drop ribosomal genes.
string
false
The number of MADs for outlier detection.
number
4
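Thresholds set to adaptive are derived from the data rather than fixed. One common convention, assumed here purely for illustration, places the upper bound at the median plus the configured number of median absolute deviations (MADs). The function name is hypothetical, the MAD is left unscaled, and the pipeline's own computation may differ.

```python
from statistics import median


def adaptive_upper_threshold(values, nmads=4.0):
    """Upper bound = median + nmads * MAD (unscaled MAD; an assumption)."""
    med = median(values)
    # Median absolute deviation from the median.
    mad = median([abs(v - med) for v in values])
    return med + nmads * mad


library_sizes = [1, 2, 3, 4, 100]
limit = adaptive_upper_threshold(library_sizes)
outliers = [v for v in library_sizes if v > limit]
print(limit, outliers)
```

A larger number of MADs widens the accepted range and flags fewer cells as outliers.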
Options for profiling ambient RNA/empty droplets.
Enable ambient RNA / empty droplet profiling.
string
true
Upper UMI counts threshold for true cell annotation.
string
auto
Lower UMI counts threshold for empty droplet annotation.
integer
100
The maximum FDR for the emptyDrops algorithm.
number
0.001
Number of Monte Carlo p-value iterations.
integer
10000
Expected number of cells per sample.
integer
3000
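The thresholds above can be combined into a per-barcode decision. The sketch below assumes the usual emptyDrops convention: barcodes at or below the lower threshold are treated as empty, barcodes at or above the upper threshold are kept as cells, and everything in between is called by its FDR. The function and argument names are hypothetical, and the upper threshold is assumed to have already been resolved from auto to a number.

```python
def classify_barcode(total_umis, fdr, lower=100, max_fdr=0.001, retain=None):
    """Label a barcode as 'cell' or 'empty' (illustrative only)."""
    # At or below the lower threshold: assumed to be an empty droplet.
    if total_umis <= lower:
        return "empty"
    # At or above the upper threshold: kept as a true cell regardless of FDR.
    if retain is not None and total_umis >= retain:
        return "cell"
    # Otherwise the call depends on the FDR from the Monte Carlo test.
    if fdr is not None and fdr <= max_fdr:
        return "cell"
    return "empty"


print(classify_barcode(500, 0.0005))
```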
Parameters for identifying singlets/doublets/multiplets.
Enable doublet/multiplet identification.
string
true
Algorithm to use for doublet/multiplet identification.
string
doubletfinder
Variables to regress out for dimensionality reduction.
string
nCount_RNA,pc_mito
Number of PCA dimensions to use.
integer
10
The top n most variable features to use.
integer
2000
A fixed doublet rate.
number
Doublets per thousand cells increment.
integer
8
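The doublets-per-thousand-cells increment implies a doublet rate that grows with the number of cells loaded. A minimal sketch, assuming the rate rises by the increment divided by 1000 for every thousand cells and that a supplied fixed rate takes precedence; the function name and the fixed_rate argument are hypothetical.

```python
def expected_doublet_rate(n_cells, dpk=8, fixed_rate=None):
    """Estimate the doublet rate as a fraction of cells (illustrative)."""
    # A fixed rate, when given, overrides the per-thousand increment.
    if fixed_rate is not None:
        return fixed_rate
    # dpk doublets per 1000 cells, scaled by thousands of cells captured.
    return dpk * n_cells / 1_000_000


print(expected_doublet_rate(3000))
```

Under these assumptions, 3000 cells with an increment of 8 give a rate of about 2.4%.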
Specify a pK value instead of parameter sweep.
number
0.02
Parameters used in the merged quality-control report.
Numeric variables for inter-sample metrics.
string
total_features_by_counts,total_counts,pc_mito,pc_ribo
Categorical variables for further sub-setting of plots.
string
NULL
Numeric variables for outlier identification.
string
total_features_by_counts,total_counts
Parameters for integrating datasets and batch correction.
Choice of integration method.
string
Liger
Unique sample identifier variable.
string
manifest
Fill out matrices with union of genes.
string
false
Remove non-expressing cells/genes.
string
true
Number of genes to find for each dataset.
integer
3000
How to combine variable genes across experiments.
string
union
Keep unique genes.
string
false
Capitalize gene names to match homologous genes.
string
false
Treat each column as a cell.
string
true
Inner dimension of factorization (n factors).
integer
30
Regularization parameter.
number
5
Convergence threshold.
number
0.0001
Maximum number of block coordinate descent iterations.
integer
100
Number of restarts to perform.
integer
1
Random seed for reproducible results.
integer
1
Number of nearest neighbours for within-dataset knn graph.
integer
20
Horizon parameter for shared nearest factor graph.
integer
500
Minimum allowed edge weight.
number
0.2
Name of dataset to use as a reference.
string
NULL
Minimum number of cells to consider a cluster shared across datasets.
integer
2
Number of quantiles to use for normalization.
integer
50
Number of times to perform Louvain community detection.
integer
10
Controls the number of communities detected.
integer
1
Indices of factors to use for shared nearest factor determination.
string
NULL
Distance metric to use in calculating nearest neighbour.
string
CR
Center the data when scaling factors.
string
false
Small cluster extraction cells threshold.
integer
Categorical variables for integration report metrics.
string
individual,diagnosis,region,sex
Reduced dimension embedding for the integration report.
string
UMAP
Settings for dimensionality reduction algorithms.
Input matrix for dimension reduction.
string
PCA,Liger
Dimension reduction outputs to generate.
string
tSNE,UMAP,UMAP3D
Variables to regress out before dimension reduction.
string
nCount_RNA,pc_mito
Number of PCA dimensions.
integer
30
Number of nearest neighbours to use.
integer
35
The dimension of the space to embed into.
integer
2
Type of initialization for the coordinates.
string
Distance metric for finding nearest neighbours.
string
Number of epochs to use during optimization of embedded coordinates.
integer
200
Initial learning rate used in optimization of coordinates.
integer
1
Effective minimum distance between embedded points.
number
0.4
Effective scale of embedded points.
number
0.85
Interpolation to combine local fuzzy sets.
number
1
Local connectivity required.
integer
1
Weighting applied to negative samples in embedding optimization.
integer
1
Number of negative edge samples to use per positive edge sample.
integer
5
Use fast SGD.
string
false
Output dimensionality.
integer
2
Number of dimensions retained in the initial PCA step.
integer
50
Perplexity parameter.
integer
150
Speed/accuracy trade-off.
number
0.5
Iteration after which perplexities are no longer exaggerated.
integer
250
Iteration after which the final momentum is used.
integer
250
Number of iterations.
integer
1000
Center data before PCA.
string
true
Scale data before PCA.
string
false
Normalize data before distance calculations.
string
true
Momentum used in the first part of optimization.
number
0.5
Momentum used in the final part of optimization.
number
0.8
Learning rate.
integer
1000
Exaggeration factor used in the first part of the optimization.
integer
12
Parameters used to tune Louvain/Leiden clustering.
Clustering method.
string
leiden
Reduced dimension input(s) for clustering.
string
UMAP_Liger
The resolution of clustering.
number
0.001
Integer number of nearest neighbours for clustering.
integer
50
The number of iterations for clustering.
integer
1
Parameters used for cell-type annotation and the associated report.
SingleCellExperiment clusters colData variable name.
string
clusters
Max cells to sample.
integer
10000
A sample metadata unique sample ID.
string
individual
SingleCellExperiment cell-type colData variable name.
string
cluster_celltype
Cell-type metrics for categorical variables.
string
manifest,diagnosis,sex,capdate,prepdate,seqdate
Cell-type metrics for numeric variables.
string
pc_mito,pc_ribo,total_counts,total_features_by_counts
Number of top marker genes for plot/table generation.
integer
5
Parameters for differential gene expression.
Differential gene expression method.
string
MASTZLM
MAST method.
string
Expressive gene minimum counts.
integer
1
Expressive gene minimum cells fraction.
number
0.1
Re-scale numeric covariates.
string
true
Pseudobulked differential gene expression.
string
false
Cell-type annotation variable name.
string
cluster_celltype
Unique sample identifier variable.
string
manifest
Dependent variable of DGE model.
string
group
Reference class of categorical dependent variable.
string
Control
Confounding variables.
string
cngeneson,seqdate,pc_mito
Random effect confounding variable.
string
NULL
Fold-change threshold for plotting.
number
1.1
Adjusted p-value cutoff.
number
0.05
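The fold-change threshold and adjusted p-value cutoff can be applied together when deciding which genes to highlight. The sketch below assumes the fold-change threshold is given on the linear scale and applied symmetrically to log2 fold-changes; the function and argument names are hypothetical.

```python
import math


def is_highlighted(log2_fc, padj, fc_threshold=1.1, pval_cutoff=0.05):
    """True when a gene passes both cutoffs (illustrative only)."""
    # Convert the linear fold-change threshold to the log2 scale.
    min_abs_log2_fc = math.log2(fc_threshold)
    return padj <= pval_cutoff and abs(log2_fc) >= min_abs_log2_fc


print(is_highlighted(0.5, 0.01))    # passes both cutoffs
print(is_highlighted(0.05, 0.01))   # fold-change too small
```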
Force model fit for non-full rank.
string
false
Maximum CPU cores.
string
'null'
Parameters for impacted pathway analysis of differentially expressed genes.
Pathway enrichment tool(s) to use.
string
Enrichment method.
string
ORA
Database(s) to use for enrichment.
string
GO_Biological_Process
Parameters for Dirichlet modeling of relative cell-type proportions.
Unique sample identifier.
string
individual
Cell-type annotation variable name.
string
cluster_celltype
Dependent variable of Dirichlet model.
string
group
Reference class of categorical dependent variable.
string
Control
Dependent variable classes order.
string
Control,Low,High
General parameters for plotting.
Preferred embedding for plots.
string
UMAP_Liger
Point size for reduced dimension plots.
number
0.1
Alpha (transparency) value for reduced dimension plots.
number
0.2
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional configs hostname.
string
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Maximum amount of memory that can be requested for any single job.
string
256.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|day)\s*)+$
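The memory and time limits are validated against the two patterns listed above. The sketch below checks candidate values against those patterns before a run; the helper names are hypothetical, and only the regular expressions come from this page.

```python
import re

# Patterns copied from the memory and time limit entries above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def is_valid_memory(value):
    """True when the value looks like '256.GB' or '8GB'."""
    return MEMORY_PATTERN.match(value) is not None


def is_valid_time(value):
    """True when the value looks like '240.h' or '2day 6h'."""
    return TIME_PATTERN.match(value) is not None


for candidate in ["256.GB", "8GB", "256 gigabytes"]:
    print(candidate, is_valid_memory(candidate))
for candidate in ["240.h", "2day 6h", "240"]:
    print(candidate, is_valid_time(candidate))
```

A bare number such as 240 fails the time pattern because a unit (s, m, h, or day) is required.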
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Method used to save pipeline results to output directory.
string
Email address for a completion summary, sent only when the pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Do not use coloured log outputs.
boolean
Directory to keep pipeline Nextflow logs and reports.
string
${params.outdir}/pipeline_info
Whether to validate parameters against the schema at runtime.
boolean
true
Show all params when using --help
boolean
Run this workflow with Conda. You can also use ‘-profile conda’ instead of providing this parameter.
boolean
Force the workflow to pull and convert Docker containers rather than directly downloading Singularity images.
boolean
E-mail address for optional workflow completion notification.
string
Send plain-text email instead of HTML.
boolean
NA
string