pangenome: Parameters

Define where the pipeline should find input data and save output data.

Path to BGZIPPED input FASTA to build the pangenome graph from.

required

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$
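The pattern above can be checked before launching a run. The following is a minimal sketch in Python; the helper name `is_valid_input_path` and the example file names are illustrative assumptions, not part of the pipeline.

```python
import re

# Pattern copied verbatim from the parameter description above.
INPUT_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_valid_input_path(path: str) -> bool:
    """Return True if the path matches the schema pattern for the input FASTA."""
    return INPUT_PATTERN.search(path) is not None


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise the pattern.
    for name in ("samples.fa.gz", "samples.fna", "samples.fastq.gz"):
        print(name, is_valid_input_path(name))
```

Note that the pattern constrains only the file name: a plain `.fa` name also passes, so a successful match does not prove that the file is actually bgzip-compressed.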

The number of haplotypes in the input FASTA.

required

type: number

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
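To see which addresses the email pattern accepts, it can be tried out directly. This is a small sketch in Python; the function name `is_valid_email` and the sample addresses are assumptions made for illustration.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address matches the schema's email pattern."""
    return EMAIL_PATTERN.search(address) is not None


if __name__ == "__main__":
    # Hypothetical addresses, used only to exercise the pattern.
    for address in ("jane.doe@example.org", "jane@example.museum"):
        print(address, is_valid_email(address))
```

The final group allows only two to five letters, so an address whose last domain label is longer than that, or whose local part contains a `+`, is rejected by this pattern.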

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Options for the all-versus-all alignment phase.

Percent identity in the wfmash mashmap step.

type: number

default: 90

Segment length for mapping.

type: string

default: 5000

Minimum block length filter for mapping.

type: string

Kmer size for mashmap.

type: integer

default: 19

Ignore the top % most-frequent kmers.

type: number

default: 0.001

Keep this fraction of mappings (auto for giant component heuristic).

type: string

default: 1.0

pattern: (auto|[01]\.\d+)
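The pattern above is not anchored and would, for example, also accept `1.5`. A stricter check can be sketched in Python as follows; the function name `parse_mapping_fraction` and the additional range check (greater than 0, at most 1) are assumptions of this example, not behaviour confirmed for the pipeline.

```python
import re
from typing import Union

# Pattern copied from the parameter description above.
FRACTION_PATTERN = re.compile(r"(auto|[01]\.\d+)")


def parse_mapping_fraction(value: str) -> Union[str, float]:
    """Return 'auto' or a fraction in (0, 1]; raise ValueError otherwise."""
    # fullmatch anchors the pattern to the whole string.
    if FRACTION_PATTERN.fullmatch(value) is None:
        raise ValueError(f"expected 'auto' or a decimal such as 0.5, got {value!r}")
    if value == "auto":
        return "auto"
    fraction = float(value)
    # Assumed range check: a fraction of mappings should lie in (0, 1].
    if not 0.0 < fraction <= 1.0:
        raise ValueError(f"fraction must be in (0, 1], got {fraction}")
    return fraction
```

Because the pattern requires a decimal point, the default has to be written as `1.0` rather than `1`.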

Merge successive mappings.

type: boolean

Disable splitting of input sequences during mapping.

hidden

type: boolean

Skip mappings between sequences that share the same name prefix before the given delimiter character. This can be helpful if several sequences originate from the same chromosome. It is recommended that sequence names follow the PanSN-spec (https://github.com/pangenome/PanSN-spec). Future versions of the pipeline will require sequence names to follow this specification.

type: string

Set the directory where temporary files should be stored. Since everything runs in containers, we don’t usually set this argument.

hidden

type: string

The number of files to generate from the approximate wfmash mappings to scale across a whole cluster. It is recommended to set this to the number of available nodes. If only one machine is available, leave it at 1.

type: integer

default: 1

If this parameter is set, only the wfmash alignment step of the pipeline is executed. This option is offered for users who want to run wfmash on a cluster.

type: boolean

Filter out mappings that are unlikely to be within this Average Nucleotide Identity (ANI) difference of the best mapping.

type: integer

default: 30

Number of mappings for each segment. [default: n_haplotypes - 1].

type: integer

Ignores exact matches below this length.

type: integer

default: 23

Number of base pairs to use for transitive closure batch.

type: string

default: 10000000

Keep this randomly selected fraction of input matches.

type: number

Set the directory where temporary files should be stored. Since everything runs in containers, we don’t usually set this argument.

hidden

type: string

Input PAF file. The wfmash alignment step is skipped.

type: string

Options for graph smoothing phase.

Skip the graph smoothing step of the pipeline.

type: boolean

Maximum path jump to include in the block.

hidden

type: integer

Maximum edge jump before a block is broken.

hidden

type: integer

Maximum sequence length to put into POA, given as a comma-separated list of integers. SMOOTHXG will be executed once for each integer.

type: string

default: 700,900,1100
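The comma-separated list described above corresponds to one SMOOTHXG pass per value. A minimal Python sketch of how such a string could be split into the per-pass lengths is shown here; the function name `parse_poa_lengths` is hypothetical, and the requirement that every length be positive is an assumption of this example.

```python
from typing import List


def parse_poa_lengths(value: str) -> List[int]:
    """Split a comma-separated string into a list of positive integers."""
    lengths = [int(part) for part in value.split(",")]
    # Assumed sanity check: a maximum sequence length should be positive.
    if any(length <= 0 for length in lengths):
        raise ValueError(f"all lengths must be positive, got {lengths}")
    return lengths


if __name__ == "__main__":
    # With the default value there would be three smoothing passes.
    for pass_number, length in enumerate(parse_poa_lengths("700,900,1100"), start=1):
        print(f"pass {pass_number}: maximum POA sequence length {length}")
```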

Minimum edit-based identity to cluster sequences.

hidden

type: string

Minimum ‘smallest / largest’ sequence length ratio to cluster in a block.

hidden

type: integer

Path depth at which we don’t pad the POA problem.

type: integer

default: 100

Pad each end of each sequence in POA with ‘smoothxg_poa_padding * longest_poa_seq’ base pairs.

type: number

default: 0.001

Score parameters for POA in the form of ‘match,mismatch,gap1,ext1,gap2,ext2’. It may also be given as presets: ‘asm5’, ‘asm10’, ‘asm15’, ‘asm20’. [default: 1,19,39,3,81,1 = asm5].

type: string

default: 1,19,39,3,81,1
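The two accepted forms of this value, a preset name or six comma-separated scores, can be told apart with a few lines of Python. This is only a sketch: the names `PoaScores` and `parse_poa_params` are hypothetical, and the numeric values behind presets other than asm5 are not given above, so presets are returned unchanged rather than expanded.

```python
from dataclasses import dataclass
from typing import Union

# Preset names listed in the parameter description above.
PRESETS = ("asm5", "asm10", "asm15", "asm20")


@dataclass(frozen=True)
class PoaScores:
    """Scores in the order 'match,mismatch,gap1,ext1,gap2,ext2'."""
    match: int
    mismatch: int
    gap1: int
    ext1: int
    gap2: int
    ext2: int


def parse_poa_params(value: str) -> Union[str, PoaScores]:
    """Return the preset name unchanged, or the six scores as PoaScores."""
    value = value.strip()
    if value in PRESETS:
        return value
    parts = value.split(",")
    if len(parts) != 6:
        raise ValueError(f"expected a preset or 6 comma-separated scores, got {value!r}")
    return PoaScores(*(int(part) for part in parts))
```

For the default string `1,19,39,3,81,1` this yields `PoaScores(match=1, mismatch=19, gap1=39, ext1=3, gap2=81, ext2=1)`, which the description above equates with the asm5 preset.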

Write MAF output representing merged POA blocks.

type: boolean

Use this prefix for consensus path names.

hidden

type: string

default: Consensus_

Set the directory where temporary files should be stored. Since everything runs in containers, we don’t usually set this argument.

hidden

type: string

Keep intermediate graphs during SMOOTHXG.

hidden

type: boolean

Run abPOA. [default: SPOA].

type: boolean

Run the POA in global mode. [default: local mode].

type: boolean

Number of CPUs for the potentially very memory expensive POA phase of SMOOTHXG. Default is ‘task.cpus’.

type: integer

Options for calling variants against reference(s).

Specify a set of VCFs to produce with --vcf_spec "REF[:LEN][,REF[:LEN]]*".

type: string
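The `REF[:LEN][,REF[:LEN]]*` grammar above can be made concrete with a small parser. The sketch below is in Python; the function name `parse_vcf_spec` and the reference names in the usage example are hypothetical, and the code only splits the string without interpreting what LEN means to the pipeline.

```python
from typing import List, Optional, Tuple


def parse_vcf_spec(spec: str) -> List[Tuple[str, Optional[int]]]:
    """Split 'REF[:LEN][,REF[:LEN]]*' into (REF, LEN or None) pairs."""
    entries = []
    for item in spec.split(","):
        ref, separator, length = item.partition(":")
        if not ref:
            raise ValueError(f"missing reference name in {item!r}")
        if separator and not length:
            raise ValueError(f"missing length after ':' in {item!r}")
        entries.append((ref, int(length) if separator else None))
    return entries


if __name__ == "__main__":
    # Hypothetical reference names, used only for illustration.
    print(parse_vcf_spec("refA:1000,refB"))
```

Each resulting pair would correspond to one VCF to produce, with the optional LEN carried along as an integer when present.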

Options to run the partition algorithm for community detection.

Enable community detection.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
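The size pattern above accepts several spellings of the same limit. The short Python sketch below shows which strings pass; the function name `is_valid_size` and the sample values other than the default are assumptions for illustration.

```python
import re

# Pattern copied verbatim from the parameter description above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if the value matches the schema's file size pattern."""
    return SIZE_PATTERN.search(value) is not None


if __name__ == "__main__":
    for value in ("25.MB", "25 MB", "1.5GB", "25", "25.mb"):
        print(repr(value), is_valid_size(value))
```

The unit is case-sensitive and the trailing `B` is mandatory, so a bare number or a lowercase unit does not match.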

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service.

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Base URL or local path to the location of the pipeline test dataset files.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Do we want to display hidden parameters?

hidden

type: boolean

Do we want to display hidden parameters?

hidden

type: string

default: igenomes_base

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string

nf-core/pangenome