The wf-illumina-nf pipeline

This pipeline performs quality control (QC) of data from Illumina sequencers.

How to use it?

The pipeline starts after the NGS_Illumina pipeline, whose final step demultiplexes the raw data. Five elements are needed in the demultiplexing output directory:

  • one folder of FASTQ files per project
  • the SampleSheet.csv
  • the nextflow outputs folder
  • the params.config file
  • the fastqScreen configuration file
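
The list above can be turned into a quick pre-flight check. The following is only a minimal sketch, not part of the pipeline: the function name `check_demux_dir` is hypothetical, the folder name `nextflow` is inferred from the launch command, the default fastqScreen file name `fastq_screen.conf` is an assumption, and it assumes the FASTQ files sit directly inside each project folder.

```shell
#!/usr/bin/env bash
# Sketch: report which expected elements are missing from a demultiplexing
# output directory. Only SampleSheet.csv and params.config are names taken
# from this README; the other names are assumptions.
check_demux_dir() {
    local dir="$1"
    local fastqscreen_conf="${2:-fastq_screen.conf}"  # hypothetical default name
    local missing=0
    local found

    [ -f "$dir/SampleSheet.csv" ]   || { echo "missing: SampleSheet.csv"; missing=1; }
    [ -f "$dir/params.config" ]     || { echo "missing: params.config"; missing=1; }
    [ -d "$dir/nextflow" ]          || { echo "missing: nextflow/ folder"; missing=1; }
    [ -f "$dir/$fastqscreen_conf" ] || { echo "missing: $fastqscreen_conf"; missing=1; }

    # Look for FASTQ files one level down, ignoring the nextflow/ folder
    # (assumption: FASTQ files are stored directly in each project folder).
    found=$(find "$dir" -mindepth 2 -maxdepth 2 -type f -name '*.fastq*' \
                 ! -path "$dir/nextflow/*")
    [ -n "$found" ] || { echo "missing: project folder with FASTQ files"; missing=1; }

    return "$missing"
}
```

Calling `check_demux_dir /path/to/demultiplexing_output` prints one line per missing element and returns a non-zero status if anything is absent.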

Examples of the params.config and fastqScreen configuration files are available in the assets folder.

A basic command line to launch the pipeline (from the nextflow folder) is:

sbatch -J nf-illumina_BHNKY7DRX2_1 -p wflowq -t 3-00 --mem 5GB --wrap="module load bioinfo/Nextflow-v21.04.1; cd /home/sbsuser/work/data/NovaSeq/230116_A00318_0372_BHNKY7DRX2_Lane1_1673933427_10x/nextflow; nextflow run /work/sbsuser/test/jules/VisualStudioSources/wf-illumina-nf/main.nf -profile prod -ansi-log false -params-file ../params.yml"
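
Since only the job name, the run directory and the path to main.nf change from one run to the next, the command can be assembled from those parts and printed for review before submission. This is a sketch under stated assumptions: the function `build_sbatch_cmd` and its argument order are illustrative, and every option is copied from the example above rather than taken from the pipeline's code.

```shell
#!/usr/bin/env bash
# Sketch: print the sbatch command for a given run without submitting it.
# The function name and its arguments are illustrative only.
build_sbatch_cmd() {
    local job_name="$1"                      # e.g. nf-illumina_BHNKY7DRX2_1
    local run_dir="$2"                       # demultiplexing output directory
    local main_nf="$3"                       # path to the pipeline's main.nf
    local params_file="${4:-../params.yml}"  # relative to the nextflow folder

    # Commands executed by the job, mirroring the example command line.
    local wrap="module load bioinfo/Nextflow-v21.04.1; cd ${run_dir}/nextflow; nextflow run ${main_nf} -profile prod -ansi-log false -params-file ${params_file}"

    printf 'sbatch -J %s -p wflowq -t 3-00 --mem 5GB --wrap="%s"\n' "$job_name" "$wrap"
}
```

The printed line can be inspected and then pasted into the terminal to submit the job.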

The YAML parameter file must look like this:

inputdir: "/home/sbsuser/work/Nextflow/wf-illumina-nf/data_test/NovaSeq/211129_A00318_0259_AHNMTTDSX2_Lane1_1638345606_dna"
project: 'GwOAK_small'
is_multiplex: true
data_nature: "DNA"
pairedEnd: true
reference_genome: "/save/ng6/TODO/HiSeqIndexedGenomes/new_struct/Quercus_robur/genome/GCA_900291515.1/BWA/GCA_900291515.1_Q_robur_v1_genomic.fna"
addBankForConta: ""
run_name: "ContaComparison"
sequencer: "NovaSeq"
run_date: "2022"
machine_id: "NOVA"
fc_id: "HNMTTDSX2"
lane: "1"
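
Before launching, it can help to confirm that a parameter file defines every key shown in the example. The sketch below assumes that all of these keys are expected (the README does not say which are optional) and only tests that each key is present at the start of a line; it does not validate the values. The function name `check_params_keys` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check that a YAML parameter file defines each top-level key
# used in the example. Whether every key is mandatory is an assumption.
check_params_keys() {
    local params_file="$1"
    local key
    local missing=0

    for key in inputdir project is_multiplex data_nature pairedEnd \
               reference_genome addBankForConta run_name sequencer \
               run_date machine_id fc_id lane; do
        # A key counts as present if a line starts with "<key>:".
        grep -Eq "^${key}:" "$params_file" || { echo "missing key: ${key}"; missing=1; }
    done

    return "$missing"
}
```

For example, `check_params_keys ../params.yml` returns 0 when all keys are found and otherwise lists the missing ones.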

NB: for the moment, lanes containing several projects are not supported!