# <A HREF="https://forgemia.inra.fr/asm4pg/GenomAsm4pg"> asm4pg </A>
This is an automatic and reproducible genome assembly workflow for pangenomic applications using PacBio HiFi data.

This workflow uses [Snakemake](https://snakemake.readthedocs.io/en/stable/) to quickly assemble genomes and produces an HTML report summarizing the resulting assembly statistics.

![workflow DAG](doc/dag.svg)

## Repo directory structure

```
├── README.md
├── job.sh
├── local_run.sh
├── doc
├── workflow
│   ├── scripts
│   └── Snakefile
└── .config
    ├── snakemake_profile
    └── masterconfig.yaml
```

## Requirements
Miniforge (Snakemake), Singularity/Apptainer
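
Before going further, it can help to confirm that these tools are reachable from your shell. The snippet below is a minimal sketch, not part of the repository: the helper names `have_any` and `check_requirements` are hypothetical, and it assumes that Miniforge exposes a `conda` or `mamba` command and that the container runtime is available as `singularity` or `apptainer`. On an HPC you may need to load the corresponding modules first.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the asm4pg repository.

# Return 0 if at least one of the given commands is found on PATH.
have_any() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      return 0
    fi
  done
  return 1
}

# Report each missing requirement; return 1 if anything is missing.
check_requirements() {
  local missing=0
  have_any conda mamba || { echo "missing: conda or mamba (Miniforge)"; missing=1; }
  have_any singularity apptainer || { echo "missing: singularity or apptainer"; missing=1; }
  return "$missing"
}

check_requirements || echo "Install or load the missing tools before running the workflow."
```
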
## How to Use
### 1. Set up
Clone the Git repository
```bash
git clone https://forgemia.inra.fr/asm4pg/GenomAsm4pg.git && cd GenomAsm4pg
```
> All other tools run in Singularity/Apptainer images that Snakemake downloads automatically. The images total about 5.5 GB.
### 2. Configure the pipeline
- Edit the `masterconfig.yaml` file in the `.config/` directory with your sample information.

### 3. Run the workflow 
#### <ins>A. On an HPC</ins>
- Edit `job.sh` with the paths to the `Singularity/Apptainer` and `Miniforge` modules
- Provide an environment with `Snakemake` and `snakemake-executor-plugin-slurm` in `job.sh`, under `source activate wf_env`. You can create it like this:
```bash
conda create -n wf_env -c conda-forge -c bioconda snakemake=8.4.7 snakemake-executor-plugin-slurm
```  
> Use Miniforge with the conda-forge channel; see why [here](https://science-ouverte.inrae.fr/fr/offre-service/fiches-pratiques-et-recommandations/quelles-alternatives-aux-fonctionnalites-payantes-danaconda) (in French)
- Create the log directory for SLURM
```bash
mkdir slurm_logs
```
- Run the workflow:
```bash
sbatch job.sh dry # Dry run: check for warnings
sbatch job.sh run # Then launch the actual run
```
> **Note:** If your account name can't be automatically determined, add it in the `.config/snakemake/profiles/slurm/config.yaml` file.
#### <ins>B. Locally</ins>
- Make sure you have Snakemake and Singularity/Apptainer installed
- Run the workflow:
```bash
./local_run.sh dry # Dry run: check for warnings
./local_run.sh run # Then launch the actual run
```
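If you prefer to chain the two local steps, the sketch below starts the real run only when the dry run succeeds. It rests on an assumption that the source does not confirm: that the run script exits with a non-zero status when the dry run fails. The function name `run_after_dry_run` is hypothetical, and warnings printed by a dry run that still exits successfully will not stop the chain, so reading the dry-run output remains useful.

```bash
# Hypothetical helper: run "<runner> dry", then "<runner> run" only on success.
# Assumes the runner returns a non-zero exit status when the dry run fails.
run_after_dry_run() {
  local runner="$1"
  if "$runner" dry; then
    "$runner" run
  else
    echo "Dry run failed; fix the reported problems before running." >&2
    return 1
  fi
}
```

For a local run, this would be called as `run_after_dry_run ./local_run.sh`. It does not apply to `sbatch job.sh`, since `sbatch` returns as soon as the job is queued.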
## Input Conversion
Currently, asm4pg requires `fasta.gz` files. To convert your `fastq` or `bam` files to this format, you can use the following script:
```bash
./workflow/scripts/input_conversion.sh -i <input_file> -o <output_file>
```
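
When several files need converting, the command above can be wrapped in a loop. This is a minimal sketch under stated assumptions: the helpers `output_name` and `convert_all` are hypothetical, the script is assumed to accept one input and one output per call as shown above, and inputs are assumed to end in `.fastq` or `.bam`.

```bash
# Hypothetical helper: derive the .fasta.gz output name for an input file.
output_name() {
  local base
  base="$(basename "$1")"
  base="${base%.*}"            # drop the extension (.fastq or .bam)
  printf '%s.fasta.gz\n' "$base"
}

# Hypothetical helper: convert every .fastq and .bam file found in a directory.
convert_all() {
  local in_dir="$1" out_dir="$2" f
  mkdir -p "$out_dir"
  for f in "$in_dir"/*.fastq "$in_dir"/*.bam; do
    [ -e "$f" ] || continue    # skip patterns that matched nothing
    ./workflow/scripts/input_conversion.sh -i "$f" -o "$out_dir/$(output_name "$f")"
  done
}
```

Run from the repository root, `convert_all raw_reads converted` would write one `fasta.gz` file per input into `converted/` (both directory names are placeholders).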

## Using the full potential of the workflow
Asm4pg has many options. To change the default values or learn more about the workflow, please refer to the [documentation](doc/documentation.md).

## How to cite asm4pg?

We are currently writing a publication about asm4pg. Meanwhile, if you use the pipeline, please cite it using the address of this repository. 

## License
The content of this repository is licensed under the <A HREF="https://choosealicense.com/licenses/gpl-3.0/">GNU GPLv3</A>.

## Contacts
For any troubleshooting, issue or feature suggestion, please use the issue tab of this repository.
For any other question or if you want to help in developing asm4pg, please contact Ludovic Duvaux at ludovic.duvaux@inrae.fr