File              Date        Author          Commit
references        2021-07-13  Hongshan JIANG  [a21244] Initial commit
scripts           2021-08-23  Hongshan JIANG  [2cba34] change the way of quality control
README.md         2021-07-14  Hongshan JIANG  [7160a0] corrected a typo in README.md
fungae-caiq.conf  2021-08-23  Hongshan JIANG  [2cba34] change the way of quality control
fungae-caiq.py    2021-08-23  Hongshan JIANG  [2cba34] change the way of quality control
requirements      2021-07-13  Hongshan JIANG  [a21244] Initial commit

Read Me

FUNGAE-CAIQ: an integrated pipeline based on MaSuRCA and BUSCO for fungal genome assembly and evaluation

prerequisites

installation

  1. install the prerequisites: skewer, MaSuRCA, and BUSCO
  2. install the prerequisite Python packages
 $ cd fungae-caiq
 $ pip install -r requirements
  3. go to the target directory, e.g. ~/tools/, download the compressed file, and extract it
 $ cd ~/tools
 $ wget -c https://sourceforge.net/projects/fungae-caiq/files/fungae-caiq-v1.0.tar.gz
 $ tar zxvf fungae-caiq-v1.0.tar.gz
  4. download the BUSCO databases
 $ cd ~/tools/fungae-caiq/references/busco/busco_downloads
 $ ./download_and_extract.sh
  5. manually check and revise the paths in "fungae-caiq/references/busco/config/config.ini.template"
  6. set the environment: add "export PATH=~/tools/fungae-caiq:$PATH" to ~/.bashrc (single quotes keep $PATH from being expanded when the line is written)
 $ echo 'export PATH=~/tools/fungae-caiq:$PATH' >> ~/.bashrc
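
After finishing the steps above, it can help to confirm that the prerequisites are reachable from the shell. The following is a minimal sketch, not part of the pipeline: the helper name check_tools is hypothetical, and the executable names (skewer, masurca, busco) are assumptions about how the tools are installed on your system, so adjust them as needed.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named executable is on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; edit them to match your installation.
check_tools skewer masurca busco fungae-caiq.py \
    || echo "some prerequisites are not on PATH"
```

If fungae-caiq.py is reported as missing, open a new shell or run "source ~/.bashrc" so that the PATH change from the last step takes effect.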

structure

fungae-caiq/
├── fungae-caiq.conf  Configuration file for FUNGAE-CAIQ
├── fungae-caiq.py  Main program for the FUNGAE-CAIQ pipeline
├── README.md  This file
├── references  Reference databases for PhiX and BUSCO
│   ├── busco  Reference files for BUSCO
│   │   ├── busco_downloads  Download directory for BUSCO
│   │   │   ├── lineages
│   │   │   │   ├── bacteria_odb10
│   │   │   │   ├── eukaryota_odb10
│   │   │   │   └── fungi_odb10
│   │   │   └── download_and_extract.sh  Shell script for downloading lineages
│   │   └── config  Configuration files for BUSCO
│   │       └── config.ini.template  Template file for BUSCO configuration
│   └── phix  Reference files for PhiX
│       ├── phix174.fasta
│       ├── phix.amb
│       ├── phix.ann
│       ├── phix.bwt
│       ├── phix.pac
│       └── phix.sa
├── requirements  Requirements for prerequisite Python packages
└── scripts  Dependent scripts included in this pipeline
    ├── busco.sh  Script invoking BUSCO
    ├── Command.py  Command class
    ├── Message.py  Message class
    ├── n50.pl  Perl script for computing N50 statistics
    └── removePhiXreads.sh  Script for removing PhiX contaminants from reads

Usage

type "fungae-caiq.py -h" for details
usage: fungae-caiq.py [-h] -s ILN_PE [ILN_PE ...]
                      -l ONT_data [ONT_data ...]
                      [--in INS,VAR [INS,VAR ...]]
                      [--conf CONF] [-o OUTPATH]
                      [-p PREFIX] [-t NUM_THREADS]
                      [-g GENOME_SIZE]
                      [-n LINEAGE]

hybrid assembly pipeline using MaSuRCA

optional arguments:
  -h, --help            show this help message and exit
  -s ILN_PE [ILN_PE ...]
                        Illumina paired-end reads files
  -l ONT_data [ONT_data ...]
                        ONT long reads files
  --in INS,VAR [INS,VAR ...]
                        mean and variance of insert sizes (default:
                        2*readlen,100)
  --conf CONF           designated configuration file (default:
                        /share/tools/fungae-caiq/fungae-caiq.conf)
  -o OUTPATH            Output path (default: '.')
  -p PREFIX             Prefix of files (default: 'sample')
  -t NUM_THREADS        number of threads (default: 12)
  -g GENOME_SIZE        estimated genome size (default: 45,000,000)
  -n LINEAGE            Lineage (fungi|eukaryota|bacteria) (default: fungi)
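
As an illustration of the options above, the following sketch assembles a typical invocation and prints it for review instead of running it. The read file names and the output directory "results" are hypothetical placeholders; only the flags come from the help text. Options left out here (--in, --conf, -g) fall back to their defaults.

```shell
#!/bin/sh
# Hypothetical input files; replace them with your own reads.
r1="sample_R1.fastq.gz"     # Illumina paired-end reads, mate 1
r2="sample_R2.fastq.gz"     # Illumina paired-end reads, mate 2
ont="sample_ont.fastq.gz"   # ONT long reads

# Store the command as positional parameters so each argument stays
# a separate word, even if a path contains spaces.
set -- fungae-caiq.py -s "$r1" "$r2" -l "$ont" \
       -o results -p sample -t 12 -n fungi

# Dry run: print the command. Replace echo "$@" with "$@" to execute it.
echo "$@"
```

Printing the command first makes it easy to check the paths and options before starting a long assembly run.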
