--------------------------
| PRE-REQUIREMENTS |
--------------------------
- bash (>=4.1.2)
- GNU coreutils (>=8.4)
- GNU sed (>=4.2.1)
- GNU make (>=3.81)
- GNU which (>=2.19)
- GNU Awk (>=3.1.7)
- g++ supporting c++14 standard (>=4.9.3)
- flex (>=2.5.35)
- bison (>=2.5)
- trimmomatic (>=0.32)
- bowtie (==1.*)
- bowtie2 (==2.*, recommended 2.0.2)
- samtools (>=1.6)
- quast (>=4.5)
- seqan (>=2.3.2)
- java-runtime-environment (>=1.7.0)
- nvidia-cuda-toolkit (>=8.0.61)
- GNU tar (>=1.23)
Additionally, one of the following is required to perform scaffolding:
- soapdenovo (>=2.04)
- sspace (==Basic 2.0)
The TRIMMOMATIC_PATH environment variable should be set to the Trimmomatic root directory
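For example, the variable can be set in your shell profile as follows (the path below is a hypothetical install location; use your own):

```shell
# Point grasshopper at the Trimmomatic root directory.
# /opt/Trimmomatic-0.36 is only an example path.
export TRIMMOMATIC_PATH=/opt/Trimmomatic-0.36
```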
--------------------------
| COMPILATION |
--------------------------
run: ./grasshopper compile
--------------------------
| INSTALLATION |
--------------------------
As a non-sudo user, run: ./grasshopper install
NOTE: make sure that ${HOME}/bin is in your ${PATH}
As a sudo user, run: sudo ./grasshopper install
NOTE: make sure that /usr/local/bin is in the users' ${PATH}
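A quick way to verify the install prefix is on your PATH is a sketch like the one below (shown for the non-sudo case; substitute /usr/local/bin for the system-wide install):

```shell
# Check whether ${HOME}/bin is already on PATH; if not, suggest the fix.
case ":${PATH}:" in
  *":${HOME}/bin:"*) echo "OK: ${HOME}/bin is on PATH" ;;
  *) echo "Missing; add it, e.g.: export PATH=\"\${HOME}/bin:\${PATH}\"" ;;
esac
```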
--------------------------
| USAGE |
--------------------------
Grasshopper consists of six steps: preprocess, build, traverse,
correct, trim and scaffold. To run it use:
grasshopper <step-name> [params...]
The first step, namely preprocess, creates the dataset: the directory
in which all the files of a given run are stored.
All the other steps take it as their only non-optional parameter.
The simplest grasshopper usage can look as follows:
grasshopper preprocess foo-1.fastq foo-2.fastq
grasshopper build foo
grasshopper traverse foo
grasshopper correct foo
grasshopper trim foo
grasshopper scaffold foo
For the example above, the dataset name is extracted from the names of the
read files. Unless the user specifies it explicitly with the -ds option,
the dataset is stored at:
${HOME}/grasshopper-data/<dataset-name>
Each step adds further information to the dataset, so the steps cannot be run
in an arbitrary order. However, you can run a given step again, e.g. with
another set of parameters, without re-running each previous step.
If you plan to do so, remember that the files in the dataset will be
overwritten, so back them up if you don't wish to lose them.
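Re-running a step with different parameters might then look like the sketch below, with a simple backup taken first (the dataset name foo and the -ws value are illustrative):

```shell
# Back up the dataset before re-running a step with new parameters.
cp -r "${HOME}/grasshopper-data/foo" "${HOME}/grasshopper-data/foo.bak"
grasshopper build foo -ws=800
```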
preprocess
-ds=<dataset> -- dataset name/path (default: name of the fasta/fastq files)
-sg=<similar-genome> -- (optional) genome file in fasta format to filter reads
-trimpath=<path-to-trimmomatic> -- alternative to TRIMMOMATIC_PATH environment variable
-trimparams=<trimmomatic-params> -- to alter the default Trimmomatic parameters
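Putting the preprocess options together, a run with an explicit dataset path and a similar-genome filter might look as follows (all file names and paths are placeholders):

```shell
grasshopper preprocess foo-1.fastq foo-2.fastq \
    -ds="${HOME}/assemblies/foo" \
    -sg=similar-genome.fasta
```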
build
-ws=<window-size> -- sets window size (default: 600)
-sc=<score-cutoff> -- sets score cutoff (default: 50)
-e=<allowed-errors> -- sets tolerance on errors between two reads (default: 0)
-kmer=<k-mer-size> -- sets size of k-mer to compute characteristics (default: 6)
-pc=<characteristics-count> -- count of partial characteristics (default: 3)
-ps=<characteristics-size> -- the size of a single partial characteristic (default: 50)
-awa=<TRUE/FALSE> -- enables an enhancement that drastically increases the search set of promising pairs to be verified (may be time-consuming!) (default: FALSE)
-sli=<size> -- the size of the shortest lexicographical index sequence (default: 20)
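A build invocation overriding a few of the defaults above could be sketched as (the dataset name foo and the option values are illustrative only):

```shell
# Larger window, higher score cutoff, non-default k-mer size.
grasshopper build foo -ws=800 -sc=60 -kmer=8
```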
traverse
-fs=<forks-sensitivity> -- sets the sensitivity of the forks detector (default: 6)
correct
-minconf=<value> -- sets the tolerance on distant paired-end reads (default: contigs_depth/7)
-maxrefs=<value> -- sets the maximum number of distant paired-end reads; once it is exceeded, a new cut spot is considered (default: contigs_depth/7)
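The traverse and correct steps can be tuned in the same way (the dataset name foo and the threshold values below are illustrative, not recommendations):

```shell
# Raise fork-detector sensitivity, then correct with explicit thresholds.
grasshopper traverse foo -fs=8
grasshopper correct foo -minconf=10 -maxrefs=10
```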
trim
<no parameters>
scaffold
-m=<sspace/soap2> -- chooses the scaffolding tool (default: sspace)
-is=<insert-size> -- sets the insert size of the original paired-end reads file
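For example, scaffolding with SOAPdenovo2 instead of the default SSPACE might look like this (dataset name foo and the insert size are illustrative):

```shell
# Use soap2 as the scaffolder and declare the library insert size.
grasshopper scaffold foo -m=soap2 -is=500
```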