Name              Modified     Size
SInC_readme.txt   2014-08-22   2.2 kB
SInC_readGen.c    2014-01-20   14.8 kB
SInC_simulate.c   2014-01-20   57.1 kB
genProfile.c      2014-01-20   3.2 kB
Totals: 4 items, 77.3 kB

Citation:
SInC: An accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data.
Swetansu Pattnaik, Saurabh Gupta, Arjun A Rao and Binay Panda, 2014

How to compile:
-> gcc -o genProfile genProfile.c
-> gcc -o SInC_simulate SInC_simulate.c -lm -lgsl -lgslcblas
-> gcc -o SInC_readGen -O2 SInC_readGen.c -lgsl -lgslcblas -lpthread

SInC has 3 modules:

Module 1: Quality profile generation
Run "./genProfile" to generate a quality profile from your input file.

Usage:
	./genProfile -R <read tag(1 for R1, 2 for R2)> -l <read length> <input.txt>

Example:
./genProfile -R 1 -l 100 input.txt

-> -R 1         means generate the profile for R1 (run again with -R 2 for R2)
-> -l 100       means read length 100
-> input.txt    a file listing the fastq files (1 file per line) to be used for error profile generation
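
A minimal sketch of preparing the list file, assuming the R1 and R2 fastq files end in _R1.fastq and _R2.fastq (a hypothetical naming convention; adjust the pattern to your data). The function name make_fastq_list is illustrative and is not part of SInC.

```shell
#!/bin/sh
# make_fastq_list DIR TAG: print every file under DIR whose name ends in
# _R<TAG>.fastq, one path per line, sorted.
# ASSUMPTION: the _R1/_R2 suffix is a naming convention chosen for this
# example; SInC itself does not require it.
make_fastq_list() {
    dir=$1
    tag=$2
    find "$dir" -type f -name "*_R${tag}.fastq" | sort
}

# Usage (commented out so that sourcing this snippet has no side effects):
# make_fastq_list /path/to/fastq 1 > input.txt
# ./genProfile -R 1 -l 100 input.txt
```

Building one list per read tag keeps the R1 and R2 profiles separate, matching the -R option.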

NOTE:
1. Currently genProfile can process only Phred+33 fastq files.
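
Before running genProfile it can help to confirm that a file looks like Phred+33. The sketch below is a heuristic, not part of SInC: it assumes 4-line fastq records and reports success if any quality line contains a character in ASCII 33 to 58, which cannot occur in Phred+64 or Solexa+64 encodings. The function name looks_like_phred33 is hypothetical.

```shell
#!/bin/sh
# looks_like_phred33 FILE: succeed if any quality line (every 4th line of a
# 4-line-per-record fastq) contains a character in ASCII 33..58 ('!' to ':').
# Heuristic only: a Phred+33 file whose qualities are all high (ASCII 59 and
# above) is not detected and the function returns non-zero for it.
looks_like_phred33() {
    LC_ALL=C awk 'NR % 4 == 0 && /[!-:]/ { found = 1; exit }
                  END { exit(found ? 0 : 1) }' "$1"
}

# Usage (commented out so that sourcing this snippet has no side effects):
# looks_like_phred33 sample_R1.fastq && echo "sample_R1.fastq: Phred+33"
```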

Module 2: Simulation of SNPs, INDELs, CNVs
Run "./SInC_simulate" to simulate SNPs, INDELs, CNVs.

Usage:
	./SInC_simulate [options] <in.ref.fa>

Example:
./SInC_simulate -S 0.002 -I 0.0001 -p 2 -l 1000 -u 150000 -t 2 <in.ref.fa>

-> -S 0.002     means SNPs are incorporated in the reference at 0.002%
-> -I 0.0001    means INDELs are incorporated in the reference at 0.0001%
-> -p 2         means CNVs are incorporated in the reference at 2%
-> -l 1000      means the minimum CNV size is 1000
-> -u 150000    means the maximum CNV size is 150000
-> -t 2         means the transition/transversion (ti/tv) ratio is 2

NOTE:
1.	SInC generates a separate fasta file for each of the two alleles, so run the read generator on each file separately.
2.	The minimum evolutionary SNP rate is set to 0.0033.
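
To get a feel for what the -S and -I percentages mean for a given reference, the sketch below counts the bases in a fasta file and converts a percentage into an expected number of events. It ASSUMES the percentage is relative to the reference length in bases, which this readme does not state, so treat the result as a rough estimate only. The function names ref_length and expected_events are hypothetical and not part of SInC.

```shell
#!/bin/sh
# ref_length FASTA: count sequence characters, skipping '>' header lines.
ref_length() {
    awk '!/^>/ { n += length($0) } END { printf "%.0f\n", n }' "$1"
}

# expected_events LENGTH PERCENT: LENGTH * PERCENT / 100, rounded.
# ASSUMPTION: the percentage is taken relative to the reference length in
# bases; the readme does not define the denominator.
expected_events() {
    awk -v len="$1" -v pct="$2" 'BEGIN { printf "%.0f\n", len * pct / 100 }'
}

# Usage (commented out so that sourcing this snippet has no side effects):
# expected_events "$(ref_length in.ref.fa)" 0.002
```

Under this assumption, a reference of 1000000 bases with -S 0.002 would correspond to about 20 SNPs.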

Module 3: Read generation
Run "./SInC_readGen" on each of the two fasta files generated in Module 2.
Usage:
	./SInC_readGen [options] <in.ref.fa> <read_1_profile.txt> <read_2_profile.txt>

Example: desired total coverage 10X (5X from each allele)
./SInC_readGen -C 5 -T 1 -R 100 chr22_allele_1.fa 100_bp_read1_profile.txt 100_bp_read2_profile.txt
./SInC_readGen -C 5 -T 1 -R 100 chr22_allele_2.fa 100_bp_read1_profile.txt 100_bp_read2_profile.txt

-> -C 5		means fold coverage for chr22_allele_1.fa is 5
-> -T 1		means use 1 core
-> -R 100	means read length 100
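
The coverage split in the example can be sketched as a small helper: halve the desired total coverage and print one SInC_readGen command per allele. The file names are taken from the example above; the function names per_allele_coverage and readgen_cmds are hypothetical. Whether -C accepts non-integer values is not stated in this readme, so an even total coverage is the safe choice.

```shell
#!/bin/sh
# per_allele_coverage TOTAL: half of the desired total fold coverage,
# because reads are generated separately from each of the two allele files.
per_allele_coverage() {
    awk -v total="$1" 'BEGIN { printf "%g\n", total / 2 }'
}

# readgen_cmds TOTAL: print (not run) one SInC_readGen command per allele,
# reusing the file names, core count and read length from the example.
readgen_cmds() {
    cov=$(per_allele_coverage "$1")
    for allele in 1 2; do
        echo "./SInC_readGen -C $cov -T 1 -R 100 chr22_allele_${allele}.fa 100_bp_read1_profile.txt 100_bp_read2_profile.txt"
    done
}

# Usage (commented out so that sourcing this snippet has no side effects):
# readgen_cmds 10
```

The commands are only printed so they can be checked before being run.
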
Source: SInC_readme.txt, updated 2014-08-22