Name | Modified | Size | Downloads / Week |
---|---|---|---|
SInC_readme.txt | 2014-08-22 | 2.2 kB | |
SInC_readGen.c | 2014-01-20 | 14.8 kB | |
SInC_simulate.c | 2014-01-20 | 57.1 kB | |
genProfile.c | 2014-01-20 | 3.2 kB | |
Totals: 4 Items | | 77.3 kB | 0 |

Citation:
SInC: An accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data. Swetansu Pattnaik, Saurabh Gupta, Arjun A Rao and Binay Panda, 2014.

How to compile:
-> gcc -o genProfile genProfile.c
-> gcc -o SInC_simulate SInC_simulate.c -lm -lgsl -lgslcblas
-> gcc -o SInC_readGen -O2 SInC_readGen.c -lgsl -lgslcblas -lpthread

SInC has 3 modules:

Module 1: Quality profile generation
Run "./genProfile" to generate a quality profile from your input file.
Usage: ./genProfile -R <read tag (1 for R1, 2 for R2)> -l <read length> <input.txt>
Example: ./genProfile -R 1 -l 100 input.txt
-> -R 1 generates the profile for R1 (run again with -R 2 for R2)
-> -l 100 sets the read length to 100
-> input.txt lists the fastq files (1 file per line) to be used for error profile generation
NOTE:
1. Currently genProfile can process only Phred+33 fastq files.

Module 2: Simulation of SNPs, INDELs, CNVs
Run "./SInC_simulate" to simulate SNPs, INDELs and CNVs.
Usage: ./SInC_simulate [options] <in.ref.fa>
Example: ./SInC_simulate -S 0.002 -I 0.0001 -p 2 -l 1000 -u 150000 -t 2 <in.ref.fa>
-> -S 0.002 means 0.002% of SNPs to be incorporated in the reference
-> -I 0.0001 means 0.0001% of INDELs to be incorporated in the reference
-> -p 2 means 2% of CNVs to be incorporated in the reference
-> -l 1000 sets the minimum CNV size to 1000
-> -u 150000 sets the maximum CNV size to 150000
-> -t 2 sets the ti/tv ratio to 2
NOTE:
1. SInC generates a fasta file for each of the two alleles, so run the read generator on both files separately.
2. Minimum evolutionary SNP rate is set to 0.0033.

Module 3: Read generation
Run "./SInC_readGen" on both fasta files generated by Module 2.
Usage: ./SInC_readGen [options] <in.ref.fa> <read_1_profile.txt> <read_2_profile.txt>
Example (desired total coverage 10X, so 5X per allele):
./SInC_readGen -C 5 -T 1 -R 100 chr22_allele_1.fa 100_bp_read1_profile.txt 100_bp_read2_profile.txt
./SInC_readGen -C 5 -T 1 -R 100 chr22_allele_2.fa 100_bp_read1_profile.txt 100_bp_read2_profile.txt
-> -C 5 sets the fold coverage for each allele fasta file to 5
-> -T 1 means use 1 core
-> -R 100 sets the read length to 100
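
The read generation example splits the desired 10X coverage evenly across the two allele fasta files (-C 5 each). A minimal shell sketch of that arithmetic follows. It only prints the two commands and does not run them. The helper name readgen_cmds is hypothetical, and the <prefix>_allele_1.fa / <prefix>_allele_2.fa naming is an assumption taken from the chr22 example above, so adjust it to whatever SInC_simulate actually writes. Only even integer totals are handled, because this readme does not say whether -C accepts fractional values.

```shell
#!/bin/sh
# Sketch only: print the two SInC_readGen commands for a total target coverage.
# Assumption: total coverage is split evenly between the two allele fasta files,
# as in the 10X example above (-C 5 per allele).

# readgen_cmds TOTAL_COV CORES READ_LEN PREFIX R1_PROFILE R2_PROFILE
# (hypothetical helper; file naming follows the chr22 example, not a documented rule)
readgen_cmds() {
    total=$1; cores=$2; readlen=$3; prefix=$4; p1=$5; p2=$6

    # Reject odd totals: whether -C accepts fractions is not stated in the readme.
    if [ $((total % 2)) -ne 0 ]; then
        echo "total coverage must be an even integer" >&2
        return 1
    fi

    per_allele=$((total / 2))

    # One command per allele, using only the -C, -T and -R options shown above.
    for allele in 1 2; do
        echo "./SInC_readGen -C $per_allele -T $cores -R $readlen ${prefix}_allele_${allele}.fa $p1 $p2"
    done
}

readgen_cmds 10 1 100 chr22 100_bp_read1_profile.txt 100_bp_read2_profile.txt
```

Once the printed commands look right, they can be executed by piping the output to sh.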