Name | Modified | Size | Downloads / Week |
---|---|---|---|
run_SVseq2_example.sh | 2014-07-10 | 155 Bytes | |
Clever_map_genotyping_illumina_trio_data.sh | 2014-07-10 | 1.0 kB | |
run_pindel.sh | 2014-07-10 | 294 Bytes | |
GenomeStrip_3_genotyping.sh | 2014-07-10 | 1.9 kB | |
GenomeStrip_1_preprocess.sh | 2014-07-10 | 2.4 kB | |
1. Feature collection

1.1 Option description

Usage: ./GINDEL [options]

Required options:
    -D/I        call genotypes of deletions/insertions
    -r FILE     reference file (indexed)
    -i FILE     list of input BAM files (indexed and sorted)
    -p FILE     deletion/insertion positions (sorted and non-overlapping) in VCF format
    -o FILE     output file name
    -l INT      read length

Optional options:
    -s INT      slack value for split position (default: 15)
    -b          .bas file is provided
    -m DOUBLE   mean insert size
    -v DOUBLE   standard deviation of insert size
    -g FILE     genotype file in VCF format for training data
    -h          help

1.2 Example

For deletion:

(1) Collect features with a given constant insert size:
    ./GINDEL -D -r ./human_g1k_v37.fasta -i ./simulated_data_input.6.4.list -p ./simulation_del_sites.vcf -o test_sim.txt -l 100 -m 400 -v 50

(2) Collect features with the insert size contained in a .bas file:
    ./GINDEL -D -r ./human_g1k_v37.fasta -i ./simulated_data_input.6.4.list -p ./simulation_del_sites.vcf -o test_sim.txt -l 100 -b

(3) Collect features without insert size:
    ./GINDEL -D -r ./human_g1k_v37.fasta -i ./simulated_data_input.6.4.list -p ./simulation_del_sites.vcf -o test_sim.txt -l 100

Get training data:
    ./GINDEL -D -r ./human_g1k_v37.fasta -i ./simulated_data_input.6.4.list -p ./simulation_del_sites.vcf -o test_sim.txt -l 100 -g ./genotype.vcf

For insertion:
    ./GINDEL -I -r ./human_g1k_v37.fasta -i ./simulated_data_input.6.4.list -p ./simulation_ins_sites.vcf -o test_sim.txt -l 100

2. Training and prediction

2.1 Training
    python easy.py training_data

2.2 Predicting
    ..\windows\svm-predict data.scale Trained.model predictResult

Contact: chong.chu@engr.uconn.edu
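
The feature-collection invocations above differ only in a few flags, so a small wrapper can make the variants easier to script. The following is a minimal sketch, not part of GINDEL itself: the function name build_gindel_command and its parameter names are hypothetical, and it only assembles the flags documented in the option description (-D/-I, -r, -i, -p, -o, -l, -s, -b, -m, -v, -g). It does not check how GINDEL treats combinations of flags.

```python
import shlex


def build_gindel_command(mode, reference, bam_list, positions, output,
                         read_length, mean_insert=None, std_insert=None,
                         use_bas=False, genotype_vcf=None, slack=None,
                         executable="./GINDEL"):
    """Assemble a GINDEL argument list (hypothetical helper, not part of GINDEL).

    mode is "D" for deletions or "I" for insertions; the remaining
    arguments map one-to-one onto the flags listed in the README.
    """
    if mode not in ("D", "I"):
        raise ValueError("mode must be 'D' (deletion) or 'I' (insertion)")

    # Required options, in the same order as the README examples.
    cmd = [executable, "-" + mode,
           "-r", reference,
           "-i", bam_list,
           "-p", positions,
           "-o", output,
           "-l", str(read_length)]

    # Optional options are appended only when supplied.
    if slack is not None:
        cmd += ["-s", str(slack)]
    if use_bas:
        cmd.append("-b")
    if mean_insert is not None:
        cmd += ["-m", str(mean_insert)]
    if std_insert is not None:
        cmd += ["-v", str(std_insert)]
    if genotype_vcf is not None:
        cmd += ["-g", genotype_vcf]
    return cmd


if __name__ == "__main__":
    # Reproduces deletion example (1): constant insert size.
    cmd = build_gindel_command(
        "D", "./human_g1k_v37.fasta", "./simulated_data_input.6.4.list",
        "./simulation_del_sites.vcf", "test_sim.txt", 100,
        mean_insert=400, std_insert=50)
    print(shlex.join(cmd))  # shlex.join requires Python 3.8+
```

The returned list can be passed directly to subprocess.run(cmd, check=True); building a list instead of a single string avoids shell quoting problems with unusual file names.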