## Best practices with SNPTools

Step 1: EBD calculation

Input: BAM files
Output: EBD files

```
# Submit one pileup job per BAM file listed in $BAM_list.
for i in `cat $BAM_list`; do
    echo "pileup $i" | msub -q analysis -d "$PWD" -V -l nodes=1:ppn=1,mem=10000M
done
```

Step 2: SNP calling

Input: EBD files, reference file, chromosome number
Output: SNP calls in VCF file

```
# Call SNPs for chromosome $Chr against the matching reference FASTA.
echo "varisite $EBD_list $Chr ~/reference/human_b37/$Chr.fa" | msub -q analysis -d "$PWD" -V -l nodes=1:ppn=8
```
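Step 3 notes that the position order in the VCF file should be checked. A minimal sketch of such a check follows, assuming that "position order" means the POS column never decreases, and that the VCF is uncompressed and holds a single chromosome. The helper name `check_vcf_sorted` is hypothetical and not part of SNPTools.

```shell
# Hypothetical helper (not part of SNPTools): succeeds if the POS column
# (column 2) of a single-chromosome, uncompressed VCF never decreases.
check_vcf_sorted() {
    awk -F '\t' '
        BEGIN { bad = 0; prev = 0 }
        /^#/  { next }                      # skip header lines
        ($2 + 0) < prev { bad = 1; exit }   # position went backwards
        { prev = $2 + 0 }
        END   { exit bad }
    ' "$1"
}
```

It could be run as `check_vcf_sorted "$VCF" || echo "positions in $VCF are out of order" >&2` before submitting the Step 3 jobs.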

Step 3: Genotype likelihood calculation

Input: Sample list, BAM files, SNP calls in VCF file
Output: GL in RAW files

Note 1: the following script calculates the genotype likelihoods of each sample using all of that sample's BAM files.
Note 2: remember to check the position order in the VCF file.

```
# For each sample, pass every BAM file whose path contains the sample name to bamodel.
for sample in `cat $Sample_list`; do
    echo "bamodel $sample $VCF $(grep $sample $BAM_list | tr '\n' ' ')" | msub -q analysis -d "$PWD" -V -l nodes=1:ppn=1,pmem=8192M -N $sample
done
```
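Because the loop above selects BAM files with a plain `grep`, a sample name that is a prefix of another (for example `NA1` and `NA10`) would pick up both samples' files. The sketch below is a dry run that prints the `bamodel` command line for each sample instead of submitting it, so the grouping can be inspected first. The function name `bamodel_commands` and the choice of `grep -wF` (whole-word, fixed-string match) are assumptions for illustration, not part of SNPTools.

```shell
# Dry-run sketch: print the bamodel command line for each sample instead of
# submitting it. The function name and the use of `grep -wF` are assumptions.
bamodel_commands() {
    sample_list=$1; vcf=$2; bam_list=$3
    while read -r sample; do
        [ -n "$sample" ] || continue    # skip empty lines
        # -w matches the sample ID as a whole word, so NA1 does not match NA10;
        # -F treats the ID as a fixed string rather than a regular expression.
        bams=$(grep -wF -- "$sample" "$bam_list" | tr '\n' ' ')
        # ${bams% } drops the trailing space left by tr.
        echo "bamodel $sample $vcf ${bams% }"
    done < "$sample_list"
}
```

Calling `bamodel_commands "$Sample_list" "$VCF" "$BAM_list"` prints one line per sample. Once the output looks right, each line could be piped to `msub` as in the loop above.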

Step 4: Combine the GLs of all individuals into one file

Input: RAW files, SNP calls in VCF file
Output: Prob file

```
# Merge the per-sample RAW files listed in $RAW_list into a single Prob file.
echo "poprob $VCF $RAW_list $Prob -b 25600" | msub -q analysis -d "$PWD" -V -l nodes=1:ppn=8,mem=30G
```

Step 5: Divide the Prob file into bins for parallel imputation

Input: Prob file, chromosome number
Output: Bin files

```
# Write the bins for chromosome $Chr into $Bin_directory.
echo "probin $Prob $Chr -f $Bin_directory" | msub -q analysis -d "$PWD" -V
```

Step 6: Imputation

Input: Bin files in list
Output: the imputation result of each bin is written to the same directory as the Bin file

Note: to make use of the multiple CPUs on each node, it is suggested to split the Bin list into several parts and submit the imputation job for each part to a different node.

```
# Submit one imputation job per part.* file; each part lists a subset of the Bin files.
for i in part.*; do
    echo "impute -l $i" | msub -q analysis -d "$PWD" -V -l nodes=1:ppn=4
done
```
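The `part.*` files used in the loop above have to be created beforehand. One way to do this is sketched below with the standard `split` utility. It assumes a plain-text Bin list with one Bin file per line; the helper name `split_bin_list` and the list file name in the usage note are hypothetical.

```shell
# Hypothetical helper: split a list of Bin files into at most N parts named
# part.aa, part.ab, ... in the current directory, matching the part.* glob.
split_bin_list() {
    bin_list=$1; n_parts=$2
    total=$(wc -l < "$bin_list")
    # Ceiling division so that no more than n_parts files are produced.
    per_part=$(( (total + n_parts - 1) / n_parts ))
    [ "$per_part" -gt 0 ] || return 1   # refuse to split an empty list
    split -l "$per_part" "$bin_list" part.
}
```

For example, `split_bin_list bin.list 4` would write up to four part files, one for each node that is to receive an imputation job.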

Step 7: Bind haplotype blocks together in one VCF

Input: Bin directory
Output: genotype/haplotype in VCF

```
# Fuse the imputed bins in $Bin_directory into a single VCF.
echo "hapfuse $VCF $Bin_directory" | msub -q analysis -d "$PWD" -V
```

Appendix: Convert binary GL file to VCF format

Input: Prob file created in Step 4
Output: VCF file of population

```
# in.prob, out.vcf.gz and chr are placeholders for the Prob file, output VCF and chromosome.
echo "prob2vcf in.prob out.vcf.gz chr" | msub -q analysis -d "$PWD" -V
```