Name             Modified    Size    Downloads / Week
exome_filter.sh  2021-02-09  9.7 kB
README           2018-01-15  2.0 kB
ExAC.id.gz       2017-09-11  7.1 MB
Totals: 3 Items              7.1 MB  1
# exome_filter.sh
# Author: Chin-Chen Pan
# Director, General and Surgical Pathology
# Professor, attending pathologist
# Department of Pathology and Laboratory Medicine
# Taipei Veterans General Hospital
# TAIWAN
# Version 2.1.1
# Date: Jan 15, 2017


[Introduction]

A script to filter exome-seq variants against 1000G, ExAC and dbSNP, with a minimum coverage and a minimum T/N ratio.
The script uses the files produced by exome_test.sh.
The filtering process is:
	Variants called by both HaplotypeCaller and VarScan are accepted.
	Variants called by only one of the two callers are accepted only if they exceed the minimum coverage and the minimum T/N ratio.
	All accepted variants are then filtered so that only those not found in 1000G, ExAC and dbSNP remain.
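
The acceptance rules above can be sketched with awk. This is only an illustration: the four-column layout (variant ID, number of callers, coverage, T/N ratio) and the names min_cov and min_ratio are assumptions, not the format produced by exome_test.sh, and the database filtering step is left out.

```shell
# Hypothetical input: id <TAB> number_of_callers <TAB> coverage <TAB> T/N ratio
variants=$(mktemp)
printf '%s\t%s\t%s\t%s\n' \
    chr1-100A-G 2 5  0.05 \
    chr1-200C-T 1 30 0.50 \
    chr1-300G-A 1 10 0.50 \
    chr1-400T-C 1 30 0.10 > "$variants"

min_cov=20      # assumed minimum coverage
min_ratio=0.2   # assumed minimum T/N ratio

# Keep a variant if both callers reported it, or if a single-caller
# variant exceeds both thresholds.
accepted=$(awk -F '\t' -v c="$min_cov" -v r="$min_ratio" \
    '$2 == 2 || ($3 > c && $4 > r) {print $1}' "$variants")
printf '%s\n' "$accepted"
rm -f "$variants"
```

With these made-up rows, only chr1-100A-G (two callers) and chr1-200C-T (one caller, above both thresholds) are printed.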

[Before running]

1. Prepare exome_test.config. The file contains four words on one line. No other words or lines are allowed.

	/path/to/programs /path/to/inputfile /path/to/outputfile thread_number

	ex1: 
	/home/user_name	/media/user_name/disk1/input /home/user_name/output 8

	ex2:
        ~  ~/input ~/output 8
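
A minimal sketch of how a one-line, four-word file like this could be checked and read in shell. The variable names (prog_dir, in_dir, out_dir, threads) and the temporary file are illustrative assumptions, not necessarily what exome_filter.sh does internally.

```shell
# Write an example config with the same layout as ex1 above.
config=$(mktemp)
printf '%s\t%s %s %s\n' /home/user_name /media/user_name/disk1/input \
    /home/user_name/output 8 > "$config"

# The file must hold exactly one line with exactly four words.
lines=$(awk 'END {print NR}' "$config")
fields=$(awk 'NR == 1 {print NF}' "$config")
if [ "$lines" -ne 1 ] || [ "$fields" -ne 4 ]; then
    echo "config must contain four words on one line" >&2
    exit 1
fi

# read splits on spaces and tabs; -r keeps backslashes literal.
read -r prog_dir in_dir out_dir threads < "$config"
echo "programs=$prog_dir input=$in_dir output=$out_dir threads=$threads"
rm -f "$config"
```

Note that read does not expand a leading ~ as used in ex2; how the script itself resolves it is not shown here.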

2. The following file must be placed in /path/to/programs.

	ExAC.id
	   
	ExAC.id is a file listing variants with MAF > 0.01%. It can be produced either from the ANNOVAR file hg19_exac03nontcga.txt (first command below) or from ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz downloaded from the ExAC site (second command below); the former is better.

	cat /path/to/annovar/humandb/hg19_exac03nontcga.txt | sed '1d' | awk -F '\t' '$6>0.0001 {print $1 "-" $2 $4 "-" $5 "\t" $6}' | sed '/e-/d' | sed 's/^/chr/g' | sort -t "`echo -e "\t"`" -u -k1,1 > ExAC.id
	
	zcat ./ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz | sed '/^#/!s/^/chr/g' | sed 's/chrMT/chrM/g' | sed '/AF=.....e-05/d' | awk -F '\t' ' {for (i=1; i<=split($5, a, ","); ++i) print $1 "-" $2 $4 "-" a[i] }' | sort -u > ./ExAC.id
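
To see what key format the first command writes into ExAC.id, the same awk expression can be run on hand-made rows. The two rows below are made-up values in the column layout that command assumes (chromosome, start, end, ref, alt, frequency); they are not real ExAC records.

```shell
# Made-up rows: Chr Start End Ref Alt Frequency (tab-separated)
ids=$(printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
        1 12345 12345 A G 0.0123 \
        1 22222 22222 C T 0.00001 |
    awk -F '\t' '$6>0.0001 {print $1 "-" $2 $4 "-" $5 "\t" $6}' |
    sed 's/^/chr/')
printf '%s\n' "$ids"

# The first column is the variant key.
key=$(printf '%s\n' "$ids" | cut -f1)
printf '%s\n' "$key"
# Only the first row exceeds the 0.0001 cutoff, so key is chr1-12345A-G
```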



[Running]

Syntax:  sh exome_filter.sh sample_name(required) minimum coverage(required) minimum T/N ratio(required)

	options:
	-s: shut down when finished
	-kt: keep temporary files


	ex:
	  sh exome_filter.sh T1 20 0.2 -kt
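
Because all three positional arguments are required, a wrapper can check them before a long run starts. The function below is a hypothetical pre-flight check (the name check_args is an assumption); it is not part of exome_filter.sh.

```shell
# Hypothetical pre-flight check; not part of exome_filter.sh.
check_args() {
    # $1 sample name, $2 minimum coverage, $3 minimum T/N ratio
    [ -n "$1" ] || return 1
    case $2 in
        ''|*[!0-9]*) return 1 ;;          # coverage: digits only
    esac
    case $3 in
        ''|.|*[!0-9.]*|*.*.*) return 1 ;; # ratio: digits with at most one dot
    esac
    return 0
}

if check_args T1 20 0.2; then
    echo "arguments look valid: sh exome_filter.sh T1 20 0.2 -kt"
else
    echo "usage: sh exome_filter.sh sample_name coverage ratio [options]" >&2
fi
```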


Source: README, updated 2018-01-15