| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| TAMeBS_Supplement.pdf | 2014-09-16 | 116.2 kB | |
| TAMeBS_v0.4.tar.gz | 2014-09-08 | 49.4 kB | |
| README_TAMeBS.txt | 2014-09-07 | 3.1 kB | |
| Totals: 3 Items | | 168.7 kB | 0 |
Installation

Uncompress the package:
    tar -xvzf TAMeBS_v0.4.tar.gz
Go to the installation directory:
    cd TAMeBS_v0.4
Install:
    make

Building Index

If 'build' does not appear as an executable file, add execute permission first:
    chmod +x build
To build the index for a reference genome:
    ./build <genome.fa>
    # All chromosomes should be located in one FASTA file.
After building, <genome.fa> will be reformatted into a new file <lin_genome.fa>. The new file is used in the subsequent alignment process.

Alignment

Map BS-reads onto the reformatted genome:
    ./tamebs_approx [para] (-o <Temp_Output_file.tab>) <bs_reads> <lin_genome.fa>
    # The input reads file can be in FASTA or FASTQ format.
    # If <Temp_Output_file.tab> is not specified, a default file named "AlignTmpx.tab" is written.
Reset alignments onto the original genome:
    ./tamebs_merge (<Temp_Output_file.tab>) <FINAL_OUTPUT.tab> <genome.fa>
    # FINAL_OUTPUT.tab is the output file that records the alignment information.

Parameters:
    -k (int)    the maximum number of errors allowed per read in an alignment.
    -B (int)    output alignments whose mapping score is >= B (default: 0).
    -m (int)    score for a match (default: 6).
    -mm (int)   score for a mismatch (default: -18).
    -ub         output only the uniquely best alignments.
    -ab         output any best alignments.
    -al         output all candidate alignments.
    -h          help.

Alignment output file format

The final alignment is written to a tabular file. Each row has the following format:
    <read-id> <#err> <#chr> <Lchr> <strand> <pos_in_G> <read_seq> <aln> <score>
Example:
    100 2 4 1000000 0 2014 AAACTGGGGCGGGGGA mmmVummmsUmmsmmn 38
Explanation:
The 100th read, AAACTGGGGCGGGGGA, is aligned to the reference genome with 2 mismatches. The alignment is located at position 2014 on the +FW strand of chromosome 4, whose total length is 1000000 bp. The alignment achieves a mapping score of 38.
    strand: 0 (+FW), 1 (+RC), 2 (-FW), 3 (-RC).
    pos_in_G: 0-based location.
    aln: m (match), s (mismatch), n (low base-calling quality),
         U (methylated C in CpG), V (methylated C in CHG), W (methylated C in CHH),
         u (unmethylated C in CpG), v (unmethylated C in CHG), w (unmethylated C in CHH).

Methylation Estimation

    ./tamebs_mC <FINAL_OUTPUT.tab>
Output files:
    1. FINAL_OUTPUT_mC_record.tab
    2. Summary_FINAL_OUTPUT.tab
The first file records the methylation level at every cytosine covered by the program. Cytosines are ordered by their positions in the reference genome. The file has the following format:
    <pos_in_G> <strand> <chromosome> <context> <methylation level>
Here the strand has only two cases: '+' (forward) and '-' (reverse complement).
The second file summarizes the methylation information of the sample genome. It contains the total number of Cs analyzed by the program, the total number of methylated Cs, the total methylation percentage in every genomic context, etc.

Execution Example

Reference genome: A_th.fa
Reads file: bsreads.fastq
Run TAMeBS (this assumes the index has already been built with ./build A_th.fa, which produces lin_A_th.fa):
    1. ./tamebs_approx -k 5 bsreads.fastq lin_A_th.fa
    2. ./tamebs_merge TAMeBS_out.tab A_th.fa
    3. ./tamebs_mC TAMeBS_out.tab
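
Reading the alignment output in a script

The row format and the aln symbols described under "Alignment output file format" can be read with a few lines of Python. The following is only a minimal sketch and is not part of the TAMeBS package: the names Alignment, parse_row and methylation_calls are hypothetical helpers, and the code assumes that the nine fields of a row are separated by whitespace (tabs or spaces). It counts methylated and unmethylated cytosine calls within a single read. These per-read counts are not the per-cytosine methylation levels reported by tamebs_mC.

```python
from collections import Counter
from typing import NamedTuple

# Strand codes as listed in the README: 0 (+FW), 1 (+RC), 2 (-FW), 3 (-RC).
STRANDS = {0: "+FW", 1: "+RC", 2: "-FW", 3: "-RC"}

# aln symbols for cytosines: upper case = methylated, lower case = unmethylated.
CONTEXTS = {"U": "CpG", "V": "CHG", "W": "CHH"}


class Alignment(NamedTuple):
    """One row of FINAL_OUTPUT.tab (hypothetical container, not a TAMeBS type)."""
    read_id: str
    n_err: int
    chrom: str
    chrom_len: int
    strand: str
    pos: int          # 0-based position in the genome
    read_seq: str
    aln: str
    score: int


def parse_row(line: str) -> Alignment:
    """Split one output row into its nine fields (assumes whitespace separation)."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    read_id, n_err, chrom, chrom_len, strand, pos, seq, aln, score = fields
    return Alignment(read_id, int(n_err), chrom, int(chrom_len),
                     STRANDS[int(strand)], int(pos), seq, aln, int(score))


def methylation_calls(aln: str) -> dict:
    """Return {context: (methylated, unmethylated)} counts for one aln string."""
    counts = Counter(aln)
    return {ctx: (counts[sym], counts[sym.lower()])
            for sym, ctx in CONTEXTS.items()}


# The example row from the README.
row = parse_row("100 2 4 1000000 0 2014 AAACTGGGGCGGGGGA mmmVummmsUmmsmmn 38")
print(row.strand, row.pos, row.aln.count("s"))
print(methylation_calls(row.aln))
```

For the example row, the number of 's' symbols in aln equals the <#err> field (2), and the read carries one methylated and one unmethylated CpG call plus one methylated CHG call.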