Name                    Modified     Size
TAMeBS_Supplement.pdf   2014-09-16   116.2 kB
TAMeBS_v0.4.tar.gz      2014-09-08   49.4 kB
README_TAMeBS.txt       2014-09-07   3.1 kB

Installation

Uncompress the package:

tar -xvzf TAMeBS_v0.4.tar.gz

Go to installation directory:

cd TAMeBS_v0.4

Install:

make

Building Index

If 'build' does not appear as an executable file, add execute permission first:
chmod +x build

To build the index for the reference genome:

./build <genome.fa>
# All chromosomes should be located in one FASTA file.

After building, <genome.fa> will be reformatted into a new file <lin_genome.fa>.
The new file is used in the subsequent alignment process.

Alignment

Mapping BS-reads onto the reformatted genome:

./tamebs_approx [para] (-o <Temp_Output_file.tab>) <bs_reads> <lin_genome.fa>

# Input reads file can be in FASTA or FASTQ format.
# If <Temp_Output_file.tab> is not specified, the output is written to the default file "AlignTmpx.tab".

Mapping the alignments back onto the original genome:

./tamebs_merge (<Temp_Output_file.tab>) <FINAL_OUTPUT.tab> <genome.fa>

# FINAL_OUTPUT.tab is the output file that records the alignment information.

Parameters:
-k (int) the maximum number of errors per read allowed in alignment.
-B (int) output alignments whose mapping score >=B (default: 0).
-m (int) score for match (default: 6).
-mm (int) score for mismatch (default: -18).
-ub output only the uniquely best alignments.
-ab output any best alignments.
-al output all candidate alignments.
-h help.

Alignment output file format

The final alignments are written to a tabular file. Each row has the following format:
<read-id> <#err> <#chr> <Lchr> <strand> <pos_in_G> <read_seq> <aln> <score>

Example:
100 2 4 1000000 0 2014 AAACTGGGGCGGGGGA mmmVummmsUmmsmmn 38

Explanation:
The 100th read AAACTGGGGCGGGGGA is aligned to the reference genome with 2 mismatches.
The alignment is located at position 2014 on the +FW strand of chromosome 4,
whose total length is 1000000 bp. The alignment achieves a mapping score of 38.
strand: 0 (+FW), 1 (+RC), 2 (-FW), 3 (-RC).
pos_in_G: 0-based location.
aln: m(match), s(mismatch), n(low base-calling quality), U(methylated C in CpG), 
V(methylated C in CHG), W(methylated C in CHH), u(unmethylated C in CpG), 
v(unmethylated C in CHG), w(unmethylated C in CHH).
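
For downstream scripting, a row in this format can be read with a few lines of Python. The following is a minimal sketch, not part of TAMeBS: the names Alignment, parse_row and cytosine_calls are hypothetical, and it assumes the nine fields are separated by whitespace and that the numeric fields are plain integers.

```python
from collections import Counter
from typing import NamedTuple

# Strand codes as listed in the format description above.
STRANDS = {0: "+FW", 1: "+RC", 2: "-FW", 3: "-RC"}

# aln code -> (cytosine context, methylated?)
CYTOSINE_CODES = {
    "U": ("CpG", True), "V": ("CHG", True), "W": ("CHH", True),
    "u": ("CpG", False), "v": ("CHG", False), "w": ("CHH", False),
}


class Alignment(NamedTuple):
    read_id: str
    errors: int
    chrom: str
    chrom_len: int
    strand: str
    pos: int        # 0-based position in the genome
    seq: str
    aln: str
    score: int


def parse_row(line: str) -> Alignment:
    """Split one output row into its nine fields (assumes whitespace separation)."""
    f = line.split()
    if len(f) != 9:
        raise ValueError(f"expected 9 fields, got {len(f)}")
    return Alignment(f[0], int(f[1]), f[2], int(f[3]),
                     STRANDS[int(f[4])], int(f[5]), f[6], f[7], int(f[8]))


def cytosine_calls(aln: str) -> Counter:
    """Count cytosine calls in an aln string, keyed by (context, methylated)."""
    return Counter(CYTOSINE_CODES[c] for c in aln if c in CYTOSINE_CODES)


row = parse_row("100 2 4 1000000 0 2014 AAACTGGGGCGGGGGA mmmVummmsUmmsmmn 38")
calls = cytosine_calls(row.aln)
print(row.strand, row.pos, row.aln.count("s"))
```

For the example row, the aln string contains one methylated CHG call (V), one unmethylated CpG call (u) and one methylated CpG call (U).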


Methylation Estimation

./tamebs_mC <FINAL_OUTPUT.tab>

Output files:
1.	FINAL_OUTPUT_mC_record.tab
2.	Summary_FINAL_OUTPUT.tab

The first file records the methylation level at every cytosine covered by our program. 
The order of cytosines is based on their positions in reference genome. The file has 
the following format:
<pos_in_G> <strand> <chromosome> <context> <methylation level>

Here the strand has only two cases: '+'(forward) and '-'(reverse complement).

The second file summarizes the methylation information of the sample genome.
It contains the total number of Cs analyzed by the program, the total number of methylated Cs,
the overall methylation percentage in each genomic context, etc.
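
The per-cytosine record file can also be summarized independently. The sketch below is an illustration only and is not part of TAMeBS: it assumes the five fields are whitespace-separated, that the file has no header line, and that the methylation level is written as a decimal number. The function name mean_level_by_context, the sample rows and the context labels in them are invented for the example.

```python
from collections import defaultdict


def mean_level_by_context(lines):
    """Average the methylation level per context over rows of the form
    <pos_in_G> <strand> <chromosome> <context> <methylation level>."""
    totals = defaultdict(lambda: [0.0, 0])   # context -> [sum of levels, count]
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue                         # skip blank or unexpected lines
        _pos, _strand, _chrom, context, level = fields
        totals[context][0] += float(level)
        totals[context][1] += 1
    return {ctx: total / n for ctx, (total, n) in totals.items()}


# Hypothetical rows, for illustration only.
sample = [
    "2017 + 4 CHG 1.0",
    "2018 + 4 CpG 0.0",
    "2023 + 4 CpG 0.5",
]
means = mean_level_by_context(sample)
print(means)
```

To apply it to a real file, pass an open file object, e.g. mean_level_by_context(open("FINAL_OUTPUT_mC_record.tab")).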


Execution Example

Reference genome: A_th.fa
Reads file: bsreads.fastq

Run TAMeBS:

1.	./tamebs_approx -k 5 bsreads.fastq lin_A_th.fa
2.	./tamebs_merge TAMeBS_out.tab A_th.fa
3.	./tamebs_mC TAMeBS_out.tab
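
The whole run, including the index-building step, can be scripted. The following Python sketch is not part of TAMeBS; pipeline_commands and run_pipeline are hypothetical helper names. It assumes the executables are in the current directory and that ./build writes the reformatted genome next to the input file with the prefix "lin_", as the file names in the example above suggest.

```python
import os
import subprocess


def pipeline_commands(genome, reads, final_out, max_errors=None, tmp_out=None):
    """Return the four TAMeBS command lines as argument lists."""
    genome_dir, genome_name = os.path.split(genome)
    # Assumption: the reformatted genome is named lin_<genome file name>.
    lin_genome = os.path.join(genome_dir, "lin_" + genome_name)

    approx = ["./tamebs_approx"]
    if max_errors is not None:
        approx += ["-k", str(max_errors)]
    if tmp_out is not None:
        approx += ["-o", tmp_out]
    approx += [reads, lin_genome]

    merge = ["./tamebs_merge"]
    if tmp_out is not None:
        merge.append(tmp_out)     # otherwise the default temporary file is used
    merge += [final_out, genome]

    return [["./build", genome], approx, merge, ["./tamebs_mC", final_out]]


def run_pipeline(commands):
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = pipeline_commands("A_th.fa", "bsreads.fastq", "TAMeBS_out.tab", max_errors=5)
for cmd in commands:
    print(" ".join(cmd))
```

Calling run_pipeline(commands) executes the steps; the loop above only prints them so they can be checked first.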

Source: README_TAMeBS.txt, updated 2014-09-07