detectMITE is a MATLAB-based tool for detecting miniature inverted repeat transposable elements (MITEs) in genomes. It is open-source software distributed under the Creative Commons BY-NC-SA 3.0 license (http://creativecommons.org/licenses/by-nc-sa/3.0/).
If you have any questions or comments, please contact Dr. Chun Liang (liangc@miamioh.edu).
[1]. Ye C, Ji G, Liang C (2016) detectMITE: A novel approach to detect miniature inverted repeat transposable elements in genomes. Sci. Rep. 6, 19688.
[2]. Ye C, Ji G, Li L, Liang C (2014) detectIR: A Novel Program for Detecting Perfect and Imperfect Inverted Repeats Using Complex Numbers and Vector Calculation. PLoS ONE 9(11): e113349.
[1]. Download and unzip the source code ('detectMITE.20170425.tar.gz')
After unpacking the archive, you will see a folder named detectMITE that contains all source code and related documents.
[2]. Download and install software required by detectMITE
CD-HIT: http://weizhong-lab.ucsd.edu/cd-hit/
Download the cd-hit source code and put it in the './detectMITE' folder. Then install cd-hit as follows:
$ tar xzvf cd-hit-v4.6.1-2012-08-27.tgz
$ mv cd-hit-v4.6.1-2012-08-27 cd-hit
$ cd ./cd-hit
$ make
$ cd ..
$ mkdir result
[1]. To run detectMITE, you should make sure that MATLAB is installed.
[2]. After opening MATLAB, change MATLAB's working path to the folder where detectMITE is installed (e.g., './detectMITE').
[3]. Type the following command in the Command Window of MATLAB:
tic;do_MITE_detection(data_file,'-genome','rice');runtime = toc;
We also provide an example script, 'Test_Demo.m', in the source code. All you need to do is modify a few parameter values in the script and then run it.
[4]. Alternatively, you can run the program in the background by typing the following command at the Linux prompt (you should also change the working path to the folder where detectMITE is installed):
$ nohup matlab < Test_Demo.m > output.txt &
data_file
The location of the genome file, for example, data_file = './data/rice_genome.fasta'.
The genome file should be in fasta format.
-tir_length
Define the minimum length of terminal inverted repeat (TIR), default value = 10.
-tsd_minimum_length
Define the minimum length of target site duplication (TSD), default value = 2.
-tsd_maximum_length
Define the maximum length of TSD, default value = 10.
-mite_minimum_length
Define the minimum length of MITE, default value = 50.
-mite_maximum_length
Define the maximum length of MITE, default value = 800.
-genome
Define a name for the genome, default name = 'genome'.
-cpu
Number of CPUs to be used, default number = 1.
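Since the genome file must be in FASTA format, a quick check before starting a long run can save time. The following is a minimal Python sketch, not part of detectMITE; the function name summarize_fasta is hypothetical, and it only checks that sequence lines are preceded by a '>' header line.

```python
def summarize_fasta(lines):
    """Return (number_of_records, total_bases) for FASTA-like lines.

    Hypothetical helper, not part of detectMITE. Raises ValueError if
    sequence data appears before the first '>' header or if no header
    is found at all.
    """
    records = 0
    bases = 0
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            records += 1
        elif records == 0:
            raise ValueError(
                f"line {line_no}: sequence data before the first '>' header"
            )
        else:
            bases += len(line)
    if records == 0:
        raise ValueError("no '>' header lines found")
    return records, bases


# Small in-memory example; for a real genome, pass an open file handle
# instead, e.g. open(data_file).
example = [">chr01", "ACGTACGT", "ACGT", ">chr02", "GGCC"]
records, bases = summarize_fasta(example)
print(records, bases)
```

For a real run, the same function can be given the file referenced by data_file, for example with open('./data/rice_genome.fasta') as handle: summarize_fasta(handle).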
[1]. MITE representative sequences of each family are placed in the file 'genome_name.mite.fasta' in the folder './detectMITE'.
The description line of each representative sequence (e.g., >chr06|2367817|2367964|10|2|109) has the following explanation:
>ChromosomeName|GenomicStartPosition|GenomicStopPosition|TIR_Length|TSD_Length|CopyNumber.
Please note that the TIR length is computed with at most one mismatch tolerated. With looser limitations, the actual TIR may be longer than the reported length.
[2]. MITE families are placed in the file 'genome_name.miteSet.fasta' in the folder './detectMITE'. Families are separated by a dotted line.
The description line of each MITE sequence (e.g., >chr06|2367817|2367964|10|2) has the following explanation:
>ChromosomeName|GenomicStartPosition|GenomicStopPosition|TIR_Length|TSD_Length.
[3]. Files in the folder './detectMITE/result/' are temporary files generated while the program runs; they can be deleted.
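The description lines in [1] and [2] above are '|'-separated, so they are easy to split for downstream analysis. The following is a minimal Python sketch, not part of detectMITE; the names MiteHeader and parse_mite_header are hypothetical, and it assumes that chromosome names do not themselves contain '|'.

```python
from typing import NamedTuple, Optional


class MiteHeader(NamedTuple):
    chromosome: str
    start: int
    stop: int
    tir_length: int
    tsd_length: int
    copy_number: Optional[int]  # present only for representative sequences


def parse_mite_header(line):
    """Parse a description line such as '>chr06|2367817|2367964|10|2|109'.

    Hypothetical helper. Accepts the 6-field form (representative
    sequences) and the 5-field form (family members, no CopyNumber).
    Assumes the chromosome name contains no '|' character.
    """
    fields = line.strip().lstrip(">").split("|")
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}: {line!r}")
    chromosome = fields[0]
    numbers = [int(value) for value in fields[1:]]
    copy_number = numbers[4] if len(numbers) == 5 else None
    return MiteHeader(chromosome, numbers[0], numbers[1],
                      numbers[2], numbers[3], copy_number)


representative = parse_mite_header(">chr06|2367817|2367964|10|2|109")
member = parse_mite_header(">chr06|2367817|2367964|10|2")
print(representative.chromosome, representative.copy_number)
print(member.tir_length, member.tsd_length, member.copy_number)
```

With the headers parsed, representative sequences can be filtered, for example by keeping only those whose copy_number exceeds a chosen threshold.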
To see the details of update history, go to https://sourceforge.net/p/detectmite/wiki/UpdateHistory/