METATEA Manual

1. Goal of METATEA development
○ METATEA was developed to support 16S rRNA-based analyses at the sequence level, using both sequence-supervised and unsupervised modes, for rapid and accurate comparison across samples and groups.

○ Sequence-supervised approaches were developed using bacterial sequences from healthy persons and IBD patients. However, to use the sequence-supervised approaches, you should use the same primers and sequencing platform.

○ The exclusive correlation value (ECV) was defined to detect mutually exclusive relationships between bacteria that are antagonistic to each other, such as probiotics and pathogens. You can draw an ECV network using the ECV matrix from METATEA.

○ I have been developing METATEA. If many people use METATEA and request new functions, more functions will be added in the near future.


2. Essential requirements
- Operating system
  -- Linux (METATEA was designed to be used on 32- and 64-bit Debian, Ubuntu, and RedHat, but it was tested only on 64-bit RedHat.)
  -- Windows

3. How to use METATEA (refer to Basic pipeline)
- Download and decompress METATEA zip file
 -- METATEA is available from https://sourceforge.net/projects/metatea 
 -- Unzip METATEA (unzip metatea.zip)
- Type commands
 -- metatea(version) mode file(s) 

4. Primer and sequencing platform (Sequence-supervised mode only)
- METATEA sequence-supervised mode analyzes only the 16S V4 region (for other regions, you should use METATEA unsupervised mode).
- The primer sequences are widely used in the Human Microbiome Project and the Earth Microbiome Project.
 -- FWD:GTGCCAGCMGCCGCGGTAA; REV:GGACTACHVGGGTWTCTAAT
- Sequencing platform and protocols: Illumina sequencing platforms (we tested samples using protocols for Illumina MiSeq V4, Broad Institute; refer to PRJEB13679).
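
The primers above contain IUPAC degenerate bases (M, H, V, W). The following Python sketch is not part of METATEA. It only shows how those codes expand into a regular expression, which may help when checking by hand whether a read starts with the forward primer. The helper name primer_to_regex and the example read are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer):
    """Expand a degenerate primer into a regex pattern string (hypothetical helper)."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        # Unambiguous bases stay as they are; degenerate ones become a character class.
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


FWD = "GTGCCAGCMGCCGCGGTAA"
REV = "GGACTACHVGGGTWTCTAAT"

fwd_pattern = primer_to_regex(FWD)
rev_pattern = primer_to_regex(REV)
print(fwd_pattern)
print(rev_pattern)

# Made-up read used only to show the matching step.
read = "GTGCCAGCAGCCGCGGTAATACGGAGG"
starts_with_fwd = re.match(fwd_pattern, read) is not None
print(starts_with_fwd)
```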

5. METATEA mode
- qcis: Quality control for Illumina single reads
- fq2fa: Transformation of fastq into fasta file format
- ncor: Correction of ambiguous bases (Ns) in pyrosequencing reads
- nrem: Removal of reads with Ns
- removes: Removal of short reads
- decontam/deconfile: Removal of human DNA
- triml: Trimming of long reads
- ti3200: Proportions of 16S rRNA gene sequences using sequence-supervised mode
- unique: Extraction of unique sequences (removal of duplicated sequences)
- uniqpro: Proportions of your sequences
- normal: Adjustment of proportions of sequences
- shannon: Calculation of Shannon indices
- freuniq: Extraction of frequent sequences
- id2seq: Extraction of sequences from ids
- seqno: The number of existing individual sequences
 -- Do not use the number of individual sequences directly as a measure of microbial diversity, since sequencing errors can inflate it.
- fre50: The 50 most frequent sequences
- utest: Multiple z-values in Mann-Whitney U test
- exclusive: Exclusive correlation coefficient matrix for the R qgraph package
- mat2col: Transformation of an exclusive correlation coefficient matrix into a column

6. Basic pipeline
-  Generally recommended pipeline: Unsupervised analysis using multiple samples

[1] Equipment setup
- METATEA is available in the SourceForge repository (https://sourceforge.net/projects/metatea). 
- Decompress the METATEA zip file. If required, one can add the METATEA folder to one's PATH environment variable.

- In the Windows environment,
 -- right-click the computer icon and choose Properties.
 -- Choose System properties, choose Advanced, and click Environment Variables.
 -- Click Path, click Edit, type the path of your METATEA folder similar to the following, and press OK: C:\Users\UserID\Downloads\metatea\window

- In the Linux environment, 
 -- one can add the METATEA folder to one's PATH environment variable by typing a command similar to the following: $ export PATH=$PATH:$HOME/YourID/metatea/linux
 -- If one is not familiar with setting environment variables, one can work directly in the METATEA folder.

[2] Quality control
1) After downloading and setting up the environment, move one's fastq files into the window\ folder inside the metatea\ folder. Use the command below to change the working folder to the METATEA folder.
  > cd C:\Users\UserID\Downloads\metatea\window\ 
2) Perform quality control using the command below.
  > .\metatea.exe qcfq yourfile.fq 
3) Convert the fastq file to a fasta file using the command below.
  > .\metatea.exe fq2fa yourfile.fqqc.fq
4) Correct Ns in pyrosequencing reads using the command below. If the sequencing was performed on another platform, skip this step.
  > .\metatea.exe ncor yourfile.fqqc.fq.fa
5) Remove reads with Ns using the command below. Please be reminded that some sequencing platforms do not produce Ns.
  > .\metatea.exe nrem yourfile.fqqc.fq.fa
6) Remove reads that are too short. This step is optional but strongly recommended. Change the threshold (bp) by adding len_ as shown in the following command.
  > .\metatea.exe removes yourfile.fqqc.fq.fanr.fa len_175
7) Trim reads that are too long. This step is optional but strongly recommended. Change the threshold (bp) by adding len_ as shown in the following command.
  > .\metatea.exe triml yourfile.fqqc.fq.fanr.fars.fa len_175
8) If the host DNA sequences are known, perform DNA decontamination using the command below.
  > .\metatea.exe deconfile yourfile.fqqc.fq.fanr.fars.fatl.fa hostDNA.fa
9) One may remove intermediate files that have been produced during the quality control steps and rename the last file. Select the file and click the Rename button.
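
As a rough illustration of what steps 3) to 7) do to the reads, the Python sketch below converts FASTQ records to FASTA, drops reads with Ns, drops short reads, and trims long reads. It is not METATEA's implementation. The behaviour at the thresholds (reads shorter than min_len are dropped, longer reads are cut down to max_len) is an assumption, the function names are hypothetical, and the example uses tiny made-up reads and small thresholds for readability.

```python
def parse_fastq(text):
    """Yield (read_id, sequence) pairs from 4-line FASTQ records."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    for i in range(0, len(lines) - 3, 4):
        header, seq = lines[i], lines[i + 1]
        if not header.startswith("@"):
            raise ValueError("malformed FASTQ header: " + header)
        yield header[1:], seq.upper()


def filter_reads(records, min_len, max_len):
    """Assumed behaviour: skip reads with N, skip short reads, trim long reads."""
    for read_id, seq in records:
        if "N" in seq:          # corresponds to the nrem step
            continue
        if len(seq) < min_len:  # corresponds to the removes step
            continue
        yield read_id, seq[:max_len]  # corresponds to the triml step


def to_fasta(records):
    """Format (read_id, sequence) pairs as FASTA text."""
    return "".join(">%s\n%s\n" % (rid, seq) for rid, seq in records)


# Made-up example data: r2 contains an N, r3 is too short, r4 is too long.
example_fastq = """@r1
ACGTACGTAC
+
IIIIIIIIII
@r2
ACGTNCGTAC
+
IIIIIIIIII
@r3
ACGT
+
IIII
@r4
ACGTACGTACGTAC
+
IIIIIIIIIIIIII
"""

fasta = to_fasta(filter_reads(parse_fastq(example_fastq), min_len=8, max_len=10))
print(fasta)
```

In this sketch, r1 is kept unchanged, r2 and r3 are dropped, and r4 is kept after trimming to 10 bases.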

[3] Calculation of sequential proportions
1) After renaming the file, extract unique sequences using the command below.
  > .\metatea.exe unique renamed1.fa
2) If one wants to know the number of unique sequences in a single file, type the following command after extracting unique sequences from a single file.
  > .\metatea.exe seqno renamed1.faunique.fa 
3) Calculate proportions of unique sequences using the command below.
  > .\metatea.exe uniqpro renamed1.faunique.fa renamed1.fa
4) Filter out rare sequences. Change the threshold (proportion) by adding threshold_ as shown in the following command.
  > .\metatea.exe freuniq renamed1.fapro.txt threshold_0.001
   $$$CRITICAL STEP$$$ We recommend determining the threshold according to the number of reads in the file, since putative erroneous reads generally occurred only once in a single file in this study. Generally, the threshold can be determined as Threshold = 1/(the smallest number of reads among your files - 1). With this threshold, a sequence that occurs only once in a file of n reads has a proportion of 1/n, which is below 1/(n - 1). However, one may have to use a greater proportion depending on further analyses, since very large data sets often lead to errors in statistical tools.
5) If multiple groups/samples are to be analyzed, extract unique sequences of high proportions using the command below.
  > .\metatea.exe id2seq renamed1.fapro.txt0.001.txt renamed1.faunique.fa
6) If multiple groups/samples are to be analyzed, merge the unique sequence fasta files using the command below (in Windows PowerShell, cat and sc are aliases of Get-Content and Set-Content).
  > cat renamed1.fapro.txt0.001.txt.fa, renamed2.fapro.txt0.001.txt.fa | sc merged.fa
7) If multiple groups/samples are to be analyzed, extract unique sequences from the merged unique sequence file using the command below. 
  > .\metatea.exe unique merged.fa 
8) Calculate proportions of unique sequences using the command below.
  > .\metatea.exe uniqpro merged.faunique.fa renamed1.fa 
9) Merge the proportion data into one text file by adding the data of each file as a new line.
10) Adjust the sum of proportions to 1 since we assume that sequencing errors caused most undetected reads.
  > .\metatea.exe normal mergedproportion.txt 
11) One may remove intermediate files that have been produced during the steps above and rename the last file. Select the file and click the Rename button.
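
The arithmetic behind the steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not METATEA's code: the function names are hypothetical, the reads are made up, keeping sequences whose proportion is at least the threshold (>=) is an assumption, and the sketch works on one sample in memory instead of on the files used in the pipeline.

```python
from collections import Counter


def proportions(reads):
    """Proportion of each unique sequence among all reads of one sample."""
    counts = Counter(reads)
    total = len(reads)
    return {seq: n / total for seq, n in counts.items()}


def suggested_threshold(read_counts):
    """Threshold = 1 / (smallest number of reads among the files - 1)."""
    return 1.0 / (min(read_counts) - 1)


def keep_frequent(props, threshold):
    """Keep sequences at or above the threshold (the >= is an assumption)."""
    return {seq: p for seq, p in props.items() if p >= threshold}


def normalize(props):
    """Rescale proportions so that they sum to 1."""
    total = sum(props.values())
    return {seq: p / total for seq, p in props.items()}


# Made-up samples: 10 reads and 12 reads, each with one singleton sequence.
sample1 = ["AAAA"] * 6 + ["CCCC"] * 3 + ["GGGG"]
sample2 = ["AAAA"] * 4 + ["CCCC"] * 7 + ["TTTT"]

threshold = suggested_threshold([len(sample1), len(sample2)])  # 1 / (10 - 1)
kept1 = keep_frequent(proportions(sample1), threshold)
norm1 = normalize(kept1)

print(threshold)
print(sorted(kept1))
print(norm1)
```

In this example the singleton GGGG has a proportion of 1/10, which is below the threshold of 1/9, so it is filtered out before the remaining proportions are rescaled to sum to 1.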

[4] Statistical analysis
1) After renaming the proportion file, calculate Shannon indices using the command below. 
  > .\metatea.exe shannon proportion.txt
2) Perform Mann-Whitney U tests. Type the number of sequences, the row range of the first group, and the row range of the second group as shown in the following command (100 sequences, from the first row to the 40th row, and from the 41st row to the 80th row).
  > .\metatea.exe utest proportion.txt 100 1_40 41_80
3) Produce an ECV matrix using the command below. One can perform network analysis from the ECV matrix using statistical tools like R.
  > .\metatea.exe exclusive proportion.txt
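
For readers who want to see the statistics behind steps 1) and 2), the Python sketch below computes a Shannon index and a Mann-Whitney U z-value for one sequence. It is a textbook version under stated assumptions, not METATEA's implementation: the natural logarithm is assumed for the Shannon index, the z-value uses the normal approximation without tie or continuity correction, and all function names and data are hypothetical.

```python
import math


def shannon_index(props):
    """Shannon index H = -sum(p * ln p); zero proportions are skipped."""
    return -sum(p * math.log(p) for p in props if p > 0)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_z(group1, group2):
    """z-value of U for group1 (normal approximation, no corrections)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    rank_sum1 = sum(ranks[:n1])
    u1 = rank_sum1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u1 - mean_u) / sd_u


# Two equally abundant sequences give H = ln 2.
h = shannon_index([0.5, 0.5])
print(h)

# Made-up proportions of one sequence in two groups of five samples each.
group_a = [0.10, 0.12, 0.15, 0.11, 0.14]
group_b = [0.30, 0.28, 0.35, 0.33, 0.31]
z = mann_whitney_z(group_a, group_b)
print(z)
```

The utest mode reports one z-value per sequence, so a sketch like this would be repeated for every sequence column of the proportion file.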

7. Difference between METATEA and other tools
METATEA does not support many modes yet, but more modes will be developed. Mothur is easy to install and use on different operating systems, but it does not support analyses based on identical sequence matching. QIIME2 supports various steps for analyzing microbial community members, but it cannot be installed on some servers because of limitations of environment managers. If you have problems installing QIIME2, METATEA can be an alternative.


8. Difference between ECV, PCC, and SCC
Although the Pearson correlation coefficient (PCC) can be a robust means of finding co-occurring bacteria in terms of positive linear correlation, mutually exclusive relationships in bacterial proportions appear to be inversely proportional rather than negatively linear. In addition, Spearman's correlation coefficient (SCC) could identify weak mutually exclusive relationships, but it could not distinguish strong mutually exclusive relationships from weak ones. Strong exclusive relationships were selectively identified only with ECV.
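
The first point can be illustrated with a small Python sketch on made-up data in which one proportion is exactly inversely proportional to the other (y = 0.005 / x). The sketch computes only PCC and SCC, because the ECV formula is not given in this manual. Function names are hypothetical and the rank helper assumes there are no tied values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """Ranks starting at 1 (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up proportions of two bacteria across six samples, with y = 0.005 / x.
x = [0.01, 0.02, 0.05, 0.1, 0.2, 0.5]
y = [0.005 / v for v in x]

pcc = pearson(x, y)
scc = spearman(x, y)
print(pcc)
print(scc)
```

Here PCC is negative but clearly weaker than -1, because the relationship is not linear, while SCC is -1, because it depends only on the rank order and so reaches -1 for any strictly decreasing relationship.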

9. Difference between LEfSe and Mann-Whitney U test
Actually, I am not familiar with LEfSe analysis in QIIME. However, some users may want to know the differences between LEfSe in QIIME and the U test in METATEA, so the following are only my guesses. Both tests may require many samples for confidence (at least 5 samples per group). Generally, LEfSe may be more integrative and superior.

10. Contact details
You are welcome to post questions, recommendations, and bug reports to the METATEA wiki (https://sourceforge.net/p/metatea/wiki/Home/), but I prefer e-mails. Contact Sunguk with questions, comments, requests, complaints, suggestions, and recruitment inquiries.
Sunguk Shin (bestfa@naver.com)

11. Terms of use
METATEA has been developed at JFK lab, Yonsei University, Seodaemoon-gu, Seoul, Republic of Korea. METATEA is presently free software. In no event will JFK lab be liable to you for any damage. METATEA can be used, redistributed and/or modified freely for non-commercial purposes, provided that the original source is properly cited. JFK lab is not responsible for any medical mistakes related to METATEA. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Use of this software is taken as an agreement to these terms of usage.
Source: README.txt, updated 2022-11-28