GAAS Code
Status: Beta
Brought to you by: floflooo
File | Date | Commit |
---|---|---|
lib | 2013-03-05 | [43f967] Version 0.17 |
script | 2013-03-05 | [e0fcae] Renamed script from 'GAAS' to 'gaas' |
t | 2011-07-19 | [e24a7c] Initial commit (GAAS version 0.16) |
utils | 2011-07-19 | [e24a7c] Initial commit (GAAS version 0.16) |
.gitignore | 2013-03-05 | [9b12a1] Added a .gitignore file |
Changes | 2013-03-05 | [43f967] Version 0.17 |
License | 2011-07-19 | [e24a7c] Initial commit (GAAS version 0.16) |
Makefile.PL | 2013-03-05 | [43f967] Version 0.17 |
README | 2013-03-05 | [144fca] README update |
Tutorial.txt | 2011-07-19 | [e24a7c] Initial commit (GAAS version 0.16) |
gaas-flowchart-small.gif | 2011-07-19 | [e24a7c] Initial commit (GAAS version 0.16) |
GAAS

GAAS (Genome Abundance and Average Size) performs BLAST similarity searches of
metagenomic sequences against a database of complete genomes to estimate their
relative abundance and average size. It can be used for any sort of complete
sequences: genomes (viral, microbial, eukaryal), plasmids, genes, ... Results
can be visually represented as phylogenetic trees, size spectra and abundance
pie charts.

This program provides a command-line interface only.

CITATION

If you use GAAS in your research, please cite:
   Angly FE et al., The GAAS metagenomic tool and its estimations of viral and
   microbial average genome size in four major biomes, PLoS Computational
   Biology 5, no. 12 (12, 2009): e1000593

INSTALLATION

1/ External dependencies:

   You need to install these dependencies first:
      * Perl (http://www.perl.com/download.csp)
      * NCBI BLAST v2 (http://www.ncbi.nlm.nih.gov/BLAST/download.shtml)

   At this point, you should be able to run BLAST by typing blastall in a
   terminal or command prompt. If this doesn't display the BLAST usage message,
   add the directory where you installed BLAST to your PATH environment
   variable.

   In Windows: Start menu > Control Panel > Performance and Maintenance >
   System > Advanced > Environment variables:
   Edit the PATH variable in the bottom window by adding the BLAST directory
   path, e.g.
      PATH = C:\Program Files\BLAST\bin;%PATH%

2/ Install GAAS:

   Case A/ If you downloaded the "standalone" GAAS version
      Skip to the next step.
   Case B/ If you downloaded the regular GAAS version (CPAN module style)

      The following Perl modules are dependencies that are either provided in
      this package or will be installed automatically for you:
         * Getopt::Declare >= 1.13
         * Math::Round
         * Math::Random::MT >= 1.16
         * SVG::Parser
         * SVG::Graph
         * SVG::TT::Graph >= 0.16
         * CSS::Tiny
         * Bio::Phylo >= 0.18
         * Statistics::Descriptive::Weighted >= 0.5
         * MLDBM
         * Win32::Symlink (if you use Windows)

      Note that installing some of these modules will likely require a C
      compiler, which may not be installed on your system if you use Windows.
      It should be set up automatically for you, but if you encounter
      installation problems on Windows, try to get a compiler from here:
         http://www.bloodshed.net/dev/devcpp.html

      To install GAAS, run the following commands in a terminal or command
      prompt:
         On Linux, Unix, MacOS:
            perl Makefile.PL && make install
         On Windows:
            perl Makefile.PL && nmake install

      On Unix/Linux, if you do not have administrator rights and want to
      install the module locally into, say, ~/my/dir, try something along
      these lines:

      i/ Make sure that your CPAN configuration file ~/.cpan/CPAN/MyConfig.pm
         contains these entries:
            'makepl_arg' => q[INSTALL_BASE=~/my/dir],
            'mbuildpl_arg' => q[--install_base ~/my/dir],
         This ensures that the modules you install through CPAN are installed
         locally.
      ii/ Make sure that your PERL5LIB and PATH environment variables point to
         the local installation directory:
            echo 'export PERL5LIB=${PERL5LIB}:~/my/dir/lib/perl5' >> ~/.bashrc
            echo 'export PATH=${PATH}:~/my/dir/bin' >> ~/.bashrc
            source ~/.bashrc

      iii/ Install GAAS:
            perl Makefile.PL INSTALL_BASE=~/my/dir
            make install

3/ Download data files:

   Data files for the analysis of viral, bacterial, archaeal and eukaryal
   communities, and tree files for the Viral Proteomic Tree and Tree of Life,
   can be downloaded from http://biome.sdsu.edu/gaas/data/

DOCUMENTATION

After installing GAAS, you can find out the program syntax by running the
following command in a terminal or command prompt:

   Case A/ If you downloaded the "standalone" GAAS version:
      Navigate to where the GAAS file is located and type:
         perl GAAS --help

   Case B/ If you downloaded the regular GAAS version:
      Simply type:
         GAAS --help

The 'utils' folder included in the GAAS package contains a utility:
   * merge_tabular_BLAST_results: This tool takes tabular BLAST files
     generated by BLAST searches against several databases and merges the
     results to produce a file that looks as if the results had been generated
     by comparison against a single database.

MEMORY USAGE

For large datasets, the amount of memory used by GAAS can grow quite large
because GAAS needs to keep information such as the names of the sequences and
their lengths in memory. If you run out of memory when running GAAS, there are
some solutions:

1/ Use the save_mem (-sm) option:
   This saves to your hard drive some of the information that would otherwise
   reside in memory. The amount of memory used by GAAS should then be very
   low, but the GAAS computation will be somewhat slower.

2/ Use less memory-intensive options:
   Using a taxonomy file, normalizing by genome length and filtering
   similarities by relative sequence length all increase memory consumption.
   Try skipping them or using alternative options that don't use sequence
   length.
3/ Use smaller database files:
   You might save memory by using database files that contain a smaller
   number of sequences. For example, if you were using a large database such
   as NCBI nt, you may try a database that contains fewer but better curated
   sequences, such as NCBI RefSeq.

USING ALTERNATIVE OR PARALLEL BLAST PROGRAMS

To use alternative BLAST programs that are not called 'blastall' (and
'formatdb') or that need to be passed extra arguments (for example, the number
of cluster nodes to use), modify the file GAAS.pm. In this file, locate the
following lines and change them according to your needs, e.g.:
   'formatdb_prog'  => 'formatdb',     # path to formatdb program
   'formatdb_extra' => undef,          # extra arguments for formatdb
   'blastall_prog'  => 'btbatchblast', # path to blastall program
   'blastall_extra' => '--chunk 100',  # extra arguments for blastall

COPYRIGHT AND LICENCE

Copyright 2009,2010,2011 Florent ANGLY <florent.angly@gmail.com>

GAAS is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.
GAAS is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
GAAS. If not, see <http://www.gnu.org/licenses/>.

BUGS

All complex software has bugs lurking in it, and this program is no exception.
If you find a bug, please email me at <florent.angly@gmail.com> so that I can
make GAAS better.

The GAAS source code is under Git revision control. Feel free to hack the
code. To get started, do a:
   git clone git://gaas.git.sourceforge.net/gitroot/gaas/gaas
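
Before building from a fresh clone, it can help to confirm that the external programs named under INSTALLATION are reachable from your PATH. The following is a minimal sketch, not part of GAAS: the function name check_commands is hypothetical, and the list of programs (perl, blastall, formatdb, plus git for the clone itself) is an assumption based on the programs this README mentions.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with GAAS): report which of the given
# commands can be found on PATH. Prints one line per command and returns
# a non-zero status if at least one command is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed list of programs, taken from the names used in this README.
check_commands perl blastall formatdb git \
    || echo "Add the directories of the missing programs to PATH."
```

If you replaced blastall or formatdb with alternative programs in GAAS.pm, pass those program names to the function instead.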