GARM Meta assembler - Browse Files at SourceForge.net

The interactive file manager requires Javascript. Please enable it or use sftp or scp.
You may still browse the files here.

Name	Modified	Size
EXAMPLE	2011-06-14
LICENSE.txt	2016-12-20	1.9 kB
GARM_v0.7.5.tar.gz	2015-05-05	31.2 MB
README.txt	2015-05-05	5.5 kB
GARM_v0.7.4.tar.gz	2015-01-09	31.0 MB
GARM_v0.7.3.tar.gz	2014-08-19	19.6 MB
GARM_v0.7.tar.gz	2011-06-14	70.7 kB
Totals: 7 Items		81.9 MB

GARM README

INTRODUCTION
GARM (Genome Assembly, Reconciliation and Merging) is a pipeline of several scripts to combine results from different genome assemblies. The results (contigs and/or scaffolds) can come from different sequencing technologies (capillary, 454, Illumina, SOLiD, etc) libraries or just different programs using the same sequencing technology. 

REQUIREMENTS:

Since version 0.7.3, the necessary third-party software* is already included. Please, configure your enviromental variables (described below) pointing to the locations included in the package. If you try a different version and works, please let us know!. 

-Perl v5.8 or above
-Perl Modules: 
	Parallel::ForkManager
	List::MoreUtils
The best way to install a perl module would be CPAN
-AMOS (tested on v3.0.0) software (http://sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS)
-MUMmer (tested on v3.22) software (nucmer) (http://mummer.sourceforge.net/)
-R language (for GC plots only)

Please, make sure that everything is installed and running properly.

Depending on which shell you have, please add these environmental variables:

If you have tcsh then in your .cshrc add:

setenv GARMBIN /path/where/you/installed/GARM_version/bin
setenv GARMLIB /path/where/you/installed/GARM_version/lib
setenv MUMBIN  /path/where/you/installed/GARM_version/MUMmer3.22
setenv AMOSBIN /path/where/you/installed/GARM_version/amos-3.0.0/bin
setenv AMOSLIB /path/where/you/installed/GARM_version/amos-3.0.0/lib
setenv PATH {$PATH}:/path/where/you/installed/GARM_version	# add GARM.pl to PATH

In bash shell, could be .bashrc or .bash_profile add:


GARMBIN=/path/where/you/installed/GARM_version/bin
GARMLIB=/path/where/you/installed/GARM_version/lib
MUMBIN=/path/where/you/installed/GARM_version/MUMmer3.22
AMOSBIN=/path/where/you/installed/GARM_version/amos-3.0.0/bin
AMOSLIB=/path/where/you/installed/GARM_version/amos-3.0.0/lib
PATH=$PATH:/path/where/you/installed/GARM_version	# add GARM.pl to PATH

export $GARMBIN:$GARMLIB:$MUMBIN:$AMOSBIN:$AMOSLIB

Once you defined the environmental variables you can run the script "/path/where/you/installed/GARM_version/config.pl" to chech if everything it's okay.

INPUT:
- 2 fasta files with contigs or scaffolds from your assemblies.
- A text file (info_genomes_file) with the full path to those genome assembly results. The format is a 2 column file:
	First column: Full path file location of the assembly result
	Second column: A tag for the sequencing technology (Sanger, 454, Illumina, SOLiD, etc).
- For scaffolds, the program will take the number of N's between contigs as the estimated gap size (10 Ns is the minimum). The program will automatically select which result is better and will select it as a reference. In case you want to specifically use one scaffolds, write in the  info_genomes_file a third column with a "*" symbol next to the file you want to use.  

GARM USAGE:

Usage: GARM.pl -g info_genomes_file -o <prefix>
Options:
   -g: Text file with the path to two fasta genome assemblies to merge
   -o: Prefix for output directory and files
   -h: Complete help menu with advanced options

ADVANCE OPTIONS (Do not change unless you know what you are doing):
   -c <n>: Run with multi-threads with n cpus (default n=2)

EXAMPLE (HOW TO RUN)

- The info_genomes_file could have any name you like (eg. genomes.txt or config.file, etc). It should look like this:

/home/assembly_res/ABYSS/k31/kmer31-contigs.fasta ABYSS
/home/assembly_res/Newbler/454Scaffolds.fna 454

- The second column could be any tag that you will like to use to distinguish your assemblies (eg. ILLUM, SOLiD, 454, SOAP, Velvet, etc.). Just be sure that they are not the same. This tag would be used in case that you have contig names that are the same in the 2 results.

- Also check that your fasta files are in the location you wrote and that they are in a fasta format with no empty lines between sequences. 

- After setting up properly all the requirements and writing your info_genomes_file you can run GARM like this:

/path/where/GARM/is/installed/GARM.pl -g genomes.txt -o my_genome

OUTPUT:

- Inside the directory that you specified as output, you will find:

final_merged_contigs.fasta - This are the things in common to both results. Fasta file with your merged contigs.
final_merged.read.placed.filtered - Position of each original contig in the merged contigs. The file is in the WashU read placed format.
final_bin.fasta - Contigs with no overlaps
	            - Contigs excluded by the minimum length filter (200 bases default)
	            - Contigs with overlaps but not merged for some reason... (AMOS problem)
final.stats - genome stats for the results


  Column   Description
  ------   ----------------------------------------------------------
    1      NCBI ti number for read (or *, if none known)
    2      read name
    3      left trimmed position on the original read
    4      number of bases in trimmed read
    5      orientation on contig (0 = forward, 1 = reverse)
    6      contig name
    7      supercontig name (or *, if none known)
    8      approximate start position of the trimmed read in the contig
    9      approximate start position of the trimmed read on supercontig (or *, if none known)

- If there were scaffolds:

final_merged_scaffolds.fasta - Fasta file with the reconstructed scaffolds
final_merged_scaffolds.agp - AGP file for the scaffolding information

Any questions:

Alejandro Sanchez alexsf[at]ibt.unam.mx
Karel Estrada karel[at]ibt.unam.mx

Source: README.txt, updated 2015-05-05

GARM Meta assembler Files

Get an email when there's a new version of GARM Meta assembler