File             Date        Author              Commit
Documentation    2012-12-17  Sébastien Boisvert  [a5684a] Documentation: fixed the degree of the polytope
code             2013-01-29  Sébastien Boisvert  [fc2ecb] use constants for symbols
scripts          2012-11-08  Sébastien Boisvert  [2ba4d5] ship: removed 6 files in shipped products
AUTHORS          2012-09-02  Sébastien Boisvert  [2334be] Documentation: updated the author file
CMakeLists.txt   2012-09-03  Sébastien Boisvert  [4209dc] KmerAcademyBuilder: removed the k-mer academy
INSTALL.txt      2011-09-23  Sébastien Boisvert  [6aa7b5] Removed dependency for clock_gettime.
LICENSE.txt      2012-01-30  Sébastien Boisvert  [79f9cd] Moved license.
MANUAL_PAGE.txt  2013-01-20  Sébastien Boisvert  [622931] Mock: updated documentation for new export format
Makefile         2013-01-24  Sébastien Boisvert  [c02923] add debug symbols by default
README.md        2012-11-14  Sébastien Boisvert  [7c361f] Documentation: there is only one repository for...
RayPlatform      2012-01-30  Sébastien Boisvert  [dba631] The version of the RayPlatform is now written i...
gpl-3.0.txt      2012-01-30  Sébastien Boisvert  [79f9cd] Moved license.

Read Me

Ray assembler

Ray is a parallel de novo genome assembler that uses the message-passing interface (MPI) throughout
and is implemented using peer-to-peer communication.

Ray is implemented using RayPlatform, a message-passing-interface programming framework.

Ray is documented in

  • Documentation/ (many files)
  • MANUAL_PAGE.txt (command-line options, same as Ray -help)
  • README.md (general)
  • INSTALL.txt (quick installation)

Solutions (all bundled in a single Product called Ray)

  • de novo genome assembly
  • de novo meta-genome assembly (with Ray Méta)
  • de novo transcriptome assembly (works, but not extensively tested)
  • quantification of contig abundances
  • quantification of microbiome consortia members (with Ray Communities)
  • quantification of transcript expression
  • taxonomy profiling of samples (with Ray Communities)
  • gene ontology profiling of samples (with Ray Ontologies)

see Documentation/BiologicalAbundances.txt

Website

Code repositories

If you want to contribute, clone the repository, make changes
and I (Sébastien Boisvert) will pull from you after reviewing
the code changes.

Other related repositories

Mailing lists

Installation

You need a C++ compiler (supporting C++ 1998), make, and an implementation of MPI (supporting MPI 2.2).
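
Before compiling, it can help to confirm that the required tools are on the PATH. The sketch below is not part of Ray: the function name missing_tools is hypothetical, and the tool names mpicxx and mpiexec are assumptions taken from the commands used elsewhere in this file (an MPI implementation may name its wrappers differently).

```shell
# Print the name of every tool given as an argument that is NOT on the PATH.
# Hypothetical helper, not shipped with Ray.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to ask whether a command can be found
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Assumed tool names; adjust them to the local MPI installation.
missing=$(missing_tools make mpicxx mpiexec)
if [ -z "$missing" ]; then
    echo "all build tools found"
else
    echo "missing tools:" $missing
fi
```

If a tool is reported missing, install it or pass its full path to make, as shown under "Change the compiler".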

Compilation

tar xjf Ray-x.y.z.tar.bz2
cd Ray-x.y.z
make PREFIX=build
make install
ls build

Compilation using CMake

tar xjf Ray-x.y.z.tar.bz2
cd Ray-x.y.z
mkdir build
cd build
cmake ..
make

Change the compiler

make PREFIX=build2000 MPICXX=/software/openmpi-1.4.3/bin/mpicxx
make install

Tested C++ compilers: see Documentation/COMPILERS.txt

Faster execution

Some processors provide the popcnt instruction and other useful instructions.
With gcc, add -march=native to tune the build for the processor on which Ray
is compiled. The resulting binary may not run on processors that lack these
instructions.

make PREFIX=Build.native DEBUG=n ASSERT=n EXTRA=" -march=native"
make install
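
Whether popcnt is available depends on the processor of the build machine. As a rough check, assuming a Linux system that lists CPU flags on the "flags" lines of /proc/cpuinfo, the following sketch reports whether a flag is present. The helper has_cpu_flag is hypothetical and not part of Ray.

```shell
# Succeed if a CPU flag appears on a "flags" line of a cpuinfo-style file.
# Assumption: Linux-style /proc/cpuinfo; other systems expose this differently.
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # Require whitespace before the flag and whitespace or end of line after it,
    # so that a partial name does not match.
    grep -Eq "^flags.*[[:space:]]${flag}([[:space:]]|\$)" "$cpuinfo" 2>/dev/null
}

if has_cpu_flag popcnt; then
    echo "popcnt is reported by this processor"
else
    echo "popcnt is not reported (or /proc/cpuinfo is unavailable)"
fi
```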

The best way to build Ray is to use whole-program optimization.
With gcc, use this script:

./scripts/Build-Link-Time-Optimization.sh

Use large k-mers

make PREFIX=Ray-Large-k-mers MAXKMERLENGTH=64
# wait
make install
mpirun -np 512 Ray-Large-k-mers/Ray -k 63 -p lib1_1.fastq lib1_2.fastq \
-p lib2_1.fastq lib2_2.fastq -o DeadlyBug,Assembler=Ray,K=63
# wait
ls DeadlyBug,Assembler=Ray,K=63/Scaffolds.fasta

Compilation options

make PREFIX=build-3000 MAXKMERLENGTH=64 HAVE_LIBZ=y HAVE_LIBBZ2=y \
ASSERT=n FORCE_PACKING=y
# wait
make install
ls build-3000

see the Makefile for more options.

Run Ray

To run Ray on paired reads:

mpiexec -n 25 Ray -k 31 -p lib1.left.fasta lib1.right.fasta -p lib2.left.fasta lib2.right.fasta -o RayOutput
ls RayOutput/Contigs.fasta
ls RayOutput/Scaffolds.fasta
ls RayOutput/
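
Once the run has finished, a quick sanity check is to count the sequences in the output files. The sketch below relies only on the fact that each FASTA record starts with a '>' header line. The function name count_fasta_records is hypothetical, and the paths assume the -o RayOutput option used above.

```shell
# Count FASTA records in a file: one header line starting with '>' per record.
# Hypothetical helper, not shipped with Ray.
count_fasta_records() {
    grep -c '^>' "$1"
}

for f in RayOutput/Contigs.fasta RayOutput/Scaffolds.fasta; do
    if [ -f "$f" ]; then
        echo "$f: $(count_fasta_records "$f") sequences"
    else
        echo "$f: not found (has the run finished?)"
    fi
done
```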

Using a configuration file

Ray can be run with a configuration file instead.

mpiexec -n 16 Ray Ray.conf

Content of Ray.conf:

-k 31 # this is a comment
-p
lib1.left.fasta
lib1.right.fasta

-p
lib2.left.fasta
lib2.right.fasta

-o RayOutput
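
With many libraries, writing the configuration file by hand becomes tedious. The following sketch prints a file in the layout shown above. The function write_ray_conf is hypothetical, and it assumes that every library follows the prefix.left.fasta / prefix.right.fasta naming used in this example.

```shell
# Print a Ray configuration file.
# Arguments: k-mer length, output directory, then one or more library prefixes.
# Hypothetical helper; assumes files are named <prefix>.left.fasta and
# <prefix>.right.fasta.
write_ray_conf() {
    k="$1"
    out="$2"
    shift 2
    echo "-k $k"
    for lib in "$@"; do
        # One option or file name per line, followed by a blank line
        printf '%s\n' "-p" "$lib.left.fasta" "$lib.right.fasta" ""
    done
    echo "-o $out"
}

write_ray_conf 31 RayOutput lib1 lib2
```

Redirect the output to a file (for example, > Ray.conf) and pass that file to Ray as shown above.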

Output files

RayOutput/Contigs.fasta and RayOutput/Scaffolds.fasta

type Ray -help for a full list of options and outputs

Color space

Ray assembles color-space reads and generates color-space contigs.
Files must have the .csfasta extension. Nucleotide reads cannot be mixed
with color-space reads. This is an experimental feature.

Publications

http://denovoassembler.sf.net/publications.html

Code

Code documentation

cd code
doxygen DoxygenConfigurationFile
cd DoxygenDocumentation/html
firefox index.html

Useful links

Cloud computing

Message-passing interface

Funding

Doctoral Award to S.B., Canadian Institutes of Health Research (CIHR)

Authors

see AUTHORS

Compile Ray on Microsoft Windows with Microsoft Visual Studio

see Documentation/VISUAL_STUDIO.txt