### RAFTS3: Rapid Alignment Free Tool for Sequences Similarity Search
RAFTS3 is available for 64-bit Linux and 64-bit Windows as a standalone
application (MCR required, see Installation). To use the functions in
MATLAB, the Bioinformatics Toolbox and a MEX compiler are needed.
Copyright © 2013 Ricardo A. Vialle, Fábio O. Pedrosa, Vinicius A. Weiss,
Dieval Guizelini, Juliana H. Tibaes, Jeroniza N. Marchaukoski, Emanuel
M. de Souza and Roberto T. Raittz. All rights reserved.
Please report bugs, problems and suggestions to
ricardovialle@gmail.com.
---------------
###1. Introduction###
RAFTS3 can perform high-speed protein search comparisons locally using a
desktop computer or laptop. RAFTS3 performs searches much faster than
BLASTp against large protein databases such as NR and Pfam, with a small
loss of sensitivity that depends on the degree of similarity between the
sequences. RAFTS3 is a new alternative for fast comparison of protein
sequences, genome annotation and biological data mining.
RAFTS3 utilizes a filter step for candidate selection based on shared
k-mers and a comparison measure using a binary matrix of co-occurrence
of amino acid residues.
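As a rough illustration of the idea behind the filter step, the Python sketch below indexes database sequences by their k-mers and counts how many k-mers each one shares with a query. It is not the RAFTS3 implementation: the function names and the toy sequences are hypothetical, every k-mer is indexed (RAFTS3 selects a subset per sequence), and the BCOM comparison is not shown.

```python
from collections import defaultdict

def kmers(seq, k=6):
    """Return the set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def build_index(database, k=6):
    """Map each k-mer to the set of sequence ids that contain it."""
    index = defaultdict(set)
    for seq_id, seq in database.items():
        for kmer in kmers(seq, k):
            index[kmer].add(seq_id)
    return index

def candidates(query, index, k=6):
    """Count the k-mers shared between the query and each indexed sequence."""
    shared = defaultdict(int)
    for kmer in kmers(query, k):
        for seq_id in index.get(kmer, ()):
            shared[seq_id] += 1
    return dict(shared)

# Toy database (made-up sequences, for illustration only).
database = {
    "seqA": "MKTAYIAKQRQISFVKSHFSRQ",
    "seqB": "MSDNELLKKAGGWWPPLLRRTT",
}
index = build_index(database)
hits = candidates("TAYIAKQRQISF", index)
```

Only sequences that appear in `hits` would go on to a more expensive comparison, which is what makes a filter of this kind fast.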
###2. Installation###
####2.1. Requirements####
Before downloading and installing RAFTS3, make sure that your computer
or server fulfills the following requirements:
* UNIX or Windows platform;
* MCR (MATLAB Component Runtime).
####2.1.1. MCR (MATLAB Component Runtime)####
RAFTS3 requires the MATLAB Component Runtime to run. It can be
downloaded from the RAFTS3 project page on SourceForge
<https://sourceforge.net/projects/rafts3/files/>. To install the MCR,
follow the steps for your platform:
For Linux:
* Extract and install MCR (MATLAB Component Runtime):
> tar -xvzf MCR_LINUX64b.tar
> cd MCR_LINUX64b
> ./installMCR.sh <MCR absolute destination path>
For Windows:
* Extract and install MCR (MATLAB Component Runtime):
    * Unzip MCR_WIN64b.zip
    * Double-click MCR_WIN64b.exe
    * Follow the installation instructions
    * Add the installed folder to the PATH environment variable:
      "yourpath\MATLAB Compiler Runtime\v717\runtime\win64"
    * Restart your computer
####2.2. Downloading RAFTS3####
RAFTS3 is available at <https://sourceforge.net/projects/rafts3/files/>.
After downloading the file, follow the instructions below for your
platform.
**For Linux:**
* Extract RAFTS3:
> tar -xvzf RAFTS3_LINUX64b.tar.gz
> cd RAFTS3_LINUX64b
* Add execute permission to the files:
> chmod u+x run_makerafts3db.sh makerafts3db run_rafts3.sh rafts3 run_rafts3x.sh rafts3x
**For Windows:**
* Unzip RAFTS3_WIN64b.zip
###3. RAFTS3 usage###
Before you can search against a database, the database must be
formatted. Use MAKERAFTS3DB for this.
####3.1. MAKERAFTS3DB####
MAKERAFTS3DB takes a multi-fasta file as input, for example the NCBI NR
database. To run it, follow the instructions below:
**For Linux:**
./run_makerafts3db.sh <MCR installation folder> database
[-num_inds int_value] [-kmer_size int_value] [-limit int_value]
[-out output_file]
**For Windows:**
makerafts3db.exe database [-num_inds int_value] [-kmer_size int_value]
[-limit int_value] [-out output_file]
#####3.1.1 MAKERAFTS3DB parameters#####
The table below describes each available parameter for MAKERAFTS3DB.
+--------------+------------------------------------------------------+
| Parameter    | Description                                          |
|--------------+------------------------------------------------------|
| *database*   | is the database, a fasta or multi-fasta formatted    |
|              | file with protein sequences.                         |
| *-num_inds*  | specifies the number of k-mers randomly selected per |
|              | sequence. Default is 10.                             |
| *-kmer_size* | is the length k of the k-mer. Default is 6.          |
| *-limit*     | is the length of the sequence used to select the     |
|              | k-mers. Default is 120.                              |
| *-out*       | specifies the output file name pattern. Default is   |
|              | <File_In>.mat                                        |
+--------------+------------------------------------------------------+
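To make the interaction between `-num_inds`, `-kmer_size` and `-limit` more concrete, here is a small Python sketch of one plausible reading of those parameters. It assumes that `-limit` refers to the leading residues of each sequence and that k-mers are drawn uniformly without replacement; neither detail is confirmed by this document, and `sample_kmers` is a hypothetical helper, not part of RAFTS3.

```python
import random

def sample_kmers(sequence, num_inds=10, kmer_size=6, limit=120, rng=None):
    """Pick up to num_inds random k-mers from the first `limit` residues.

    Assumption: `limit` is taken from the start of the sequence.
    """
    rng = rng or random.Random()
    region = sequence[:limit]
    # Every valid start position of a k-mer inside the region.
    positions = range(len(region) - kmer_size + 1)
    chosen = rng.sample(positions, min(num_inds, len(positions)))
    return [region[p:p + kmer_size] for p in sorted(chosen)]

# Toy sequence of 200 residues (made up for illustration).
sequence = "ACDEFGHIKLMNPQRSTVWY" * 10
picked = sample_kmers(sequence, rng=random.Random(0))
```

With the defaults, at most 10 k-mers of length 6 are kept per sequence, all taken from its first 120 residues. A sequence shorter than `kmer_size` yields no k-mers.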
####3.2. RAFTS3####
RAFTS3 searches protein sequences against a formatted database. To run
it, follow the instructions below:
**For Linux:**
./run_rafts3.sh <MCR installation folder> input_file makerafts3db_file
[-num_correlation int_value] [-max_target_seqs int_value]
[-num_alignments int_value] [-out output_file] [-outfmt format]
[-max_headers_length int_value]
**For Windows:**
rafts3.exe input_file makerafts3db_file [-num_correlation int_value]
[-max_target_seqs int_value] [-num_alignments int_value]
[-out output_file] [-outfmt format] [-max_headers_length int_value]
#####3.2.1 RAFTS3 parameters#####
The table below describes each available parameter for RAFTS3.
+-----------------------+---------------------------------------------+
| Parameter             | Description                                 |
|-----------------------+---------------------------------------------|
| *input_file*          | is a fasta or multi-fasta formatted file    |
|                       | with protein sequences.                     |
| *makerafts3db_file*   | is the .mat file created with makerafts3db. |
| *-max_target_seqs*    | specifies how many hits are shown for each  |
|                       | query. The default is 1.                    |
| *-num_correlation*    | specifies, for each hit, how many BCOM      |
|                       | correlations are calculated. The default is |
|                       | 50.                                         |
| *-num_alignments*     | specifies how many Smith-Waterman alignments|
|                       | are made for the hits found. The default is |
|                       | 1.                                          |
| *-out*                | specifies the output file name. Default is  |
|                       | rafts3_output.txt. If 'stdout' is specified,|
|                       | results are printed to the screen.          |
| *-outfmt*             | specifies output formatting options. Options|
|                       | available are 'blast-like' and 'rafts3'.    |
|                       | Default is 'blast-like'.                    |
| *-max_headers_length* | specifies the maximum length of query and   |
|                       | subject headers shown. Default is 22.       |
+-----------------------+---------------------------------------------+
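When RAFTS3 is driven from a script, it can help to assemble and validate the command line before running it. The Python sketch below builds the argument list for the Linux wrapper `run_rafts3.sh` using only the flags listed in the table above. The helper name `rafts3_command`, the MCR path and the file names are hypothetical placeholders.

```python
ALLOWED_OPTIONS = (
    "num_correlation", "max_target_seqs", "num_alignments",
    "out", "outfmt", "max_headers_length",
)
OUTPUT_FORMATS = ("blast-like", "rafts3")

def rafts3_command(mcr_root, input_file, db_file, **options):
    """Build the argument list for the Linux wrapper script run_rafts3.sh."""
    unknown = set(options) - set(ALLOWED_OPTIONS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    if options.get("outfmt", "blast-like") not in OUTPUT_FORMATS:
        raise ValueError("outfmt must be 'blast-like' or 'rafts3'")
    command = ["./run_rafts3.sh", mcr_root, input_file, db_file]
    for name in ALLOWED_OPTIONS:  # iterate in a fixed order for stable output
        if name in options:
            command += [f"-{name}", str(options[name])]
    return command

# Hypothetical paths and file names, for illustration only.
command = rafts3_command(
    "/opt/mcr/v717", "queries.fasta", "nr.fasta.mat",
    max_target_seqs=5, outfmt="rafts3",
)
```

The resulting list can be passed to `subprocess.run(command)` from the directory that contains the RAFTS3 scripts.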
#####3.2.2 RAFTS3 output#####
Two output formats are available for RAFTS3, 'blast-like' and 'rafts3'.
Each one is explained below.
blast-like - Output similar to the BLAST tabular output, with alignment
information. The columns are:
* Query
* Subject
* Identity
* Align length
* Mismatches
* Gap Opening
* Query Start Position
* Query End Position
* Subject Start Position
* Subject End Position
* E-Value
* BitScore
* Query Coverage
* Subject Coverage
* Smith-Waterman Score
* Relative Score
* BCOM score
* BCOM correlation
rafts3 - Tabular output without alignment information. The columns are:
* Query
* Subject
* BCOM score
* BCOM correlation
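The column lists above map naturally onto a small parser. The Python sketch below reads either format into a list of dictionaries. It assumes the output is tab-separated with no header row, which this document does not state explicitly; the column keys and the sample line are made up for illustration.

```python
import csv
import io

BLAST_LIKE_COLUMNS = [
    "query", "subject", "identity", "align_length", "mismatches",
    "gap_opening", "query_start", "query_end", "subject_start",
    "subject_end", "e_value", "bit_score", "query_coverage",
    "subject_coverage", "sw_score", "relative_score", "bcom_score",
    "bcom_correlation",
]
RAFTS3_COLUMNS = ["query", "subject", "bcom_score", "bcom_correlation"]

def read_hits(handle, outfmt="blast-like"):
    """Parse RAFTS3 tabular output into a list of dicts (values kept as text).

    Assumption: fields are tab-separated and there is no header row.
    """
    columns = BLAST_LIKE_COLUMNS if outfmt == "blast-like" else RAFTS3_COLUMNS
    reader = csv.reader(handle, delimiter="\t")
    return [dict(zip(columns, row)) for row in reader if row]

# Made-up sample line in the 'rafts3' format.
sample = io.StringIO("query1\tsubjectA\t0.91\t0.87\n")
hits = read_hits(sample, outfmt="rafts3")
```

For a real run, `handle` would be the open output file, for example the default `rafts3_output.txt`.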
####3.3. RAFTS3X####
RAFTS3X searches a protein database using a translated nucleotide
query. Its parameters and outputs are the same as those of regular
RAFTS3 for protein sequences.
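For readers unfamiliar with translated searches, the Python sketch below shows what translating a nucleotide query into protein sequences can look like. It assumes the standard genetic code and a six-frame translation; this document does not say which frames or which genetic code RAFTS3X uses, so treat `translate` and `six_frames` as hypothetical helpers rather than a description of RAFTS3X internals.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna, frame=0):
    """Translate one reading frame; unknown codons become 'X', stops '*'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def six_frames(dna):
    """Return three forward frames followed by three reverse-complement frames."""
    reverse = dna.upper().translate(COMPLEMENT)[::-1]
    return [translate(strand, frame)
            for strand in (dna, reverse)
            for frame in range(3)]

frames = six_frames("ATGGCCTAA")
```

Each translated frame can then be treated like an ordinary protein query.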
####3.4. Instructions for use of functions in MATLAB####
To search against a protein sequence database, first run the function
'makerafts3db.m'. The structure it returns is needed to perform the
searches with the function 'rafts3.m' for a protein query sequence, or
'rafts3x.m' for a translated nucleotide query. The formatted structure
can be passed directly as a function parameter for searches, which
avoids loading it again. MEX files are compiled from the .c files
during the first execution.
For more information about each function and its parameters, use the
MATLAB help command.
###4. Troubleshooting###
If you experience problems with the mcc-generated shell scripts on
Linux, try setting the LD_LIBRARY_PATH environment variable so that the
system knows where to find the MATLAB Compiler Runtime libraries. To do
this, copy and paste the csh and/or bash instructions below into the
.cshrc or .bashrc file, respectively, in your home directory. Don't
forget to run source .cshrc or source .bashrc at the csh or bash prompt
if you want the changes to take effect immediately in the current
shell. After that, you can run the applications directly from the
executable files, without specifying the MCR path.
    # insert the following to your .cshrc file
    # source .cshrc to make the additions effective for current session
    # points to system default version of matlab
    set MCRROOT=<MCR installation folder>
    set LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64
    set LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64
    set LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64
    set MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64
    set LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads
    set LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server
    set LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client
    set LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}
    setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}
    setenv XAPPLRESDIR ${MCRROOT}/X11/app-defaults
    # insert the following to your .bashrc file
    # source .bashrc to make the additions effective for current session
    # points to system default version of matlab
    MCRROOT=<MCR installation folder>
    LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64;
    LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64;
    LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
    MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64;
    LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads;
    LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server;
    LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client;
    LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE};
    XAPPLRESDIR=${MCRROOT}/X11/app-defaults;
    export LD_LIBRARY_PATH;
    export XAPPLRESDIR;
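If you prefer not to edit your shell startup files, the same variables can be set per process from a launcher script. The Python sketch below builds an environment dictionary with the same directories, in the same order, as the shell snippets above. The helper name `mcr_environment` and the path `/opt/mcr/v717` are hypothetical; substitute your own MCR installation folder.

```python
import os

def mcr_environment(mcr_root, base_env=None):
    """Return a copy of the environment with the MCR library paths set."""
    env = dict(os.environ if base_env is None else base_env)
    jre = f"{mcr_root}/sys/java/jre/glnxa64/jre/lib/amd64"
    library_dirs = [
        ".",
        f"{mcr_root}/runtime/glnxa64",
        f"{mcr_root}/bin/glnxa64",
        f"{mcr_root}/sys/os/glnxa64",
        f"{jre}/native_threads",
        f"{jre}/server",
        f"{jre}/client",
        jre,
    ]
    # Like the shell snippets, this replaces any existing LD_LIBRARY_PATH.
    env["LD_LIBRARY_PATH"] = ":".join(library_dirs)
    env["XAPPLRESDIR"] = f"{mcr_root}/X11/app-defaults"
    return env

# Hypothetical MCR location; an empty base keeps the example deterministic.
env = mcr_environment("/opt/mcr/v717", base_env={})
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching one of the RAFTS3 executables.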
###5. License###
**IMPORTANT! READ CAREFULLY: THIS IS A LEGAL AGREEMENT** RAFTS3: Rapid
Alignment Free Tool for Sequences Similarity Search software Copyright ©
2013 Ricardo A. Vialle, Fábio O. Pedrosa, Vinicius A. Weiss, Dieval
Guizelini, Juliana H. Tibaes, Jeroniza N. Marchaukoski, Emanuel M. de
Souza and Roberto T. Raittz. All rights reserved.
RAFTS3 software development: Ricardo A. Vialle*, Fábio O. Pedrosa**,
Vinicius A. Weiss**, Dieval Guizelini*, Juliana H. Tibaes*, Jeroniza
N. Marchaukoski*, Emanuel M. de Souza** and Roberto T. Raittz*
* Laboratory of Bioinformatics
Federal University of Parana Department of Technological Education
Rua Dr. Alcides Vieira Arcoverde, 1225, Jardim das Américas
Curitiba PR Brazil
** Graduate Program in Science-Biochemistry
Federal University of Parana Department of Biological Sciences
Av Coronel Francisco Heráclito dos Santos, 210, Jardim das Américas
Curitiba PR Brazil
The RAFTS3 software is only available for download from
https://sourceforge.net/projects/rafts3/
By installing RAFTS3, you are agreeing to be bound by this agreement
("Agreement"). If you do not agree with all of the terms of this
Agreement, reject the Agreement by refusing to install the RAFTS3
software.
**1) Definitions**
**1.1)** "DEVs" means the developers and copyright owners of RAFTS3
software, as defined at the beginning of this License Agreement.
**1.2)** "Software" means all software and materials provided to you by
DEVs as part of the RAFTS3 program, excluding the MATLAB Compiler
Runtime (MCR).
**2) Ownership**
**2.1)** The Software is owned and copyrighted by DEVs. The MATLAB
Compiler Runtime (MCR) is provided royalty-free with the MATLAB Compiler
by The MathWorks, Inc. ("MathWorks"), and is redistributed by DEVs
subject to the terms of a license agreement between DEVs and MathWorks.
You may not directly or indirectly sell, lease, rent, redistribute,
license, sublicense, lend, give, or transfer the MATLAB Compiler Runtime
(MCR). This Agreement confers no title or ownership in the Software and
is not a sale of any rights in the Software.
**2.2)** This Agreement does not grant you and/or any person(s) acting
with you or for you, any rights or license with respect to the source
code of the Software.
**3) Disclaimer of Warranties & Limitation of Liability**
THE SOFTWARE AND THE MATLAB COMPILER RUNTIME (MCR) ARE PROVIDED BY DEVs
"AS IS" WITHOUT A WARRANTY OF ANY KIND, AND ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
NO EVENT SHALL DEVs BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE AND THE MATLAB COMPILER RUNTIME (MCR), EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE. NO WARRANTY IS MADE THAT THE SOFTWARE WILL
MEET YOUR REQUIREMENTS.
**4) Indemnification**
You agree to fully indemnify and hold harmless DEVs from and against any
and all claims, liabilities, demands, suits, losses, damages, costs,
settlement amounts, and/or expenses, including but not limited to
attorneys' fees, arising out of your use of the Software.
The Software uses, with or without modifications, some custom MATLAB
functions ("custom functions") submitted by their copyright owners,
and under the BSD LICENSE, to The MathWorks File Exchange repository;
these custom functions are:
onbits, Copyright (c) 2008, James Tursa. All rights reserved.
prcorr2, Copyright (c) 2001-2002, Peter Rydesäter. All rights reserved.
The following BSD LICENSE applies to these custom functions only, referred
to as "THIS SOFTWARE":
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.