INTRODUCTION
=============
Qudaich (queries and unique database alignment inferred by clustering homologs) is a software package for aligning sequences. It generates pairwise local alignments between a query dataset and a database. Qudaich is designed primarily for datasets from next-generation sequencing. These datasets generally contain hundreds of thousands of sequences or more, and the input database is also likely to contain a large number of sequences. The algorithmic structure of qudaich imposes no absolute limit on read length, but the current version accepts only reads shorter than 2000 bp. Qudaich can be used to align DNA, translated DNA and protein sequences.

Qudaich performs local sequence alignments in two steps:

1) Identify the candidate database sequences for each query sequence. A candidate database sequence is a database sequence that gives the best alignment, or an alignment close to the best, with the corresponding query sequence.

2) Generate the optimal alignments between the query sequences and their candidate database sequences using the Smith-Waterman-Gotoh algorithm.
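
To illustrate step 2, here is a minimal sketch of Smith-Waterman local alignment scoring with Gotoh's affine-gap recurrences. It is not taken from the qudaich source. The names Scoring and local_alignment_score are hypothetical, the default values simply mirror the -match, -mismatch, -gap_open and -gap_ext defaults listed under QUICK START, and the gap cost convention (a gap of length k costs gap_open + k * gap_ext) is an assumption, since this README does not state which convention qudaich uses. The sketch returns only the best score, not the alignment itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scoring parameters; the defaults mirror the README's
// -match, -mismatch, -gap_open and -gap_ext defaults.
struct Scoring {
    int match = 1;
    int mismatch = -3;
    int gap_open = -1;
    int gap_ext = -2;
};

// Best local alignment score between query q and database sequence d.
// Assumption: a gap of length k costs gap_open + k * gap_ext.
int local_alignment_score(const std::string& q, const std::string& d,
                          const Scoring& s = Scoring()) {
    assert(s.gap_open <= 0 && s.gap_ext <= 0);  // penalties are non-positive
    const int NEG = -1000000000;                // stands in for minus infinity
    const std::size_t m = d.size();
    std::vector<int> H(m + 1, 0);    // H[j]: best score of an alignment ending at (i, j)
    std::vector<int> E(m + 1, NEG);  // E[j]: best score ending with a gap that skips q[i-1]
    int best = 0;
    for (std::size_t i = 1; i <= q.size(); ++i) {
        int diag = 0;  // holds H[i-1][j-1]
        int F = NEG;   // best score ending with a gap that skips d[j-1]
        for (std::size_t j = 1; j <= m; ++j) {
            // Extend an existing gap or open a new one (H[j] is still row i-1 here).
            E[j] = std::max(E[j] + s.gap_ext, H[j] + s.gap_open + s.gap_ext);
            // Same for the other gap direction (H[j-1] is already row i).
            F = std::max(F + s.gap_ext, H[j - 1] + s.gap_open + s.gap_ext);
            int sub = diag + (q[i - 1] == d[j - 1] ? s.match : s.mismatch);
            diag = H[j];  // remember H[i-1][j] for the next column
            // Local alignment: a score never drops below zero.
            H[j] = std::max({0, sub, E[j], F});
            best = std::max(best, H[j]);
        }
    }
    return best;
}
```

Under these assumptions, local_alignment_score("ACGT", "ACGT") is 4 (four matches), and aligning "AAAAACCCCC" against "AAAAAGCCCCC" gives 7 (ten matches minus a single-base gap costing 3). The sketch keeps only one row of each matrix, so memory grows with the database sequence length rather than with the product of the two lengths.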

Qudaich was written by:

Sajia Akhter, Ph.D.
Edwards Bioinformatics Lab (http://edwards.sdsu.edu/research/)
Computational Science Research Center (http://www.csrc.sdsu.edu/csrc/)
San Diego State University (http://www.sdsu.edu/)

COPYRIGHT
=========

Qudaich is Copyright 2010-2013 Sajia Akhter. 


INSTALLATION
=============
 
1. Uncompress the distribution Qudaich_*.zip
2. % make
3. % make -C source/

*** Qudaich is written in C/C++, so it requires gcc (the GNU project C and C++ compiler), version 4.4.1 or later.


QUICK START
============

To find the candidate database sequences:

% ./qudaich_search_db options

Options
-------
-query                      Name of the query file (Required)
-ref                        Name of database file (Required)
-prog                       Alignment options (Required): n (nucleotide),
                                                          p (protein),
                                                          trn (translated nucleotide)
-top                        Number of alignments per query sequence (default 1)
-freqFile                   Name of the output frequency file (default freq.txt)
-hypo                       Hypothesis options: 1 (default) or 2
-h                          Show command line options


To generate the optimal alignments:

% ./qudaich_alignment options

Options
-------
-f                           Options: all = generate alignments for all query sequences
                                      avg (default) = generate alignments for those query 
                                                      sequences whose frequency or sum(lcp) 
                                                      >= average of all query sequences
                                      an integer value = generate alignments for those query 
                                                      sequences whose frequency or sum(lcp) 
                                                      >= given integer value
-freqFile                    Name of the frequency file (default: freq.txt; this is the output file generated by ./qudaich_search_db)
-output                      Name of output file (default: output_qudaich.txt)
-match                       Match weight (default 1)
-mismatch                    Mismatch penalty (default -3)
-gap_open                    Gap opening penalty (default -1)
-gap_ext                     Gap extension penalty (default -2)
-h                           Show command line options
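
The three settings of -f described above can be sketched as a simple threshold filter. This is an illustration only, not code from qudaich. The function name select_queries is hypothetical, the input is assumed to be one frequency (or sum(lcp)) value per query already read into memory because the layout of freq.txt is not described here, and the average is assumed to be computed in floating point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the indices of the queries that would be aligned under a given
// -f setting. mode is "all", "avg", or an integer written as a string.
// Hypothetical helper: freq holds one frequency or sum(lcp) value per query.
std::vector<std::size_t> select_queries(const std::vector<long>& freq,
                                        const std::string& mode = "avg") {
    assert(!mode.empty());
    std::vector<std::size_t> kept;
    if (freq.empty()) return kept;

    bool keep_all = (mode == "all");
    double threshold = 0.0;
    if (mode == "avg") {
        // Assumption: the average is the arithmetic mean over all queries.
        double sum = 0.0;
        for (long f : freq) sum += static_cast<double>(f);
        threshold = sum / static_cast<double>(freq.size());
    } else if (!keep_all) {
        // Any other value is treated as an integer threshold.
        threshold = static_cast<double>(std::stol(mode));
    }

    for (std::size_t i = 0; i < freq.size(); ++i) {
        if (keep_all || static_cast<double>(freq[i]) >= threshold) {
            kept.push_back(i);
        }
    }
    return kept;
}
```

For the values {2, 8, 5, 5}, the average is 5, so the default setting keeps the queries at positions 1, 2 and 3, while a threshold of 6 keeps only the query at position 1.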

Source: README.txt, updated 2013-08-19