Name | Modified | Size | Downloads / Week
---|---|---|---
source | 2013-09-25 | |
run_search_db.cpp | 2013-09-25 | 9.9 kB |
run_alignment.cpp | 2013-09-25 | 5.9 kB |
Makefile | 2013-09-25 | 422 Bytes |
README.txt | 2013-08-19 | 3.6 kB |
Totals: 5 Items | | 19.9 kB | 0

INTRODUCTION
============

Qudaich (queries and unique database alignment inferred by clustering
homologs) is a software package for aligning sequences. It generates
pairwise local alignments between a query dataset and a database.
Qudaich is designed primarily for datasets from next-generation
sequencing. These datasets generally contain hundreds of thousands of
sequences or more, and the input database is also likely to contain a
large number of sequences. The algorithmic structure of qudaich imposes
no absolute limit on read length, but the current version accepts only
read lengths <2000 bp. Qudaich can align DNA, translated DNA and
protein sequences.

Qudaich performs local sequence alignment in two steps:

1) Identify the candidate database sequences for each query sequence.
   A candidate database sequence is a database sequence that gives the
   best alignment, or close to the best alignment, with the
   corresponding query sequence.

2) Generate the optimal alignments between the query sequences and
   their candidate database sequences using the Smith-Waterman-Gotoh
   algorithm.

Qudaich was written by:

Sajia Akhter, Ph.D.
Edwards Bioinformatics Lab (http://edwards.sdsu.edu/research/)
Computational Science Research Center (http://www.csrc.sdsu.edu/csrc/)
San Diego State University (http://www.sdsu.edu/)

COPYRIGHT
=========

Qudaich is Copyright 2010-2013 Sajia Akhter.

INSTALLATION
============

1. Uncompress the distribution Qudaich_*.zip
2. % make
3. % make -C source/

*** Qudaich is written in C/C++, so it requires gcc (the GNU project C
and C++ compiler), version 4.4.1 or later.

QUICK START
===========

To find the candidate database sequences:

% ./qudaich_search_db options

Options
-------
-query      Name of the query file (Required)
-ref        Name of database file (Required)
-prog       Alignment options (Required):
              n (nucleotide), p (protein), trn (translated nucleotide)
-top        Number of alignments per query sequence (default 1)
-freqFile   Frequency file name (default freq.txt)
-hypo       Hypothesis options: 1 (default) or 2
-h          Show command line options

To generate the optimal alignments:

% ./qudaich_alignment options

Options
-------
-f          Options:
              all = generate alignments for all query sequences
              avg (default) = generate alignments for those query
                sequences whose frequency or sum(lcp) >= average of
                all query sequences
              an integer value = generate alignments for those query
                sequences whose frequency or sum(lcp) >= given
                integer value
-freqFile   Name of the frequency file (default: freq.txt; this is
            the output file generated by ./qudaich_search_db)
-output     Name of output file (default: output_qudaich.txt)
-match      Match weight (default 1)
-mismatch   Mismatch penalty (default -3)
-gap_open   Gap opening penalty (default -1)
-gap_ext    Gap extension penalty (default -2)
-h          Show command line options
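
EXAMPLE: HOW THE SCORING OPTIONS ENTER AN AFFINE-GAP ALIGNMENT
==============================================================

The sketch below illustrates how the four scoring options of
qudaich_alignment (-match, -mismatch, -gap_open, -gap_ext) could enter
a Smith-Waterman-Gotoh style local alignment. It is not taken from the
qudaich source. The names Scoring and local_affine_score are
hypothetical, and the gap convention is an assumption: a gap of length
k is scored as gap_open + k * gap_ext, which may differ from what
qudaich actually does. The function returns only the best local score
and performs no traceback.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the scoring options; the defaults mirror
// the defaults listed for qudaich_alignment above.
struct Scoring {
    int match = 1;
    int mismatch = -3;
    int gap_open = -1;
    int gap_ext = -2;
};

// Best local alignment score with affine gap penalties (Gotoh recurrences).
// Assumption: a gap of length k scores gap_open + k * gap_ext.
int local_affine_score(const std::string& query, const std::string& db,
                       const Scoring& s) {
    const int NEG = -1000000000;  // stands in for minus infinity
    const std::size_t n = query.size();
    const std::size_t m = db.size();

    // h_prev / h_cur: best score of an alignment ending at the cell.
    // f[j]: best score ending at the cell with a gap in the database sequence.
    std::vector<int> h_prev(m + 1, 0), h_cur(m + 1, 0), f(m + 1, NEG);
    int best = 0;

    for (std::size_t i = 1; i <= n; ++i) {
        int e = NEG;  // best score ending at the cell with a gap in the query
        h_cur[0] = 0;
        for (std::size_t j = 1; j <= m; ++j) {
            // Either extend an existing gap or open a new one.
            e = std::max(e + s.gap_ext, h_cur[j - 1] + s.gap_open + s.gap_ext);
            f[j] = std::max(f[j] + s.gap_ext,
                            h_prev[j] + s.gap_open + s.gap_ext);

            const int diag = h_prev[j - 1] +
                (query[i - 1] == db[j - 1] ? s.match : s.mismatch);

            // Local alignment: a score never drops below zero.
            h_cur[j] = std::max({0, diag, e, f[j]});
            best = std::max(best, h_cur[j]);
        }
        std::swap(h_prev, h_cur);
    }
    return best;
}
```

Under these assumptions, calling local_affine_score(query, db, Scoring())
scores a pair with the default weights. Only two rows of the score
matrix are kept, so memory grows with the database sequence length
rather than with the product of both lengths.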