Home
Name Modified Size InfoDownloads / Week
duk.tar 2011-03-07 122.9 kB
README 2011-03-07 2.9 kB
reference.fa 2011-03-07 186 Bytes
seqfiles.cpp 2011-03-04 6.4 kB
seqfiles.h 2011-03-04 3.1 kB
parseOpts.cpp 2011-03-04 4.0 kB
poisson.cpp 2011-03-04 292 Bytes
sample.fastq 2011-03-04 4.4 kB
kmerhash.cpp 2011-03-04 4.6 kB
kmerhash.h 2011-03-04 2.1 kB
Makefile 2011-03-04 287 Bytes
kmercnt.cpp 2011-03-04 1.6 kB
kmercnt.pl 2011-03-04 793 Bytes
kmercoder.cpp 2011-03-04 2.1 kB
kmercoder.h 2011-03-04 501 Bytes
duk 2011-03-04 44.5 kB
duk.cpp 2011-03-04 8.3 kB
ContamRdsSim.pl 2011-03-04 5.9 kB
testkmerhash.cpp 2011-03-04 676 Bytes
Totals: 19 Items   215.6 kB 0
duk source code
March 03 2011
Mingkun Li

1. Summary
   Duk is a fast, accurate,and efficent DNA sequence matching tool using Kmer hashhing method.

2. Requirement
   Duk should run in any 64 bit Linux/Unix machine with g++.

3. Compile Instruction
   Just type  make to compile the tool.
   >make
4. Manual

  duk [options]  ref.fa query

  Options:
     -o, -output  <file>       print log information to file, default is stdout.
     -n, -nomatch <file>       output the not matched reads to file,
                               the opton value - stands for standard output
     -m, -match <file>      output matched reads to file.
     -k, -kmer                 the k mer size,  default is 16.
     -s, -step                 the step size, default is 4.
     -c, -cutoff               the cut off threshold for matched reads, default is 1.
     -h, -help                 print out the help information


    Identify the reads in query file whether they match to ref.fa reads.
    The ref.fa must be in fasta format and the query can be in fastq or fasta format.
    If there is no query file, the tool gets reads from standard input

    k=16 and c=1 are recommended for small genome like bacteria or adapter removal tasks. 
    k=20 and c=2 are recommended for large genome such as plant.

   Example: 
    ./duk  Illumina.Artifacts.fa  sample.fastq

5. License
   The full DUK package is distributed under FreeBSD License. 

   Copyright (c) 2011  <Mingkun Li>
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:
   1. Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
   2. Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
   3. The name of the author may not be used to endorse or promote products
      derived from this software without specific prior written permission.

      THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
      IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
      OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
      IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
      INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
      NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
      DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
      THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
      (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
      THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.



Source: README, updated 2011-03-07