Name Modified Size Downloads / Week Status
Totals: 9 Items   741.5 kB 8
ready-to-run 2010-08-20 0
algorithm.pdf 2010-10-12 38.7 kB 11 weekly downloads
readme.txt 2010-08-20 1.4 kB 11 weekly downloads
hg18_chr1_150_e3.fastg 2010-08-20 105.7 kB 11 weekly downloads
hg18_chr1_150_e2.fastg 2010-08-20 105.5 kB 11 weekly downloads
hg18_chr1_150_e1.fastg 2010-08-20 105.2 kB 11 weekly downloads
hg18_chr1_150_e0.fastg 2010-08-20 105.9 kB 11 weekly downloads
longpat 2010-08-20 160.3 kB 11 weekly downloads
longpattern.exe 2010-08-20 118.8 kB 11 weekly downloads
Generalized pattern matching

A traditional pattern is AACGGGTGGTAAGGGAACC; the corresponding generalized pattern is defined as AACGGG[0-5]TGGTAAG[0-5]GGAACC.

Input:
1. -t text_file_list.txt
   The first line gives the number of files, and the following lines give the names of the files.
2. -p pattern_file.fastg
   A pattern looks like <E>P1(e1)[d1,D1]P2(e2)... Pc-1(ec-1)[dc-1,Dc-1]Pc(ec), where the Pi are strings and all the other variables are integers.
   * The delimiter symbols must be followed strictly.
3. -r seed length (k)
   Empirically, k = 11 or 12.
4. -o output prefix
   The prefix of the output files.

Usage:
./longpat -p hg18_chr1_50_e0.fastg -r 12 -t text_file_list.txt -o mytest

Output:
1. mytest_occ.txt
   The occurrences of the patterns that hit in the files of the file list.
2. mytest_unhit.fastg
   All the patterns that did not hit in the files of the file list.
3. log.txt
   Intermediate messages and error messages.
4. filename_r*.idx
   The coded k-mer index files, one built for every file in the file list.
   ** VERY IMPORTANT ** These files are not supposed to be modified by users.

Enjoy generalized pattern matching. Enquiries are welcome, and suggestions are most welcome.

Bing Ni (bni@cse.cuhk.edu.hk), Peter Lo (lylo@cse.cuhk.edu.hk)
Computer Science and Engineering
The Chinese University of Hong Kong
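To illustrate the generalized pattern notation used in the example at the top of this readme (AACGGG[0-5]TGGTAAG[0-5]GGAACC), here is a minimal Python sketch. It is not the longpat algorithm and does not use a k-mer index. It assumes that a gap [d-D] means "between d and D arbitrary characters, inclusive", it matches every segment exactly, and it ignores the error fields of the .fastg format. The names to_regex and find_occurrences are hypothetical helpers introduced only for this illustration.

```python
import re

# Matches a gap such as "[0-5]" in the simplified notation from this readme.
GAP = re.compile(r"\[(\d+)-(\d+)\]")


def to_regex(generalized):
    """Translate e.g. 'AACGGG[0-5]TGGTAAG' into a compiled regular expression.

    Assumption: a gap [d-D] stands for d to D arbitrary characters.
    """
    parts = []
    pos = 0
    for m in GAP.finditer(generalized):
        # Literal segment before the gap, matched exactly.
        parts.append(re.escape(generalized[pos:m.start()]))
        # The gap itself becomes a bounded wildcard.
        parts.append(".{%s,%s}" % (m.group(1), m.group(2)))
        pos = m.end()
    # Trailing literal segment after the last gap.
    parts.append(re.escape(generalized[pos:]))
    return re.compile("".join(parts))


def find_occurrences(generalized, text):
    """Return (start, end) offsets of non-overlapping matches in text."""
    return [(m.start(), m.end()) for m in to_regex(generalized).finditer(text)]


if __name__ == "__main__":
    pattern = "AACGGG[0-5]TGGTAAG[0-5]GGAACC"
    # The traditional pattern is the special case where both gaps are empty.
    print(find_occurrences(pattern, "AACGGGTGGTAAGGGAACC"))
```

A regular expression scan like this is only practical for short texts; the seed length option (-r) and the .idx files suggest that longpat itself relies on a k-mer index instead.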
Source: readme.txt, updated 2010-08-20