Universal-Mer is a k-mer counting tool that handles every possible value of k at once. Unlike typical k-mer counters, it summarizes exact counts for 1-mers through l-mers in a single run, where l is the length of the longest repeated substring in the input sequence; the maximum supported length is currently 100000. The program reports the exact number of repeated (freq > 1) and unique (freq = 1) mers, as well as the number of all possible substrings of the sequences, without discarding any low-frequency mers.
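To make the summary concrete, here is a minimal Python sketch of what "repeated and unique mers for every k up to l" means. It is a naive illustration only, not the algorithm Universal-Mer uses, and the function name `summarize_all_k` is hypothetical.

```python
from collections import Counter

def summarize_all_k(seq):
    """Naive illustration (hypothetical helper, not Universal-Mer's code).

    Returns {k: (distinct_repeated, distinct_unique)} for k = 1..l, where l is
    the length of the longest repeated substring of seq.
    """
    summary = {}
    for k in range(1, len(seq) + 1):
        # Count every k-mer, keeping low-frequency ones.
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        repeated = sum(1 for c in counts.values() if c > 1)
        if repeated == 0:
            # No k-mer repeats, so no longer mer can repeat either: k > l.
            break
        unique = sum(1 for c in counts.values() if c == 1)
        summary[k] = (repeated, unique)
    return summary

summary = summarize_all_k("ACGTACGA")
print(summary)  # {1: (3, 1), 2: (2, 3), 3: (1, 4)}
```

For the sequence ACGTACGA the longest repeated substring is ACG, so l = 3 and the summary stops there. This sketch takes quadratic time and memory, so it is only suitable for very short sequences.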
Input files can be in text or FASTA format. The input alphabet is currently limited to {A, C, G, T}. Text input files are treated as FASTA files without a description line. After counting completes and the database of k-mers for every possible k has been built, the user can choose any k from the menu to count a histogram, dump k-mers, query a substring, or summarize all possible k.
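The input rules can be sketched in Python as follows. This is an assumption-laden illustration, not Universal-Mer's parser: the function name `parse_sequences` is hypothetical, and both joining multi-line text into one sequence and raising an error on characters outside {A, C, G, T} are assumptions about behavior the description does not specify.

```python
def parse_sequences(lines):
    """Hypothetical parser: FASTA, or plain text treated as FASTA without a description line."""
    sequences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A description line starts a new record.
            if current:
                sequences.append("".join(current))
                current = []
        else:
            current.append(line)
    if current:
        sequences.append("".join(current))
    for seq in sequences:
        invalid = set(seq) - set("ACGT")
        if invalid:
            # Assumption: reject anything outside the supported alphabet.
            raise ValueError(f"unsupported characters: {sorted(invalid)}")
    return sequences

print(parse_sequences(["ACGT", "AC"]))                  # ['ACGTAC']
print(parse_sequences([">s1", "AC", "GT", ">s2", "TT"]))  # ['ACGT', 'TT']
```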
Features
- Summarize distinct repeated and unique 1-mers to l-mers, where l = length of the longest repeat
- Count a histogram of k-mers, where 1 <= k <= length of the input sequence
- Count a histogram of canonical k-mers, where 1 <= k <= length of the input sequence
- Dump k-mers, where 1 <= k <= length of the input sequence
- Dump canonical k-mers, where 1 <= k <= length of the input sequence
- Query a k-mer of any size k, where k <= 200000, in the input sequences
- Find the longest repeated substring of an input sequence.
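
As an illustration of the histogram features, the Python sketch below counts how many distinct k-mers occur at each frequency, with and without canonical form. It assumes the common convention that a canonical k-mer is the lexicographically smaller of a k-mer and its reverse complement; whether Universal-Mer uses exactly this definition is an assumption, and the names `canonical` and `kmer_histogram` are hypothetical.

```python
from collections import Counter

# Translation table for complementing A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement (assumed convention)."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)

def kmer_histogram(seq, k, use_canonical=False):
    """Return {frequency: number of distinct k-mers with that frequency}."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[canonical(kmer) if use_canonical else kmer] += 1
    return dict(sorted(Counter(counts.values()).items()))

print(kmer_histogram("ACGTACGA", 2))                      # {1: 3, 2: 2}
print(kmer_histogram("ACGTACGA", 2, use_canonical=True))  # {1: 2, 2: 1, 3: 1}
```

In the canonical case, GT is merged with its reverse complement AC, which is why one 2-mer reaches frequency 3.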