Analyzing poor-quality data wastes CPU time, and interpreting the results from it wastes people's time, so it is always worth pre-processing the data first.
The script is called "Sequence_cleaner". It removes duplicate sequences, sequences shorter than a user-defined minimum length, and sequences with too many unknown nucleotides (N), where the user defines the maximum percentage of N allowed. At the end, the user can choose to write the result to a file or print it.
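The three filters described above can be sketched roughly as follows. This is only an illustration using the Python standard library, not the project's actual code: the function names (`read_fasta`, `clean_sequences`, `format_fasta`), the parameter names, the FASTA input format, and the choice to join the headers of duplicate sequences with an underscore are all assumptions made for the example.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def clean_sequences(records, min_length=0, max_percent_n=100.0):
    """Map each kept sequence to the list of headers that carried it.

    A sequence is dropped if it is shorter than min_length or if more than
    max_percent_n percent of its bases are N. Duplicates collapse into one
    entry because the sequence itself is the dictionary key.
    """
    kept = {}
    for header, seq in records:
        seq = seq.upper()
        if not seq or len(seq) < min_length:
            continue
        percent_n = 100.0 * seq.count("N") / len(seq)
        if percent_n > max_percent_n:
            continue
        kept.setdefault(seq, []).append(header)
    return kept


def format_fasta(kept):
    """Render the kept sequences as FASTA text (assumed output layout)."""
    lines = []
    for seq, headers in kept.items():
        lines.append(">" + "_".join(headers))
        lines.append(seq)
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical input: one duplicate, one short read, one N-rich read.
example = ">a\nACGTACGT\n>b\nACGTACGT\n>c\nACG\n>d\nNNNNACGT\n"
kept = clean_sequences(read_fasta(io.StringIO(example)),
                       min_length=5, max_percent_n=10)
result = format_fasta(kept)
print(result, end="")
```

With this input, only the sequence shared by `a` and `b` survives. To get a file instead of printed output, the same `result` string could be written with `open(path, "w")`.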