Analyzing poor data takes CPU time and interpreting the results from poor data takes people time, so it's always important to make a pre-processing.

Let me call my script as “Sequence_cleaner” and the big idea is to remove duplicate sequences, remove too short sequences ( the user defines the minimum length) and remove sequences which have too many unknown nucleotides (N) ( the user defines the % of N is allows ) and in the end the user can choose if he/she wants to have a file as output or print the result.

Project Activity

See All Activity >

Follow Sequence Cleaner

Sequence Cleaner Web Site

Other Useful Business Software
Gemini 3 and 200+ AI Models on One Platform Icon
Gemini 3 and 200+ AI Models on One Platform

Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
Start Free
Rate This Project
Login To Rate This Project

User Ratings

★★★★★
★★★★
★★★
★★
0
0
1
0
0
ease 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 0 / 5
features 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 0 / 5
design 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 0 / 5
support 1 of 5 2 of 5 3 of 5 4 of 5 5 of 5 0 / 5

User Reviews

  • Hi, to keep your computer performance better, delete your all duplicate file by using the utility program from DuplicateFilesDeleter.com. It works fast.
Read more reviews >

Additional Project Details

Registered

2014-06-21