selectseq


Get specific sequences from a multi-FASTA file.



A command-line utility for selecting biological sequences from a multi-FASTA file. Given a list of identifiers, it returns either the matching subset of sequences or its complement (i.e., the sequences NOT in the list). It can also return only sequence number N.
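To make the three selection modes concrete, here is a minimal Python sketch of the idea. It is not selectseq's own code: the function names (`read_fasta`, `select`, `nth`) are hypothetical, the ID is assumed to be the first word of the FASTA header, and sequence number N is assumed to be counted from 1.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def select(records: Iterable[Record], wanted_ids: Iterable[str],
           complement: bool = False) -> Iterator[Record]:
    """Keep records whose ID is in wanted_ids, or those that are NOT in it
    when complement is True. Assumption: the ID is the header's first word."""
    wanted = set(wanted_ids)
    for header, seq in records:
        parts = header.split()
        seq_id = parts[0] if parts else ""
        if (seq_id in wanted) != complement:
            yield header, seq


def nth(records: Iterable[Record], n: int) -> Record:
    """Return record number n (assumed 1-based), regardless of its ID."""
    for i, record in enumerate(records, start=1):
        if i == n:
            return record
    raise IndexError(f"fewer than {n} sequences in input")


# Small in-memory example standing in for a multi-FASTA file.
FASTA = """>seq1 first
ACGT
AC
>seq2 second
GGGG
>seq3 third
TTTT
""".splitlines()

subset = list(select(read_fasta(FASTA), ["seq1", "seq3"]))
others = list(select(read_fasta(FASTA), ["seq1", "seq3"], complement=True))
second = nth(read_fasta(FASTA), 2)
```

Because the records are streamed one at a time, this approach does not need to hold a large multi-FASTA file in memory; only the set of wanted IDs is kept.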



  • collect only some sequences out of a large multi-FASTA file
  • get sequence number N only, regardless of ID
  • complement mode: return all sequences that are NOT in the list of IDs
  • "matching" mode: choose which part (between | characters) of the ID should match
  • sequence names provided one per line in a text file (first word in line used, or whatever is given to the -k option)
  • the > symbol is ignored if it is present at the beginning of IDs in the list (useful when copying FASTA identifiers)
  • if only one sequence is needed, its ID can be given directly to the -l option (no file needed)
  • add a suffix to IDs before searching (useful when IDs come from proteins that have _1 in the ID, but genes do not)
  • compressed sequence database files (-s) are supported
  • quiet mode: output only important warnings and errors
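The ID-handling rules in the list above (first word of each list line, optional leading >, matching on one |-separated part, optional suffix) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not selectseq's implementation: `list_key`, `header_key`, and `build_wanted` are hypothetical names, the |-separated part is addressed by a 0-based index, and the behavior of the -k option is not modeled.

```python
from typing import Iterable, Optional, Set


def list_key(line: str) -> Optional[str]:
    """Lookup key for one line of the ID list: its first word,
    with a leading '>' dropped. Returns None for blank lines."""
    text = line.strip()
    if text.startswith(">"):
        text = text[1:]
    words = text.split()
    return words[0] if words else None


def header_key(header: str, field: Optional[int] = None) -> str:
    """ID of a FASTA header (without '>'): its first word, or only one
    |-separated part of that word when field is given.
    Assumption: field is a 0-based index."""
    parts = header.split()
    seq_id = parts[0] if parts else ""
    if field is None:
        return seq_id
    pieces = seq_id.split("|")
    return pieces[field] if 0 <= field < len(pieces) else ""


def build_wanted(lines: Iterable[str], suffix: str = "") -> Set[str]:
    """Set of IDs to search for, with an optional suffix appended to each
    (e.g. '_1' when the list holds gene IDs but the FASTA holds proteins)."""
    wanted = set()
    for line in lines:
        key = list_key(line)
        if key:
            wanted.add(key + suffix)
    return wanted


wanted = build_wanted([">geneA extra words", "geneB", ""], suffix="_1")
accession = header_key("gi|12345|ref|NP_000001.1| some protein", field=3)
plain = header_key("geneA_1 hypothetical protein")
```

A record would then be kept when `header_key(header, field) in wanted`, or when it is not in `wanted` in complement mode.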

