A command-line utility for manipulating biological sequences in a FASTA or FASTQ file. Given a list of identifiers, it extracts only the matching subset of sequences, or the complement (i.e., the sequences NOT in the list). It can also extract only sequence number N. Compressed sequence files are supported if zcat can read them.
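
To illustrate what fetching sequence number N from a possibly compressed file involves, here is a minimal Python sketch. It is not selectseq's own code (selectseq is written in Perl and relies on zcat). The function names `open_maybe_gzip`, `read_fasta`, and `nth_record` are hypothetical, only FASTA is handled, and 1-based numbering is an assumption.

```python
import gzip


def open_maybe_gzip(path):
    """Open a text file, decompressing it if it starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fasta(handle):
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def nth_record(records, n):
    """Return record number n (assumed 1-based), or None if there are fewer records."""
    for i, rec in enumerate(records, start=1):
        if i == n:
            return rec
    return None
```

Because `read_fasta` is a generator, `nth_record` stops reading as soon as record N has been produced, which matters for large files.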

Features

  • collect only some sequences out of a large FASTA or FASTQ file
  • get sequence number N only, regardless of ID
  • complement mode: return all sequences that are NOT in the list of IDs
  • "matching" mode: choose which part (between | characters) of the ID should match
  • sequence names provided one per line in a text file (first word in line used, or whatever is given to the -k option)
  • the > and @ symbols are ignored if present in the beginning of IDs in the list (useful if using FASTA or FASTQ identifiers)
  • if only one sequence is needed, its ID can be given directly to the -l option (no file needed)
  • add a suffix to IDs before searching (useful when IDs come from proteins that have _1 in the ID, but genes do not)
  • compressed sequence database files (-s) are supported
  • quiet mode: output only important warnings and errors
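
The selection features above (ID list, complement mode, matching on a |-separated part of the ID, and ID suffixes) can be sketched in Python as follows. This is an illustration under stated assumptions, not selectseq's implementation. The names `load_ids`, `record_key`, and `select` are hypothetical, and how selectseq numbers the | fields or applies the suffix is not stated on this page, so a 1-based field index and a plain string append are assumed.

```python
def load_ids(lines, suffix=""):
    """Collect IDs: first word of each line, leading '>' or '@' dropped."""
    ids = set()
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        ident = words[0].lstrip(">@")
        if ident:
            ids.add(ident + suffix)  # assumption: suffix is simply appended
    return ids


def record_key(header, field=None):
    """Return the part of a record header that is compared with the ID list.

    The record ID is taken as the first word of the header. If `field` is
    given (assumed 1-based), only that '|'-separated part of the ID is used.
    """
    words = header.split()
    ident = words[0] if words else ""
    if field is None:
        return ident
    parts = ident.split("|")
    return parts[field - 1] if 1 <= field <= len(parts) else None


def select(records, ids, complement=False, field=None):
    """Yield (header, sequence) records in `ids`, or those NOT in it if complement."""
    for header, seq in records:
        wanted = record_key(header, field) in ids
        if wanted != complement:
            yield header, seq
```

Storing the IDs in a set keeps each lookup constant-time, so the sequence file is read once regardless of how long the ID list is.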

Categories

Bio-Informatics

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

Perl

Related Categories

Perl Bio-Informatics Software

Registered

2011-05-20