SimplyTheBlast: a small Perl tool to build gene presence/absence matrices over a set of FASTA-formatted genomes.

This code requires the following Perl modules:
Bio::SeqIO
Bio::Perl
Bio::Tools::Run::StandAloneBlast
Bio::Seq
Bio::Tools::Blast
Bio::DB::GenBank
Bio::DB::WebDBSeqI

and BLAST 2.2.28 (blastall and formatdb) installed and reachable from your command line.
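
The requirements above can be checked before the first run. The following is a minimal sketch, not part of SimplyTheBlast itself: the helper names check_tool and check_module are made up for illustration, and the BLAST database formatter is assumed to be the legacy formatdb program that ships alongside blastall.

```shell
#!/bin/sh
# Pre-flight check for the prerequisites listed above.
# check_tool and check_module are illustrative helpers, not part of the tool.

# Report whether a command is reachable on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok       $1"
    else
        echo "MISSING  $1"
        return 1
    fi
}

# Report whether a Perl module can be loaded.
check_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "ok       $1"
    else
        echo "MISSING  $1"
        return 1
    fi
}

missing=0

# formatdb is assumed to be the legacy BLAST database formatter.
for tool in perl blastall formatdb; do
    check_tool "$tool" || missing=$((missing + 1))
done

for module in Bio::SeqIO Bio::Perl Bio::Tools::Run::StandAloneBlast \
              Bio::Seq Bio::Tools::Blast Bio::DB::GenBank Bio::DB::WebDBSeqI; do
    check_module "$module" || missing=$((missing + 1))
done

echo "$missing prerequisite(s) missing"
```

Any line starting with MISSING points to a program or module that should be installed before running the scripts.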

Usage: perl SimplyTheBlast-Align.pl <fasta formatted seeds file> <path to genomes folder> <Alignment length threshold in %> <Alignment identity threshold in %>

OR

Usage: perl SimplyTheBlast-Evalue.pl <fasta formatted seeds file> <path to genomes folder> <Evalue threshold>

Genome file names must end with .faa
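
Since the file name rule is easy to miss, the genomes folder can be screened before a run. This is a rough sketch under stated assumptions: check_genomes_dir is a hypothetical helper, not something shipped with SimplyTheBlast, and it treats every entry in the folder as a genome file.

```shell
#!/bin/sh
# List entries in a genomes folder whose names do not end in .faa.
# Prints one offending name per line. Returns 1 if any were found,
# or if the folder holds no .faa file at all; returns 0 otherwise.
# check_genomes_dir is an illustrative helper, not part of the tool.
check_genomes_dir() {
    dir=$1
    status=0
    found=0
    for f in "$dir"/*; do
        [ -e "$f" ] || continue        # empty folder: the glob stays literal
        case "$f" in
            *.faa) found=1 ;;
            *)     echo "${f##*/}"; status=1 ;;
        esac
    done
    [ "$found" -eq 1 ] || status=1
    return "$status"
}

# Example (folder name is a placeholder):
#   check_genomes_dir ./genomes && echo "all genome files end in .faa"
```

Files reported by the helper would need to be renamed or moved out of the folder before it is passed as the second argument to either script.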

Output files:
TABULAR_FBH_OUTPUT.xls is an Excel-readable file with the identifiers of the best hits found
TABULAR_FBH_OUTPUT.csv is a file with the number of best hits found
query_n* files are FASTA-formatted files with the sequences of the best hits found

Bugs/comments:
marco.fondi@unifi.it

Registered: 2014-11-14