reciprocal blast for windows Code

Pipeline for automatic reciprocal blast

Brought to you by: mnowotni-bio

File	Date	Author	Commit
project-data - Example	2014-08-24	Michal	[d888ea] updated license and some docs
tools	2014-08-25	Michal	[7f809a] uploading readme
workspace-template	2014-08-12	Michal	[cd145f] first
COPYING	2014-08-24	Michal	[d888ea] updated license and some docs
GO.bat	2014-08-24	Michal	[d888ea] updated license and some docs
LICENSES	2014-08-24	Michal	[d888ea] updated license and some docs
README.md	2014-08-25	Michal	[9b9ff3] up
README.md~	2014-08-25	Michal	[9b9ff3] up
process-data.bat	2014-08-12	Michal	[cd145f] first

Read Me

Reciprocal Blast for Windows

A complete set of scripts and applications for performing
a reciprocal BLAST search with the BLAST+ utilities.
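
For readers new to the term: two sequences are commonly called reciprocal best hits when each one is the top-scoring match of the other in searches run in both directions. The sketch below illustrates that idea on in-memory data. It is not taken from this pipeline; the function names and the `(query, subject, score)` tuple layout are assumptions made for illustration, and the selection criterion the scripts actually apply may differ.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, score) tuples (assumed layout).
    On a tie, the hit seen first is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)   # species A searched against species B
    rev = best_hits(reverse)   # species B searched against species A
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)
```

In this sketch a pair is kept only when the match holds in both directions, so a sequence whose best hit points back to a different sequence is dropped.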

Requirements

The following must be installed on the system:

-Microsoft C++ redistributable
-Python 3+ (32-bit version)
-Biopython (installable via pip)
-xlsxwriter (installable via pip), used to parse the procedure output
-Python launcher py.exe available through the PATH environment variable
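
A quick way to confirm the Python-side requirements before running anything is sketched below. It is a hypothetical helper, not part of the repository: the name `check_requirements` is an assumption, and it does not check the C++ redistributable or whether the interpreter is a 32-bit build. `Bio` is the import name of Biopython.

```python
import importlib.util
import shutil
import sys


def check_requirements(modules=("Bio", "xlsxwriter"), launcher="py"):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    report = {}
    # The README asks for Python 3 or newer.
    report["python3"] = sys.version_info.major >= 3
    # find_spec returns None when a top-level module cannot be located.
    for name in modules:
        report[name] = importlib.util.find_spec(name) is not None
    # shutil.which searches the PATH environment variable for the launcher.
    report[launcher] = shutil.which(launcher) is not None
    return report


if __name__ == "__main__":
    for item, found in check_requirements().items():
        print(f"{item}: {'OK' if found else 'MISSING'}")
```

Any entry reported as MISSING should be installed before continuing.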

Installation

Install the requirements and copy the repository to your hard drive.

Usage

Performing reciprocal Blast

1. Put sequences in .fasta or .fasta.gz format in the project_data directory.
Files must be placed in the correct folders, following the structure of the project_data -Example directory.
Check the file extensions: only .fasta and .fasta.gz are accepted.
2. Run GO.bat and enter the new workspace name. This script manages the entire procedure.
If you want to do it manually, go to point 3.
3. Run the batch script process-data.bat. This masks your protein and nucleotide sequences and copies them to
the input_data directory.
4. Copy workspace_template and rename it as you please. Keep the copy in the same directory as
workspace_template.
5. Open your new workspace and run the following scripts in order:
-make_workspace.bat
-run_blast.bat
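
Because only .fasta and .fasta.gz files are accepted, it can help to scan the input directory for anything else before starting the procedure above. The following is a minimal sketch under stated assumptions: `find_rejected_files` is a hypothetical helper that is not part of the repository, and the comparison is case-sensitive, which may or may not match what the batch scripts do.

```python
from pathlib import Path

# Suffixes the README lists as accepted.
ACCEPTED_SUFFIXES = (".fasta", ".fasta.gz")


def find_rejected_files(project_dir):
    """Return files under project_dir whose names do not end in an accepted suffix."""
    rejected = []
    for path in sorted(Path(project_dir).rglob("*")):
        # str.endswith accepts a tuple, so both suffixes are tested at once.
        if path.is_file() and not path.name.endswith(ACCEPTED_SUFFIXES):
            rejected.append(path)
    return rejected


if __name__ == "__main__":
    # "project_data" is the directory name used in the README text.
    if Path("project_data").is_dir():
        for path in find_rejected_files("project_data"):
            print(f"unexpected extension: {path}")
```

An empty result means every file found carries one of the two accepted extensions.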

Tuning data

After creating a workspace with make_workspace.bat, you can edit the local sequence source files as you wish, since
they are only a working copy. To rebuild the database from the updated source sequences, run make_blastdb.bat.
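
As one example of such an edit, the sketch below drops sequences shorter than a chosen length from FASTA text. It uses plain Python to stay self-contained (Biopython's `Bio.SeqIO` can do the same job); the function names and the length threshold are illustrative assumptions, not part of the repository.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_short(records, min_length):
    """Keep only records whose sequence has at least min_length residues."""
    return [(h, s) for h, s in records if len(s) >= min_length]


def write_fasta(records, width=60):
    """Serialize records back to FASTA text, wrapping sequences at width."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)
```

After writing the filtered text back to the working copy, run make_blastdb.bat so the database reflects the change.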

Obtaining output

All results are kept in the results directory, in both HTML and table formats.
To convert the output into more readable Excel tables, use the Python utility
tools/mytools/processing/generate_tables.py
together with its dependencies in the gen_lib directory.
Copy the files from the processing folder into the workspace and run
from the command line: py generate_tables.py results\species-name
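
When several species have been processed, the conversion command can be repeated for each of them. The sketch below assumes that every species has its own subdirectory under results, as the path results\species-name suggests, and that it is run from the workspace after the processing files have been copied there. `build_commands` and `run_all` are hypothetical helpers, not part of the repository.

```python
import subprocess
from pathlib import Path


def build_commands(results_dir, script="generate_tables.py"):
    """Return one ['py', script, <species dir>] argument list per subdirectory."""
    species_dirs = sorted(p for p in Path(results_dir).iterdir() if p.is_dir())
    return [["py", script, str(p)] for p in species_dirs]


def run_all(results_dir="results"):
    """Run the converter once per species directory, stopping at the first failure."""
    for cmd in build_commands(results_dir):
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(cmd, check=True)
```

Calling `build_commands("results")` on its own lists the commands without running them, which is a cheap way to check the paths first.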