File | Date | Commit |
---|---|---|
project-data - Example | 2014-08-24 | [d888ea] updated license and some docs |
tools | 2014-08-25 | [7f809a] uploading readme |
workspace-template | 2014-08-12 | [cd145f] first |
COPYING | 2014-08-24 | [d888ea] updated license and some docs |
GO.bat | 2014-08-24 | [d888ea] updated license and some docs |
LICENSES | 2014-08-24 | [d888ea] updated license and some docs |
README.md | 2014-08-25 | [9b9ff3] up |
README.md~ | 2014-08-25 | [9b9ff3] up |
process-data.bat | 2014-08-12 | [cd145f] first |
A complete set of scripts and applications for performing
reciprocal BLAST with the BLAST+ utilities.
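For readers new to the term, the idea behind a reciprocal BLAST search can be sketched in a few lines. This is only an illustration under the common "reciprocal best hit" definition, not the code used by this repository; the function name and the gene IDs are hypothetical.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return (a, b) pairs where a's best hit is b and b's best hit is a.

    Both arguments map a query ID to the ID of its top-scoring hit
    in the other sequence set.
    """
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


# Hypothetical best-hit maps (query ID -> top-scoring subject ID).
forward = {"geneA1": "geneB1", "geneA2": "geneB7", "geneA3": "geneB3"}
reverse = {"geneB1": "geneA1", "geneB7": "geneA9", "geneB3": "geneA3"}

print(reciprocal_best_hits(forward, reverse))
# [('geneA1', 'geneB1'), ('geneA3', 'geneB3')]
```

Here geneA2 is dropped because its best hit, geneB7, points back to a different gene.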
Requirements:
- Microsoft Visual C++ Redistributable
- Python 3 or later (32-bit version)
- Biopython (installable via pip)
- xlsxwriter (installable via pip), used to process the procedure output
- The Python launcher py.exe available on the PATH environment variable
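Before running anything, it may help to confirm that the Python-side requirements are in place. The following is a minimal sketch, not part of the repository; it assumes the packages are importable under their usual module names (`Bio` for Biopython, `xlsxwriter` for xlsxwriter), and `missing_modules` is a hypothetical helper name.

```python
import importlib.util
import shutil

# Module names are assumptions: Biopython installs as "Bio".
REQUIRED_MODULES = ["Bio", "xlsxwriter"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All required Python packages were found.")

    # shutil.which searches the PATH environment variable.
    if shutil.which("py") is None:
        print("The Python launcher 'py' was not found on PATH.")
```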
Install the requirements and copy the repository to your hard drive.
Put your sequences, in .fasta or .fasta.gz format, in the project_data directory.
Files must be placed in the correct folders, following the structure of the project_data -Example directory.
Make sure the file extensions are correct:
only .fasta and .fasta.gz are accepted.
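The extension rule above is easy to check before starting a run. The sketch below is not part of the repository; it assumes that every file under the data directory should be a sequence file and that matching is case-insensitive. Both function names are hypothetical.

```python
from pathlib import Path

ACCEPTED_SUFFIXES = (".fasta", ".fasta.gz")


def has_accepted_extension(filename):
    """True if the file name ends in .fasta or .fasta.gz (case-insensitive, an assumption)."""
    return filename.lower().endswith(ACCEPTED_SUFFIXES)


def find_rejected_files(root):
    """Return every file under root whose extension would not be accepted."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and not has_accepted_extension(path.name)
    )


# Example use (directory name taken from the text above):
# for path in find_rejected_files("project_data"):
#     print("Unexpected extension:", path)
```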
Run GO.bat and enter the new workspace name. This script manages the entire procedure.
If you would rather do it manually, follow the steps below instead.
Run the batch script process-data.bat. This masks your protein and nucleotide sequences and copies them to
the input_data directory.
Copy workspace_template and rename the copy as you please. Keep the copy in the same directory as
workspace_template.
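The copy step can also be scripted. This is a minimal sketch, not the repository's own tooling; `create_workspace` is a hypothetical helper that places the copy next to the template, as the text requires.

```python
import shutil
from pathlib import Path


def create_workspace(template_dir, new_name):
    """Copy the template directory to a sibling directory called new_name."""
    template = Path(template_dir)
    target = template.parent / new_name
    # copytree raises FileExistsError if the target already exists,
    # so an existing workspace is never overwritten.
    shutil.copytree(template, target)
    return target


# Example use (workspace name is a placeholder):
# create_workspace("workspace_template", "my_workspace")
```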
Open your new workspace and run the following scripts in order:
- make_workspace.bat
- run_blast.bat
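The two workspace scripts can be chained from Python if that is more convenient than running them by hand. The sketch below is an illustration only; it assumes Windows (`cmd /c`) and that the scripts need no interactive input, neither of which this README confirms. `run_workspace_scripts` is a hypothetical name.

```python
import subprocess

# Order taken from the list above.
WORKSPACE_SCRIPTS = ["make_workspace.bat", "run_blast.bat"]


def run_workspace_scripts(workspace, run=subprocess.run):
    """Run each script inside the workspace directory, stopping on the first failure."""
    for script in WORKSPACE_SCRIPTS:
        # check=True makes subprocess.run raise CalledProcessError
        # on a non-zero exit code, so run_blast.bat is skipped if
        # make_workspace.bat fails.
        run(["cmd", "/c", script], cwd=workspace, check=True)


# Example use (workspace name is a placeholder):
# run_workspace_scripts("my_workspace")
```

The `run` parameter exists only so the function can be exercised without actually launching the batch files.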
After creating the workspace with make_workspace.bat, you can edit the local sequence source files as you wish, since
they are only working copies. To rebuild the database from the updated source sequences, run make_blastdb.bat.
All results are kept in the results directory, in both HTML and table formats.
To convert the output into more readable Excel tables, use the Python utility located at
tools/mytools/processing/generate_tables.py
with its dependencies in the gen_lib directory.
Copy the files from the processing folder to the workspace and run
from the command line: py generate_tables.py results\species-name
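If the results directory holds several species, the command above has to be repeated for each one. The sketch below only builds the command lines; it assumes that results contains one subdirectory per species, which the `results\species-name` argument suggests but this README does not state. `table_commands` is a hypothetical helper.

```python
from pathlib import Path


def table_commands(results_dir="results"):
    """Build one generate_tables.py command per subdirectory of results_dir."""
    results = Path(results_dir)
    return [
        ["py", "generate_tables.py", str(subdir)]
        for subdir in sorted(results.iterdir())
        if subdir.is_dir()
    ]


# Example use, run from inside the workspace:
# import subprocess
# for command in table_commands():
#     subprocess.run(command, check=True)
```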