VERSION HISTORY
Version | Feature | Comment |
---|---|---|
BPGA-1.3.0* | Latest | Recommended Version |
*In BPGA-1.3.0 or later, there is a minor change in pathway calculations: an enzyme involved in more than one pathway is now counted in each of those pathways. Earlier versions assigned each enzyme to only one pathway. | Last updated on 12/May/2017: introduced pan profile plot with combinations from matrix input. | |
[Older versions below this are not available] | [Do not download] | [Do not download] |
bpga-version-1.3-mswin-x64-0-0-0 or higher | Works with upcoming GBK files that use the "accession.version" gene identifier (e.g. YP_281368.1) as well as old NCBI GBK files. | NCBI is planning to eliminate GI IDs from all its files from Nov. 2016. |
PREREQUISITES
Windows Requirements:
System: Windows XP or later
Usearch: Get it from http://www.drive5.com/usearch/. Download the Windows executable and rename it to "usearch.exe" [case sensitive; also make sure that Windows file extensions are visible]
gnuplot: Download Gnuplot_Win-64bit_Version or Download_Gnuplot_Win-32bit_Version
WARNING (for Windows only): Please check that the “vcomp100.dll” system file is present in the system32 or system64 folder (the path is of the form “C:\Windows\System32”). If it is not present, copy it into that folder. This file is required for USEARCH to function properly. This dll is available at this link
Linux Requirements:
Usearch : Get it from http://www.drive5.com/usearch/. Download and rename the Linux executable to "usearch". [case sensitive]
gnuplot: Linux users need to download the exact version of gnuplot (4.6.6) for Linux from this SourceForge page.
Extract the gnuplot 4.6.6 files with: tar -xzf FILENAME.tar.gz
Then cd to the gnuplot 4.6.6 directory and run the following commands, in order, to install gnuplot manually:
sudo ./configure
sudo make
sudo make install
ghostscript: run sudo apt-get install ghostscript
wine (Ubuntu): sudo add-apt-repository ppa:ubuntu-wine/ppa -y && sudo apt-get update && sudo apt-get install wine
WARNING (for Debian only): Make sure you have 'glibc 2.15' or higher (the C library in Debian). If not, you can carefully install a higher version of glibc in parallel with 'glibc 2.14' or lower. [Do not try to remove the original glibc, as other binaries may stop working properly.] As a post on one of the Debian forums says: "One can install NEW glibc in parallel to OLD in some different place, e.g., in /opt. In fact, this is a common technique to make Google Chrome work on CentOS. I'm not saying this is as easy as 1-2-3, but it is certainly both doable and, if done properly, safe."
Alternatively, the steps for running new applications on an old glibc are discussed here.
For BPGA Version 1 (Linux) only: Make sure that the libgif4 and libtiff4 libraries are installed. Install 'libtiff4' and 'libgif4' from the Ubuntu Software Center or install them manually as follows. The Ubuntu 14.04 LTS 64-bit release includes the 'libtiff5' library, and BPGA may not start with it (BPGA Version 1 currently needs 'libtiff4' and 'libgif4'). You may also get libtiff4 for different Ubuntu releases from this link and libgif4 from this link. Install them manually using the following commands (type the exact name of the file that you downloaded):
sudo dpkg -i ./libgif4_version_details_32_or_64_bit.deb
sudo dpkg -i ./libtiff4_version_details_32_or_64_bit.deb
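The Linux prerequisites above can be checked before launching BPGA. The following is a minimal sketch, not part of BPGA: the function names (version_ge, have_tool, check_prereqs) are hypothetical, the command names gnuplot, gs (ghostscript) and wine are assumed to be on PATH after installation, and the version comparison relies on GNU `sort -V`.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; these names are illustrative and not part of BPGA.

# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# have_tool NAME: succeeds when NAME is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# check_prereqs: report each prerequisite; returns non-zero if anything needs attention.
check_prereqs() {
    rc=0
    for tool in gnuplot gs wine; do
        if have_tool "$tool"; then
            echo "ok: $tool"
        else
            echo "MISSING: $tool"
            rc=1
        fi
    done
    # On glibc-based systems getconf prints a line such as "glibc 2.31".
    glibc=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
    if [ -n "$glibc" ] && version_ge "$glibc" 2.15; then
        echo "ok: glibc $glibc"
    else
        echo "CHECK: glibc ${glibc:-unknown} (2.15 or higher is expected)"
        rc=1
    fi
    return "$rc"
}

# Usage (run manually): check_prereqs
```

The functions only report what they find; they do not install or change anything, so they are safe to run repeatedly.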
RUN BPGA
On Windows:
Install BPGA by double-clicking the installer file or by extracting the zip archive.
Open the BPGA folder and go to its bin folder. Copy 'usearch.exe' into the bin folder.
Run BPGA-Version-1.exe from the bin folder for pan-genome analysis.
On Linux:
Extract the files from the tar.gz archive with the command: tar -zxvf bpga-version-X-linux-xxx-xx.tar.gz
Open the BPGA folder and change directory to the bin folder. Copy the 'usearch' file into the bin folder.
cd to bin and use the commands: chmod +x BPGA-Version-XX
then ./BPGA-Version-XX
Note: xxx-xx stands for the respective version (it should match the version you downloaded).
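Before starting BPGA on Linux, it can help to confirm that the bin folder really contains a file named exactly 'usearch'. The sketch below is an illustration only: check_bin_dir is a hypothetical helper, and the check for the executable bit is an assumption (the steps above only mention chmod for the BPGA binary itself).

```shell
#!/bin/sh
# check_bin_dir DIR: verify that DIR holds a file named exactly "usearch".
# Hypothetical helper, not shipped with BPGA.
check_bin_dir() {
    bin=$1
    if [ ! -d "$bin" ]; then
        echo "no such directory: $bin"
        return 1
    fi
    if [ ! -f "$bin/usearch" ]; then
        echo "usearch not found in $bin (the name is case sensitive)"
        return 1
    fi
    # Assumption: the usearch binary must be executable to be called.
    if [ ! -x "$bin/usearch" ]; then
        echo "usearch is not executable; try: chmod +x $bin/usearch"
        return 1
    fi
    echo "ok: $bin/usearch"
}

# Usage (run manually from the extracted BPGA folder): check_bin_dir ./bin
```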
ANALYSIS OPTIONS
SUPPORTED INPUTS
BPGA accepts any one of three types of input file for your dataset: GenBank (*.gbk) files, protein FASTA files (from the NCBI and HMP databases, or any other protein FASTA), or a binary (1,0) matrix.
Example GenBank (.gbk) input:
LOCUS NC_017040 1750832 bp DNA circular CON 06-JUL-2013
DEFINITION Streptococcus pyogenes MGAS15252 chromosome, complete genome.
ACCESSION NC_017040
VERSION NC_017040.1 GI:383479207
...
FEATURES Location/Qualifiers
source 1..1750832
/organism="Streptococcus pyogenes MGAS15252"
...
gene 232..1587
/gene="dnaA"
...
CDS 232..1587
/gene="dnaA"
...
/product="chromosomal replication initiator protein DnaA"
/protein_id="YP_005388102.1"
/translation="MTENEQIFWNRVLELAQSQLKQATYEFFVHDARLLKVDKHIATI
...
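A quick way to confirm that a .gbk file carries the /protein_id qualifiers shown in the GenBank example is to list them. This is a minimal sketch using standard grep and cut; list_protein_ids is a hypothetical helper name, and it assumes each qualifier sits on one line in the form /protein_id="...".

```shell
#!/bin/sh
# list_protein_ids FILE: print the value of every /protein_id qualifier, one per line.
# Hypothetical helper, not part of BPGA.
list_protein_ids() {
    grep -o '/protein_id="[^"]*"' "$1" | cut -d '"' -f 2
}

# Usage (run manually, file name is a placeholder):
#   list_protein_ids genome.gbk | wc -l    # number of annotated proteins found
```

An empty result suggests the file has no CDS annotations in this form, which is worth checking before using it as input.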
Example NCBI protein FASTA (.faa) input:
>gi|19745202|ref|NP_606338.1| protein name [Organism Name]
MTENEQIFWNRVLELAQSQLKQATYEFFVHDARLLKVD
MRTNFKVSFYLRSNYENKEGKSPVMLRVFLNGEMSNFG
(Note that new NCBI faa files may not have the above format; they may instead match the following .pep.fsa format. In that case, the user needs to choose the pep.fsa option in BPGA and rename the files accordingly before proceeding.)
>HMPREF9420_0006 protein name [Organism Name]
MRTNFKVSFYLRSNYENKEGKSPVMLRVFLNGEMSNFG
MTENEQIFWNRVLELAQSQLKQATYEFFVHDARLLKVD
Example of any other protein FASTA input:
>Any_header_information
MRTNFKVSFYLRSNYENKEGKSPVMLRVFLNGEMSNFG
MTENEQIFWNRVLELAQSQLKQATYEFFVHDARLLKVD
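To decide which of the FASTA header formats shown above a file follows, the first header line can be inspected. The sketch below is illustrative only: fasta_header_style is a hypothetical helper, and the labels it prints ("ncbi-gi", "other", "no-header") are made up for this example and are not BPGA option names.

```shell
#!/bin/sh
# fasta_header_style FILE: classify the first FASTA header line of FILE.
#   prints "ncbi-gi"   for headers like ">gi|<number>|ref|<accession>| ..."
#   prints "other"     for any other header starting with ">"
#   prints "no-header" when no header line is found
# Hypothetical helper, not part of BPGA.
fasta_header_style() {
    first=$(grep '^>' "$1" | head -n 1)
    case $first in
        '>gi|'*'|ref|'*'|'*) echo "ncbi-gi" ;;
        '>'*)                echo "other" ;;
        *)                   echo "no-header" ;;
    esac
}

# Usage (run manually, file name is a placeholder): fasta_header_style proteins.faa
```

Only the first header is examined, on the assumption that a file uses one header style throughout.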
Note: When a single organism has more than one GenBank (.gbk) or protein FASTA file, for example more than one chromosome, contig or scaffold, all of its files should be concatenated into a single file.
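The concatenation described in the note can be done with the standard cat command. The following is a minimal sketch: concat_organism is a hypothetical helper and the file names in the usage line are placeholders.

```shell
#!/bin/sh
# concat_organism OUT IN...: concatenate all input files of one organism into OUT.
# OUT must not be one of the input files. Hypothetical helper, not part of BPGA.
concat_organism() {
    out=$1
    shift
    if [ "$#" -lt 1 ]; then
        echo "usage: concat_organism OUT IN..." >&2
        return 1
    fi
    cat "$@" > "$out"
}

# Usage (run manually, file names are placeholders):
#   concat_organism strainA.gbk strainA_chromosome.gbk strainA_plasmid.gbk
```

Keep the inputs of the same type (all .gbk or all protein FASTA) so that the combined file remains a valid input of that type.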
See BPGA User Manual.pdf for detailed instructions.
Email your queries to : bpgatool[at]gmail.com
How to cite this article:
Chaudhari, Narendrakumar M., Vinod Kumar Gupta, and Chitra Dutta. "BPGA-an ultra-fast pan-genome analysis pipeline." Scientific Reports 6, 24373 (2016); doi: 10.1038/srep24373.
Dear Sir or Madam,
I couldn't get past the second step of the default pan-genome analysis in BPGA, and I don't know what is wrong. I emailed BPGAtool@gmail.com and sent my input data last Friday, but nobody has replied so far. Please contact me soon. Thanks.