CSBB-v2.1
################################################## please read ###################################
****** CSBB is learning and evolving. If you have questions or find bugs, please contact Praneet Chaturvedi on sourceforge.net, by email at praneet198@hotmail.com, or
on https://github.com/csbbcompbio/CSBB-v2.0 ******
################################################## ############# #################################


################################################## ############# #################################
Author Info ::
Author/Developer : Praneet Chaturvedi
Designation : Analyst Bioinformatics @ Cincinnati Children's Hospital and Medical Center

Advisory Panel ::
Kashish Chetal [Analyst Bioinformatics @ Cincinnati Children's Hospital and Medical Center]

Sithara Raju Ponny [Analyst Bioinformatics @ Cincinnati Children's Hospital and Medical Center]

Simarjot Singh Pabla [Bioinformatics Scientist @ Agenus Inc]
################################################## ############# #################################


!!!CSBB-v2.1 offers seventeen statistical, visualization and bioinformatics modules for several bioinformatics applications. Detailed instructions on how to execute each module are given below.
Also refer to the additional documents in the package for the definition and explanation of each module!!!

What's New:
    1) Fixed issues with the install module where the Perl CPAN library was not getting installed on both the Linux and Mac versions

    2) Fixed issues where Linux users without root access could not install and use R packages for some modules

    3) Added a new visualization module named ExpressionPlot for visualizing trends of gene/entity expression/information across samples/objects

    4) Added a new PCA visualization

    5) Bug fixes [performance and stability related]
    
****** Version 2.1 additions/updates apply only to MacOS and LINUX ******
****** The RNA-SEQ pipeline does not currently work on Windows ******
 

~~~Steps to make the application executable.

Step 1: Open the Terminal (MAC OS and LINUX) or the Command Prompt (Windows).

Step 2: Browse to the directory where you have saved the package.
=================
cd --> change directory
ls --> list directory
=================
For example: let’s say you have downloaded the package in Downloads.
====================================================
MACOS  & LINUX users: 
cd /Users/xxx/Downloads/CSBB-v2.1
Windows: 
cd C:\Users\xxx\Downloads\CSBB\WINDOWS_VERSION\
====================================================
Step 3: Type in console: perl -v

A detailed message describing perl version should be displayed:

This is perl 5, version 18, subversion 2 (v5.18.2) built for darwin-thread-multi-2level
(with 2 registered patches, see perl -V for more detail)

Copyright 1987-2013, Larry Wall

This confirms that perl is installed on your system. 
====================================================
If perl is not installed:

Mac Users: Use this link to install perl http://learn.perl.org/installing/osx.html  [perl and python come pre-installed with MAC OS]
Linux Users: Use this link to install perl http://www.activestate.com/activeperl/downloads [perl and python come pre-installed with Linux OS]
Windows Users: Use this link to install perl http://learn.perl.org/installing/windows.html
====================================================

Step 4: Install R, python (already comes installed with MAC OS), Pandoc, Java and JDK
 Install R for MAC users: Use this link to install R https://cran.r-project.org/bin/macosx/
 Install R for Linux users : Use this link https://cran.r-project.org/bin/linux/
 Install R for Windows users: Use this link to install R https://cran.r-project.org/bin/windows/base/

Install Java and JDK for MAC and Linux Users: Use this link to download and install Java
https://www.java.com/en/ 
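
The prerequisites from Steps 3 and 4 can be checked in one pass. The sketch below is a hypothetical helper and not part of CSBB. It assumes that R provides the Rscript command and that the JDK provides javac, and it only reports whether each command is on the PATH, not whether its version is recent enough.

```shell
#!/bin/sh
# check_tool NAME: print "NAME: found" or "NAME: missing" depending on
# whether a command called NAME is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Tools named in Steps 3 and 4 (Rscript stands in for R, javac for the JDK).
for tool in perl Rscript python pandoc java javac; do
    check_tool "$tool"
done
```

Anything reported as missing should be installed from the links above before running the install module.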

~~~Running each module

All modules can be run in two ways:
1) Giving all arguments on the command line in one line [Batch Mode Run if you have more than one file as input].
2) Providing arguments when prompted by the application.
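
One possible reading of a batch mode run is a shell loop that calls the same module once per input file. The sketch below assumes that reading. The function name run_module and the DRY_RUN variable are hypothetical and not part of CSBB. With DRY_RUN=1 the commands are only printed, so the loop can be checked before anything is executed.

```shell
#!/bin/sh
# run_module SCRIPT MODULE FILE...: run one CSBB module once per input file.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run_module() {
    script=$1
    module=$2
    shift 2
    for file in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "perl $script $module $file"
        else
            # Keep going if one file fails, but say which one.
            perl "$script" "$module" "$file" || echo "failed: $file" >&2
        fi
    done
}

# Usage (from inside the CSBB-v2.1 directory):
#   run_module CSBB-v2.1_MacOS.pl BasicStats /Users/xx/Desktop/*.txt
```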

Let's see how to use the --help option in the application.
====================================================
MAC USERS ::
perl CSBB-v2.1_MacOS.pl --help

LINUX USERS ::
perl CSBB-v2.1_Linux.pl --help
====================================================
Below is the sample output from the --help option

Operating System is darwin

=============================================

COMPUTATIONAL SUITE FOR BIOINFORMATICIANS & BIOLOGISTS
Version_2.1

Use --help for information on running COMPUTATIONAL SUITE FOR BIOINFORMATICIANS & BIOLOGISTS

Please See README and White-Paper for getting detailed instructions on running the program

Requirements for MacOSX version: Please install R, python 2.7+, pandoc, Java and Ruby

=============================================

CSBB is learning and evolving.. If you have questions or Bugs please contact Praneet Chaturvedi on github [https://github.com/csbbcompbio/CSBB-v1.0]


=====Happy to see you ... Lets get started and do some magic with your files =====

Please use options below to run the COMPUTATIONAL SUITE FOR BIOINFORMATICIANS & BIOLOGISTS

Options::

install                        ---> for installing all the required dependencies for RNA-SEQ pipelines ::only one time process


UpperQuantile                  ---> for performing upper quantile normalization


BasicStats                     ---> for obtaining stats like mean, median, standard deviation, variance, Sum, min and max for each Gene Expression profile


ExpressionToZscore             ---> for obtaining z-scores for Gene Expression in samples


ExtractGeneInfo                ---> for obtaining info/expression of list genes from a huge matrix gene info/expression


ExpressionPlot                 ---> for visualizing trends of gene/entity expression/indformation across samples/objects


InteractiveHeatmap             ---> for generating interactive heatmaps for expression data. User has three options on clustering type and four choices on color theme. Please read README for descriptions and run command.


CorrelationProfiles            ---> for obtaining genes correlation profile termed as positively and negatively correlated based on User threshold. One can obtain profile for all genes or just genes of interest (For genes of interest user needs to provide the path to gene list file). Heatmap will only be displayed for genes of interest


Biogrid-Gene-Protein-Search ---> for obtaining gene-protein interactions for Human and Mouse for genes of interest


DifferentialExpression         ---> for obtaining DE genes in RNA-SEQ expriments. Uses RUVSeq package in R


PCA                            ---> for performing Principal Component Analysis

NMF                            ---> for performing Nonnegative Matrix Factorization on Samples in Expression dataset

FetchSRA                       ---> for downloading Raw SRA data from NCBI SRA for example user can Download all the samples under SRA project id or just a single sample based on requirement. Just requires path to the folder where you want to download the data and SRA ID

FetchGEO                       ---> for downloading GEO expression data for example user can fetch expression of GSE Id, GSM Id, GDS Id. Just requires path to the folder where you want to download the data and GSE/GSM/GSD ID

InteractiveScatterPlot         ---> for generating Interactive Scatter plot based on user preference. User needs to provide path to the file, Column number for x-axis values, Column number for y-axis values and Column number which user needs for color factorization (If users provides No color factorization will not be done) Please see README for extensive explanation

Process-RNASeq_SingleEnd     ---> for processesing Single End RNASeq data using RSEM for two species hg19 [Human] or mm10 [Mouse]

Process-RNASeq_PairedEnd     ---> for processesing Paired End RNASeq data using RSEM for two species hg19 [Human] or mm10 [Mouse]

Generate-TPM-Counts-Matrix     ---> for generating TPM and Counts Matrix for Both Isoforms and Genes using RSEM result directory and species of interest

****** Please note that when a path to a file or folder is required, users can simply drag and drop the file/folder from the Finder window (MAC) or the file manager (LINUX) into the terminal.
****** LINUX users: please remove the quotes around the file or folder path when using drag and drop
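
For scripted runs, the quotes that drag and drop adds can be removed automatically. The sketch below is a hypothetical helper and not part of CSBB. It assumes the path is wrapped in one pair of matching single or double quotes and leaves any other path untouched.

```shell
#!/bin/sh
# strip_quotes PATH: remove one pair of matching single or double quotes
# wrapped around PATH, as added by drag and drop on some Linux desktops.
strip_quotes() {
    case "$1" in
        \'*\') p=${1#\'}; p=${p%\'} ;;
        \"*\") p=${1#\"}; p=${p%\"} ;;
        *)     p=$1 ;;
    esac
    printf '%s\n' "$p"
}

# Usage:
#   clean=$(strip_quotes "'/home/xx/Desktop/FIND.txt'")
#   perl CSBB-v2.1_Linux.pl BasicStats "$clean"
```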

**** Note: CSBB-v2.1 requires the user to run the install module once to install all the dependencies required by CSBB

I) Running install:

====================================================
perl CSBB-v2.1_MacOS.pl install
perl CSBB-v2.1_Linux.pl install
====================================================
**** This automatically installs all the dependencies for the RNA-SEQ pipelines and the other CSBB prerequisites
**** This is a one-time process
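
Every module in this README has a MacOS and a Linux command line that differ only in the script name. The sketch below picks the script from the kernel name printed by uname -s, so one command works on both systems. The function name pick_script is hypothetical and not part of CSBB.

```shell
#!/bin/sh
# pick_script KERNEL: map a kernel name (as printed by `uname -s`) to the
# CSBB script named in this README. Returns non-zero for other systems.
pick_script() {
    case "$1" in
        Darwin) echo "CSBB-v2.1_MacOS.pl" ;;
        Linux)  echo "CSBB-v2.1_Linux.pl" ;;
        *)      echo "no CSBB-v2.1 script listed for $1" >&2; return 1 ;;
    esac
}

# Usage (from inside the CSBB-v2.1 directory):
#   perl "$(pick_script "$(uname -s)")" install
```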

II) Running UpperQuantile:
====================================================
perl CSBB-v2.1_MacOS.pl UpperQuantile Path_to_file
perl CSBB-v2.1_Linux.pl UpperQuantile Path_to_file
====================================================
Example Mac Users: perl CSBB-v2.1_MacOS.pl UpperQuantile /Users/xx/Desktop/FIND.txt
Example Linux Users: perl CSBB-v2.1_Linux.pl UpperQuantile /Users/xx/Desktop/FIND.txt

If you do not provide the path to the file as an argument,
Example: perl CSBB-v2.1_MacOS.pl UpperQuantile
--> the application will prompt for the missing input:
Operating System is darwin

=============================================

COMPUTATIONAL SUITE FOR BIOINFORMATICIANS & BIOLOGISTS
Version_2.1

Use --help for information on running COMPUTATIONAL SUITE FOR BIOINFORMATICIANS & BIOLOGISTS

Please See README and White-Paper for getting detailed instructions on running the program

Requirements for MacOSX version: Please install R, python 2.7+, pandoc, Java and Ruby

=============================================

CSBB is learning and evolving .. If you have questions or Bugs please contact Praneet Chaturvedi on github [https://github.com/csbbcompbio/CSBB-v1.0]


=====Happy to see you ... Lets get started and do some magic with your files =====

Upper Quantile Normalization module loaded

You have forgot to give path to file from command line

Please provide the path to the file .. You can drag and drop the file from finder window path_to_file [User input is required]


III) Running BasicStats:

====================================================
perl CSBB-v2.1_MacOS.pl BasicStats Path_to_file
perl CSBB-v2.1_Linux.pl BasicStats Path_to_file
====================================================

Example Mac Users: perl CSBB-v2.1_MacOS.pl BasicStats /Users/xx/Desktop/FIND.txt
Example Linux Users: perl CSBB-v2.1_Linux.pl BasicStats /Users/xx/Desktop/FIND.txt


IV) Running ExpressionToZscore:

====================================================
perl CSBB-v2.1_MacOS.pl ExpressionToZscore Path_to_file
perl CSBB-v2.1_Linux.pl ExpressionToZscore Path_to_file
====================================================

Example Mac Users:  perl CSBB-v2.1_MacOS.pl ExpressionToZscore /Users/xx/Desktop/FIND.txt
Example Linux Users: perl CSBB-v2.1_Linux.pl ExpressionToZscore /Users/xx/Desktop/FIND.txt
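
As a rough illustration of what a per-gene z-score is, the sketch below converts each row of a matrix to z-scores with awk. It is not CSBB's code. It assumes a tab-delimited file with a header line and gene names in column 1, uses the sample standard deviation (n - 1), and prints NA for genes with no variation. The function name zscore_rows is hypothetical.

```shell
#!/bin/sh
# zscore_rows FILE: print FILE with each row's values replaced by z-scores,
# z = (x - row mean) / row standard deviation (n - 1 in the denominator).
zscore_rows() {
    awk -F '\t' -v OFS='\t' '
        NR == 1 { print; next }             # copy the header unchanged
        NF < 2  { print; next }             # nothing to convert on this row
        {
            n = NF - 1; sum = 0; ss = 0
            for (i = 2; i <= NF; i++) sum += $i
            mean = sum / n
            for (i = 2; i <= NF; i++) ss += ($i - mean) ^ 2
            sd = (n > 1) ? sqrt(ss / (n - 1)) : 0
            line = $1
            for (i = 2; i <= NF; i++)
                line = line OFS ((sd > 0) ? sprintf("%.3f", ($i - mean) / sd) : "NA")
            print line
        }' "$1"
}

# Usage:
#   zscore_rows /Users/xx/Desktop/FIND.txt
```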

V) Running ExtractGeneInfo:
====================================================
perl CSBB-v2.1_MacOS.pl ExtractGeneInfo Path_to_Expression/Info_file Gene_List_File
perl CSBB-v2.1_Linux.pl ExtractGeneInfo Path_to_Expression/Info_file Gene_List_File
====================================================

Example Mac Users: perl CSBB-v2.1_MacOS.pl ExtractGeneInfo /Users/xx/Desktop/FIND.txt /Users/xx/Desktop/FIND1.txt
Example Linux Users: perl CSBB-v2.1_Linux.pl ExtractGeneInfo /Users/xx/Desktop/FIND.txt /Users/xx/Desktop/FIND1.txt
***** Note ::: The ExtractGeneInfo module has been updated to accept a gene list in which each gene can have multiple attributes.
**** Note ::: A header line is required in each file
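
The idea behind the module can be sketched with awk: keep the header of the matrix plus every row whose gene appears in the gene list. This is an illustration only, not CSBB's implementation. It assumes both files are tab-delimited, both start with a header line, and the gene name is in column 1 of each. The function name extract_genes is hypothetical.

```shell
#!/bin/sh
# extract_genes MATRIX GENE_LIST: print the MATRIX header plus every row whose
# first column matches a gene named in the first column of GENE_LIST.
extract_genes() {
    awk -F '\t' '
        NR == FNR { if (FNR > 1) want[$1] = 1; next }   # first file: gene list
        FNR == 1 || ($1 in want)                        # second file: matrix
    ' "$2" "$1"
}

# Usage:
#   extract_genes /Users/xx/Desktop/FIND.txt /Users/xx/Desktop/FIND1.txt
```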

VI) Running InteractiveHeatmap:

====================================================
perl CSBB-v2.1_MacOS.pl InteractiveHeatmap Path_to_File Clustering_Option [Row_Clust, Col_Clust or Row_Col_Clust] Color_theme [YellowGreenOrange, BlueWhiteRed, YellowBlackBlue or GreenWhitePurple]

perl CSBB-v2.1_Linux.pl InteractiveHeatmap Path_to_File Clustering_Option [Row_Clust, Col_Clust or Row_Col_Clust] Color_theme [YellowGreenOrange, BlueWhiteRed, YellowBlackBlue or GreenWhitePurple]
====================================================
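
Because the clustering option and color theme must match the listed values exactly, a wrapper can check them before calling CSBB. The sketch below is a hypothetical helper and not part of CSBB. The accepted values are copied from the command lines above.

```shell
#!/bin/sh
# valid_heatmap_args CLUSTERING COLOR_THEME: succeed only if both values are
# among the options listed in this README for InteractiveHeatmap.
valid_heatmap_args() {
    case "$1" in
        Row_Clust|Col_Clust|Row_Col_Clust) ;;
        *) echo "unknown clustering option: $1" >&2; return 1 ;;
    esac
    case "$2" in
        YellowGreenOrange|BlueWhiteRed|YellowBlackBlue|GreenWhitePurple) ;;
        *) echo "unknown color theme: $2" >&2; return 1 ;;
    esac
}

# Usage:
#   valid_heatmap_args Row_Col_Clust BlueWhiteRed &&
#       perl CSBB-v2.1_MacOS.pl InteractiveHeatmap /Users/xx/Desktop/FIND.txt Row_Col_Clust BlueWhiteRed
```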


VII) Running CorrelationProfiles:

====================================================
perl CSBB-v2.1_MacOS.pl CorrelationProfiles Path_to_File Correlation_Threshold [-1 to 1] all/Path to Gene list Correlation_Type[pearson or spearman or kendall]
perl CSBB-v2.1_Linux.pl CorrelationProfiles Path_to_File Correlation_Threshold [-1 to 1] all/Path to Gene list Correlation_Type[pearson or spearman or kendall]
====================================================

** Use all to calculate correlation profiles for all the genes/entities in the matrix.
** Use the path to a gene list file to calculate correlation profiles for a specified set of genes.
** Please specify which correlation method you want CSBB to use [use pearson for linear dependency and spearman when interested in ranked correlation]

VIII) Running Biogrid-Gene-Protein-Search:

====================================================
perl CSBB-v2.1_MacOS.pl Biogrid-Gene-Protein-Search Human/Mouse Path_to_gene_list 
perl CSBB-v2.1_Linux.pl Biogrid-Gene-Protein-Search Human/Mouse Path_to_gene_list
====================================================

IX) Running DifferentialExpression:

====================================================
perl CSBB-v2.1_MacOS.pl DifferentialExpression Path_to_Counts_File Number_of_Controls Number_of_Treatments Counts_Threshold_for_filtering Number_of_Samples_for_Filtering_per_Gene Type_of_Normalization [UpperQuantile or UpperQuantile+Empirical]

perl CSBB-v2.1_Linux.pl DifferentialExpression Path_to_Counts_File Number_of_Controls Number_of_Treatments Counts_Threshold_for_filtering Number_of_Samples_for_Filtering_per_Gene Type_of_Normalization [UpperQuantile or UpperQuantile+Empirical]
====================================================

Example: perl CSBB-v2.1_MacOS.pl DifferentialExpression Path_to_Counts_File 10 10 5 8 UpperQuantile

**** Please read https://bioconductor.org/packages/release/bioc/vignettes/RUVSeq/inst/doc/RUVSeq.pdf to understand the types of normalization
*** Generally, if sequencing quality is good, UpperQuantile normalization works best
*** CSBB authors advise running both normalization types separately and seeing which gives the most robust results
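
The two filtering arguments are not defined in this README. The sketch below shows one plausible reading, modeled on the filtering step in the RUVSeq vignette: a gene is kept when it has more than Counts_Threshold_for_filtering counts in at least Number_of_Samples_for_Filtering_per_Gene samples. That interpretation is an assumption, the function name filter_counts is hypothetical, and the input is assumed to be a tab-delimited counts matrix with a header line and gene names in column 1.

```shell
#!/bin/sh
# filter_counts FILE THRESHOLD MIN_SAMPLES: keep the header and every gene
# with more than THRESHOLD counts in at least MIN_SAMPLES samples.
filter_counts() {
    awk -F '\t' -v t="$2" -v m="$3" '
        NR == 1 { print; next }             # copy the header unchanged
        {
            hits = 0
            for (i = 2; i <= NF; i++) if ($i + 0 > t + 0) hits++
            if (hits >= m + 0) print
        }' "$1"
}

# Usage, matching the example arguments 5 and 8 above:
#   filter_counts Path_to_Counts_File 5 8
```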

X) Running PCA:

====================================================
perl CSBB-v2.1_MacOS.pl PCA Path_to_File
perl CSBB-v2.1_Linux.pl PCA Path_to_File
====================================================

XI) Running NMF:
====================================================
perl CSBB-v2.1_MacOS.pl NMF Path_to_File
perl CSBB-v2.1_Linux.pl NMF Path_to_File
====================================================



XII) Running FetchSRA:
====================================================
perl CSBB-v2.1_MacOS.pl FetchSRA Path_to_Folder_for_downloading_files SRA-id
perl CSBB-v2.1_Linux.pl FetchSRA Path_to_Folder_for_downloading_files SRA-id
====================================================


XIII) Running FetchGEO:

====================================================
perl CSBB-v2.1_MacOS.pl FetchGEO Path_to_Folder_for_downloading_files GEO-id
perl CSBB-v2.1_Linux.pl FetchGEO Path_to_Folder_for_downloading_files GEO-id
====================================================
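
Both fetch modules take an accession ID, and the prefix of the ID tells what kind of record it is. The sketch below is a hypothetical helper, not part of CSBB, that classifies the NCBI prefixes only. Which of these ID types each module accepts beyond those named in this README is not verified here.

```shell
#!/bin/sh
# accession_type ID: print the kind of NCBI record an accession refers to,
# judged only from its prefix. Returns non-zero for anything else.
accession_type() {
    case "$1" in
        GSE[0-9]*) echo "GEO series" ;;
        GSM[0-9]*) echo "GEO sample" ;;
        GDS[0-9]*) echo "GEO dataset" ;;
        SRP[0-9]*) echo "SRA project" ;;
        SRX[0-9]*) echo "SRA experiment" ;;
        SRS[0-9]*) echo "SRA sample" ;;
        SRR[0-9]*) echo "SRA run" ;;
        *)         echo "unrecognized"; return 1 ;;
    esac
}

# Usage:
#   accession_type GSE12345
```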

XIV) Running InteractiveScatterPlot:

====================================================
perl CSBB-v2.1_MacOS.pl InteractiveScatterPlot Path_to_File Column_x_axis_values Column_y_axis_values Column_for_Color_Factorization/No
perl CSBB-v2.1_Linux.pl InteractiveScatterPlot Path_to_File Column_x_axis_values Column_y_axis_values Column_for_Color_Factorization/No
====================================================


XV) Running Process-RNASeq_SingleEnd:

====================================================
perl CSBB-v2.1_MacOS.pl Process-RNASeq_SingleEnd Path_to_Fastq_File Species [hg19 or mm10] Output_Folder_path Phred_Quality_encoding [phred33 or phred64 or solexa] Quality_Check [yes or no]

perl CSBB-v2.1_Linux.pl Process-RNASeq_SingleEnd Path_to_Fastq_File Species [hg19 or mm10] Output_Folder_path Phred_Quality_encoding [phred33 or phred64 or solexa] Quality_Check [yes or no]
====================================================
Example : 
perl CSBB-v2.1_MacOS.pl Process-RNASeq_SingleEnd /Users/xx/Desktop/my_fastq.fastq hg19 /Users/xx/Desktop phred33 yes
perl CSBB-v2.1_Linux.pl Process-RNASeq_SingleEnd /Users/xx/Desktop/my_fastq.fastq hg19 /Users/xx/Desktop phred33 yes

*** More information about phred quality encoding can be found at https://en.wikipedia.org/wiki/FASTQ_format (see the Encoding section)
*** Illumina pipelines from version 1.8 onward (2011 and later) use phred33 encoding for reads
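
When the encoding of a FASTQ file is unknown, the lowest quality character in the file gives a hint. The sketch below is a common heuristic and not part of CSBB. It assumes an uncompressed FASTQ file with four lines per record and Unix line endings. A character code below 59 can only occur with phred33, codes 59 to 63 suggest solexa, and anything else is reported as phred64. That last guess can be wrong for high-quality phred33 reads, so confirm it against the sequencing provider's documentation. Both function names are hypothetical.

```shell
#!/bin/sh
# min_quality_code FASTQ: print the smallest ASCII code found on the quality
# lines (every 4th line) of FASTQ.
min_quality_code() {
    awk 'NR % 4 == 0' "$1" | tr -d '\n' | od -An -v -tu1 |
        tr -s ' ' '\n' | grep -v '^$' | sort -n | head -n 1
}

# guess_phred FASTQ: heuristic guess at the Phred_Quality_encoding argument.
guess_phred() {
    min=$(min_quality_code "$1")
    if [ -z "$min" ]; then
        echo "no quality lines found" >&2
        return 1
    elif [ "$min" -lt 59 ]; then
        echo phred33
    elif [ "$min" -lt 64 ]; then
        echo solexa
    else
        echo phred64      # a guess: phred33 files with no low scores land here too
    fi
}

# Usage:
#   guess_phred /Users/xx/Desktop/my_fastq.fastq
```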


XVI) Running Process-RNASeq_PairedEnd:

====================================================
perl CSBB-v2.1_MacOS.pl Process-RNASeq_PairedEnd Path_to_Fastq_File_pair1 Path_to_Fastq_File_pair2 Species [hg19 or mm10] Output_Folder_path Phred_Quality_encoding [phred33 or phred64 or solexa] Quality_Check [yes or no]

perl CSBB-v2.1_Linux.pl Process-RNASeq_PairedEnd Path_to_Fastq_File_pair1 Path_to_Fastq_File_pair2 Species [hg19 or mm10] Output_Folder_path Phred_Quality_encoding [phred33 or phred64 or solexa] Quality_Check [yes or no]
====================================================
Example : 
perl CSBB-v2.1_MacOS.pl Process-RNASeq_PairedEnd /Users/xx/Desktop/my_fastq_pair1.fastq /Users/xx/Desktop/my_fastq_pair2.fastq hg19 /Users/xx/Desktop phred33 yes
perl CSBB-v2.1_Linux.pl Process-RNASeq_PairedEnd /Users/xx/Desktop/my_fastq_pair1.fastq /Users/xx/Desktop/my_fastq_pair2.fastq hg19 /Users/xx/Desktop phred33 yes

*** More information about phred quality encoding can be found at https://en.wikipedia.org/wiki/FASTQ_format (see the Encoding section)
*** Illumina pipelines from version 1.8 onward (2011 and later) use phred33 encoding for reads
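
Paired-end processing expects the two FASTQ files to hold the same reads in the same order, so a quick sanity check before a long run is to compare read counts. The sketch below is a hypothetical helper and not part of CSBB. It assumes uncompressed FASTQ files with exactly four lines per record and checks only the counts, not the read names or their order.

```shell
#!/bin/sh
# count_reads FASTQ: number of records in FASTQ (line count divided by four).
count_reads() {
    lines=$(wc -l < "$1" | tr -d ' ')
    echo $((lines / 4))
}

# same_read_count R1 R2: succeed if both mate files hold the same number of reads.
same_read_count() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}

# Usage:
#   same_read_count my_fastq_pair1.fastq my_fastq_pair2.fastq ||
#       echo "pair files differ in read count" >&2
```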

XVII) Running Generate-TPM-Counts-Matrix:

====================================================
perl CSBB-v2.1_MacOS.pl Generate-TPM-Counts-Matrix Path_to_RNA-SEQ_Result_Directory Species [hg19 or mm10] Path_to_Output_Directory

perl CSBB-v2.1_Linux.pl Generate-TPM-Counts-Matrix Path_to_RNA-SEQ_Result_Directory Species [hg19 or mm10] Path_to_Output_Directory
====================================================
Example : 
perl CSBB-v2.1_MacOS.pl Generate-TPM-Counts-Matrix /Users/xx/Desktop/CSBB_RNA-SEQ hg19 /Users/xx/Desktop
perl CSBB-v2.1_Linux.pl Generate-TPM-Counts-Matrix /Users/xx/Desktop/CSBB_RNA-SEQ hg19 /Users/xx/Desktop

XVIII) Running ExpressionPlot:

====================================================
perl CSBB-v2.1_MacOS.pl ExpressionPlot Path_to_File
perl CSBB-v2.1_Linux.pl ExpressionPlot Path_to_File
====================================================


############################################################## END OF README ####################################################
Source: CSBB-v2.1_README.txt, updated 2017-05-31