<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Recent changes to UnixModeManual</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>Recent changes to UnixModeManual</description><atom:link href="https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/feed" rel="self"/><language>en</language><lastBuildDate>Sun, 21 Sep 2014 23:16:07 -0000</lastBuildDate><atom:link href="https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/feed" rel="self" type="application/rss+xml"/><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v7
+++ v8
@@ -6,7 +6,8 @@
   * UNIX; 
   * Python 2.7 and higher; 
   * R 2.11 and higher; 
-  * GCC 
+  * GCC;
+  * ImageMagick (optional); required only if you want a single combined QC PDF report

 &lt;h3&gt;RseqFlow Installation and Configuration&lt;/h3&gt;

@@ -15,40 +16,40 @@
 Step 2: Enter your specified directory: 

        bash-3.2$ cd /home/user/rseqflow
-&amp;lt;BR&amp;gt;
+

 Step 3: Extract the tar file: 

        bash-3.2$ tar -xvf RseqFlow_source.tar.gz 
-&amp;lt;BR&amp;gt;
+

 Step 4: Enter the directory: 

        bash-3.2$ cd /home/user/rseqflow/RseqFlow_source
-&amp;lt;BR&amp;gt;
+

 Step 5: Set up some tools: 

        bash-3.2$ ./make.sh 
-&amp;lt;BR&amp;gt;
+

 Step 6: Set up PATH so that the system knows where to find the executable files. Running step 7 afterwards is recommended because it makes these changes permanent; otherwise you will need to run 'source ./configure.sh' in each new terminal window before running RseqFlow. 

        bash-3.2$ source ./configure.sh 
-&amp;lt;BR&amp;gt;
+

 Step 7: This step will make the changes to your PATH and PYTHONPATH variables from step 6 permanent. If you choose not to do this step, you will need to run 'source ./configure.sh' each time you run RseqFlow from a new terminal window. To make the changes permanent, copy the commands in 'configure.sh' into your bash startup file, either manually or with the following command. **Please make sure the command contains "&amp;gt;&amp;gt;", not "&amp;gt;"; otherwise, you will overwrite your original bash file!**

        bash-3.2$ cat configure.sh &amp;gt;&amp;gt; /home/user/.bashrc
-&amp;lt;BR&amp;gt;
+
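The append in step 7 can be made safer to repeat. The following is a minimal sketch, not part of RseqFlow: `append_once` is a hypothetical helper that keeps a backup of the startup file and skips the append if it has already been done.

```shell
# Hypothetical helper (not part of RseqFlow): append a settings file to a
# shell startup file with ">>", keeping a backup and skipping repeat runs.
append_once() {
    src=$1      # e.g. configure.sh
    target=$2   # e.g. /home/user/.bashrc
    marker="# added from $src"

    # Create the target if it does not exist yet.
    touch "$target"

    # Do nothing if an earlier run already appended the block.
    if grep -qxF -- "$marker" "$target"; then
        echo "already present in $target"
        return 0
    fi

    cp "$target" "$target.bak"      # keep a copy of the original
    echo "$marker" >> "$target"     # ">>" appends; ">" would overwrite
    cat "$src" >> "$target"
    echo "appended $src to $target"
}
```

For the paths used in this manual, the call would be `append_once configure.sh /home/user/.bashrc`.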

 Step 8: If your system has multiple python versions, make sure you use version 2.7 or higher. Run 'pythonCompilerSet.sh' to set the proper python header in each of the python scripts. The example below assumes the path of the python executable is '/home/user/python2.7'. 

        bash-3.2$ ./pythonCompilerSet.sh -p /home/user/python2.7/python
-&amp;lt;BR&amp;gt;
+
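Before running 'pythonCompilerSet.sh' it can help to confirm that the chosen interpreter meets the 2.7 requirement. This is a rough sketch: both function names are hypothetical, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
# Hypothetical helpers (not part of RseqFlow). Assumes `sort -V` is available.
version_at_least() {
    # Succeeds when version $1 is at least version $2.
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

python_is_new_enough() {
    # $1 is the interpreter to test, e.g. /home/user/python2.7/python
    found=$("$1" -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    version_at_least "$found" 2.7
}
```

With the example path from step 8, `python_is_new_enough /home/user/python2.7/python` returns success only when that interpreter reports version 2.7 or higher.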

 Step 9: Now you can run the corresponding shell scripts for each branch. 
-&amp;lt;BR&amp;gt;
+

 # 2\. Branch usage 

@@ -159,28 +160,28 @@
     QC_SNP.sh -f Reads.fastq -o OutputPrefix -p --cleanup

 _This command will give pre-alignment quality analysis only, based on the RNA-Seq dataset (Reads.fastq), and will delete temporary files._
-&amp;lt;BR&amp;gt;  
+

     QC_SNP.sh -f Reads.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q --ribo rRNA.fa

 _This command will align Reads.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and ribosomal RNA (rRNA.fa) and implement the pre- and post-alignment quality control analyses. It will also output the ribosomal RNA alignment report._
-&amp;lt;BR&amp;gt;   
+  

     QC_SNP.sh -a Anno.gtf -o OutputPrefix -q --gSam AlignedToGenome.sam --tSam AlignedToTran.sam

 _This command will give post-alignment quality analysis based on the user-specified SAM alignment files for the genome (AlignedToGenome.sam) and transcriptome (AlignedToTran.sam)._
-&amp;lt;BR&amp;gt;   
+   

     QC_SNP.sh -1 Read1.fastq -2 Read2.fastq -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q -s

 _This command will align the paired end RNA-Seq datasets Read1.fastq and Read2.fastq to the genome (RefGene.fa) and transcriptome (RefTran.fa) and implement the pre and post alignment quality control analyses and SNP calling._
-&amp;lt;BR&amp;gt;    
+   

     QC_SNP.sh -1 Read1.fastq.gz -2 Read2.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -s --mito chrM.fa

 _This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and mitochondrial chromosome (chrM.fa) and implement SNP calling based on the merged genome and transcriptome alignments._

-&amp;lt;BR&amp;gt;
+
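Because the QC_SNP.sh examples above take several input files, a quick pre-flight check can catch a missing or empty file before a long run starts. The sketch below is an illustration only; `require_files` is a hypothetical helper, not an RseqFlow command.

```shell
# Hypothetical pre-flight check (not part of RseqFlow): report every input
# file that is missing or empty, and fail if any was found.
require_files() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return $status
}
```

A paired end run could then be guarded with `if require_files Read1.fastq Read2.fastq RefGene.fa RefTran.fa Anno.gtf; then QC_SNP.sh -1 Read1.fastq -2 Read2.fastq -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q -s; fi`.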
 ## Branch 2: Expression Level Estimation (ExpressionEstimation.sh)
 ExpressionEstimation.sh will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datasets. Beginning with version 2.1, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used, which may or may not have been unique. 

@@ -250,12 +251,12 @@
     ExpressionEstimation.sh -f Reads.fastq -c RefTran.fa -a Anno.gtf -o OutputPrefix --cleanup

 _This command will align the single end Reads.fastq to the transcriptome reference sequence (RefTran.fa) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted._
-&amp;lt;BR&amp;gt;   
+  

     ExpressionEstimation.sh -a Anno.gtf -o OutputPrefix --tSam AlignedToTran.sam

 _This command uses the supplied .sam file that has already been aligned to the transcriptome. RseqFlow will implement the expression level estimation from the distinct best alignment for each read._
-&amp;lt;BR&amp;gt;
+

 ## Branch 3: Differentially Expressed Gene Identification (DE.sh)
 The "de" command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from ExpressionEstimation.sh. If both conditions have only one sample, the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input. Otherwise, if either condition has more than one sample, the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input. 
@@ -304,12 +305,12 @@
     DE.sh de --c1 disease --c2 normal --f1 disease_S1_whole_GeneExpressionLevel_unique.txt,disease_S2_whole_GeneExpressionLevel_unique.txt --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease

 _In this case, there are two samples with the disease condition, so the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
-&amp;lt;BR&amp;gt;   
+  

     DE.sh de --c1 Brain --c2 Colon --f1 Brain_whole_ExonExpressionLevel_unique.txt --f2 Colon_whole_ExonExpressionLevel_unique.txt -o TissueCompare

 _In this case, both conditions have only a single sample each, so the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
-&amp;lt;BR&amp;gt;
+
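The two DE.sh examples above differ in which expression files they use and in how the file list is written. The sketch below captures both rules; the helper names are hypothetical and not part of RseqFlow. The list is joined without spaces because a space after a comma would make the shell pass the next file name as a separate argument.

```shell
# Hypothetical helpers (not part of RseqFlow).

# Join file names with commas and no spaces, for use with --f1 or --f2.
join_by_comma() {
    joined=""
    for name in "$@"; do
        if [ -z "$joined" ]; then
            joined=$name
        else
            joined="$joined,$name"
        fi
    done
    printf '%s\n' "$joined"
}

# Pick the expression file type from the sample counts of the two conditions:
# exon-level files only when both conditions have exactly one sample.
de_input_type() {
    case "$1,$2" in
        1,1) echo "ExonExpressionLevel_unique" ;;
        *)   echo "GeneExpressionLevel_unique" ;;
    esac
}
```

For the HeartDisease example, `de_input_type 2 1` prints `GeneExpressionLevel_unique`, and `join_by_comma` applied to the two disease files gives the value passed to `--f1`.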

 ## Branch 4: Alignment File Format Conversion (FileFormatConversion.sh)
 FileFormatConversion.sh will convert between alignment file formats for storage or visualization convenience. It can convert SAM to BAM, MRF and WIG, BAM to BED, and MRF to WIG. 
@@ -366,17 +367,17 @@
     FileFormatConversion.sh -i in.sam -o out -b -r ref.fa

 _This command will convert the input in.sam file to a bam file. If the sam file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required._
-&amp;lt;BR&amp;gt;   
+   

     FileFormatConversion.sh -i in.sam -o out -m

 _This command will convert the input in.sam file to an MRF file._
-&amp;lt;BR&amp;gt; 
+

     FileFormatConversion.sh -i in.bam -o out -d

 _This command will convert the input in.bam file to a BED file for visualization._
-&amp;lt;BR&amp;gt;
+
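Whether the reference file is needed for SAM to BAM conversion depends on the "@SQ" header lines mentioned above. A small sketch of that check follows; both function names are hypothetical helpers, not part of RseqFlow.

```shell
# Hypothetical helper (not part of RseqFlow): succeed when the SAM file
# already carries at least one "@SQ" header line.
sam_has_sq_header() {
    grep -q '^@SQ' "$1"
}

# Print the extra option to pass for this SAM file: nothing when the
# header is present, otherwise "-r" followed by the reference file ($2).
reference_option() {
    if sam_has_sq_header "$1"; then
        printf ''
    else
        echo "-r $2"
    fi
}
```

For example, `FileFormatConversion.sh -i in.sam -o out -b $(reference_option in.sam ref.fa)` adds `-r ref.fa` only when in.sam lacks "@SQ" lines.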

 # 3\. Input files and formats
 ## Possible input files for QC_SNP.sh (Depends on the selected options):
@@ -388,19 +389,19 @@
   * Alignment files in SAM format 
   * Reference sequences of Mitochondria 
   * Reference sequences of Ribosomal RNA 
-&amp;lt;BR&amp;gt;
+
 ## Possible input files for ExpressionEstimation.sh (Depends on the selected options):

   * Genome annotation GTF file 
   * Transcriptome reference sequences 
   * RNA-Seq fastq or fastq.gz file 
   * Alignment files in SAM format 
-&amp;lt;BR&amp;gt;
+
 ## Possible input files for DE.sh:
   * Output files from ExpressionEstimation.sh: 
        * whole_GeneExpressionLevel_unique.txt
        * whole_ExonExpressionLevel_unique.txt files
-&amp;lt;BR&amp;gt;
+
 ## Format specification of input files

 There should be three separate files: one for Transcriptome reference sequences, one for Genome reference sequences and one for the Genome Annotation file. RseqFlow will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the RseqFlow pipeline. 
@@ -415,7 +416,7 @@
 For example:

 &amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none
-&amp;lt;BR&amp;gt;
+
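The example header above follows the pattern Name=Chromosome:Start-End, optionally followed by a space and extra information. The sketch below lists headers in a transcriptome FASTA file that do not fit that pattern. It is an assumption-laden check: `bad_transcript_headers` is a hypothetical helper, and the regular expression tests only the `=Chromosome:Start-End` part, not the naming of the genome or annotation source.

```shell
# Hypothetical check (not part of RseqFlow): print transcript headers that
# do not match  >Name=Chromosome:Start-End [extra info].
# Returns a non-zero status when every header matches (nothing printed).
bad_transcript_headers() {
    grep '^>' "$1" | grep -vE '^>[^= ]+=[^: ]+:[0-9]+-[0-9]+( .*)?$'
}
```

Running `bad_transcript_headers RefTran.fa` prints nothing when every header fits the pattern.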

   * **Genome Reference Sequences**

@@ -428,13 +429,13 @@
 &amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF

 &amp;gt;chrM
-&amp;lt;BR&amp;gt;
+

   * **Genome Annotation**

 The Genome Annotation GTF file must be in format GTF3.0.
-&amp;lt;BR&amp;gt;
-&amp;lt;BR&amp;gt;
+
+

 # 4\. Sample datasets
 ## c. elegans
@@ -444,7 +445,7 @@
 [Reference genome sequences file](http://sourceforge.net/projects/rseqflow/files/genome_c.elegans.fa.gz/download)
 [Reference annotation gtf file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_anno.gtf/download)
 [Reference transcriptome sequences file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_seq.fa/download)
-&amp;lt;BR&amp;gt;
+

 &lt;h5&gt;Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:&lt;/h5&gt;

@@ -452,17 +453,17 @@

     mkdir c.elegans_Output

-&amp;lt;BR&amp;gt;
+
 i) Branch 1: QC_SNP.sh 

      QC_SNP.sh -f SRR514378.fastq -g genome_c.elegans.fa -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/QC_c.elegans -p -q -s

-&amp;lt;BR&amp;gt;
+
 ii) Branch 2: ExpressionEstimation.sh

      ExpressionEstimation.sh -f SRR514378.fastq -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/ExpressionEstimation_c.elegans

-&amp;lt;BR&amp;gt;
+
 iii) Branch 4: FileFormatConversion.sh

      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -r genome_c.elegans.fa -b -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
@@ -473,7 +474,7 @@

      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.bam -d -o c.elegans_Output/QC_c.elegans_Bowtie2_genome

-&amp;lt;BR&amp;gt;
+
 ## Human

 &lt;h5&gt;Download the following datasets:&lt;/h5&gt;
@@ -482,14 +483,14 @@
 [Mitochondria sequence file](http://sourceforge.net/projects/rseqflow/files/Human_chrM.fa/download)
 [Reference annotation gtf file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_anno.gtf.gz/download)
 [Reference transcriptome sequences file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_transcripts.fa.gz/download)
-&amp;lt;BR&amp;gt;
+

 &lt;h5&gt;Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:&lt;/h5&gt;

 Before running the pipeline, create a new directory for output results: 

      mkdir Human_Output
-&amp;lt;BR&amp;gt;
+

 Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. 

@@ -497,17 +498,17 @@
      gunzip Human_gencodeV14_transcripts.fa.gz
      gunzip Human_genome_GRCh37.fa.gz

-&amp;lt;BR&amp;gt;
+
  i) Branch 1: QC_SNP.sh

      QC_SNP.sh -f ERR030893_2M.fastq.gz -g Human_genome_GRCh37.fa -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf --mito Human_chrM.fa -o Human_Output/QC_human -p -q -s

-&amp;lt;BR&amp;gt;
+
  ii) Branch 2: ExpressionEstimation.sh

      ExpressionEstimation.sh -f ERR030893_2M.fastq.gz -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf -o Human_Output/ExpressionEstimation_human

-&amp;lt;BR&amp;gt;
+
  iii) Branch 4: FileFormatConversion.sh

      FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.sam -r  Human_genome_GRCh37.fa -b -o Human_Output/QC_human_Bowtie2_genome
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Sun, 21 Sep 2014 23:16:07 -0000</pubDate><guid>https://sourceforge.net8753a600d97cfa98e6d02cb8243b94fdcd04ae79</guid></item><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v6
+++ v7
@@ -410,7 +410,7 @@

 The transcript names must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome end location as long as there is a space separating the chromosome end from the extra info. 

-GenomeName_AnnotationSource_TranscriptsID=Chromosome:Start-End [extra info]
+GenomeName_AnnotationSource_TranscriptsID=Chromosome:Start-End Extra Info

 For example:

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Fri, 31 Jan 2014 20:22:29 -0000</pubDate><guid>https://sourceforge.nete09a3e7bca42c3de8163cfe00a279262903f6f7b</guid></item><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v5
+++ v6
@@ -1,31 +1,14 @@
-# Manual of RseqFlow Unix Run Mode
-
-  * Manual of RseqFlow Unix Run Mode
-    * 1\. Package installation and configuration
-&lt;br /&gt;
-    * 2\. Branch usage
-        * Branch 1: QC_SNP.sh
-        * Branch 2: ExpressionEstimation.sh
-        * Branch 3: DE.sh
-        * Branch 4: FileFormatConversion.sh
-    * 3\. Input files and formats
-        * Possible input files for QC_SNP.sh (Depends on the selected options)
-        * Possible input files for ExpressionEstimation.sh (Depends on the selected options)
-        * Possible input files for DE.sh
-        * Format specification of input files
-    * 4\. Sample datasets
-        * c. elegans
-        * Human
-
-## 1\. Package installation and configuration
-
+&lt;h1&gt;Manual of RseqFlow Unix Run Mode&lt;/h1&gt;
+[TOC]
+# 1\. Package installation and configuration
 The following packages must be pre-installed:
-    1. UNIX; 
-    2. Python 2.7 and higher; 
-    3. R 2.11 and higher; 
-    4. GCC 
-
-### RseqFlow Installation and Configuration
+
+  * UNIX; 
+  * Python 2.7 and higher; 
+  * R 2.11 and higher; 
+  * GCC 
+
+&lt;h3&gt;RseqFlow Installation and Configuration&lt;/h3&gt;

 Step 1: Download the [source code](http://sourceforge.net/projects/rseqflow/files/rseqflow2-v2.1.tar.gz/download) to your directory, e.g. '/home/user/rseqflow'. 

@@ -66,16 +49,10 @@

 Step 9: Now you can run the corresponding shell scripts for each branch. 
 &lt;br /&gt;
-## 2\. Branch usage
-
-The pipeline consists of four branches: 
-
-  * Branch 1: Quality Control and SNP calling 
-  * Branch 2: Expression level estimation 
-  * Branch 3: Differentially expressed gene identification 
-  * Branch 4: Alignment file format conversion 
-
-### Branch 1: QC_SNP.sh
+
+# 2\. Branch usage 
+
+## Branch 1: Quality Control and SNP calling (QC_SNP.sh)

 QC_SNP.sh will implement quality control and/or SNP calling analysis for single end and paired end RNA-Seq datasets. 

@@ -174,7 +151,7 @@
 
 

-#### Examples:
+&lt;h4&gt;Examples:&lt;/h4&gt;

 Examples for running QC_SNP.sh:

@@ -182,18 +159,18 @@
     QC_SNP.sh -f Reads.fastq -o OutputPrefix -p --cleanup

 _This command will give pre-alignment quality analysis only, based on the RNA-Seq dataset (Reads.fastq), and will delete temporary files._
-&lt;br /&gt;    
-
+&lt;br /&gt;  
+  
     QC_SNP.sh -f Reads.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q --ribo rRNA.fa

 _This command will align Reads.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and ribosomal RNA (rRNA.fa) and implement the pre- and post-alignment quality control analyses. It will also output the ribosomal RNA alignment report._
-&lt;br /&gt;    
-
+&lt;br /&gt;   
+ 
     QC_SNP.sh -a Anno.gtf -o OutputPrefix -q --gSam AlignedToGenome.sam --tSam AlignedToTran.sam

 _This command will give post-alignment quality analysis based on the user-specified SAM alignment files for the genome (AlignedToGenome.sam) and transcriptome (AlignedToTran.sam)._
-&lt;br /&gt;    
-
+&lt;br /&gt;   
+ 
     QC_SNP.sh -1 Read1.fastq -2 Read2.fastq -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q -s

 _This command will align the paired end RNA-Seq datasets Read1.fastq and Read2.fastq to the genome (RefGene.fa) and transcriptome (RefTran.fa) and implement the pre and post alignment quality control analyses and SNP calling._
@@ -204,8 +181,7 @@
 _This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and mitochondrial chromosome (chrM.fa) and implement SNP calling based on the merged genome and transcriptome alignments._

 &lt;br /&gt;
-### Branch 2: ExpressionEstimation.sh
-
+## Branch 2: Expression Level Estimation (ExpressionEstimation.sh)
 ExpressionEstimation.sh will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datasets. Beginning with version 2.1, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used, which may or may not have been unique. 

 If bowtie2 indexes already exist in the same directory as the transcriptome reference file, RseqFlow will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference file for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the transcriptome reference. This will bypass index creation for future runs and reduce run times. 
@@ -267,22 +243,21 @@
 

-#### Examples:
+&lt;h4&gt;Examples:&lt;/h4&gt;

 Examples for running ExpressionEstimation.sh:

     ExpressionEstimation.sh -f Reads.fastq -c RefTran.fa -a Anno.gtf -o OutputPrefix --cleanup

 _This command will align the single end Reads.fastq to the transcriptome reference sequence (RefTran.fa) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted._
-&lt;br /&gt;    
-
+&lt;br /&gt;   
+ 
     ExpressionEstimation.sh -a Anno.gtf -o OutputPrefix --tSam AlignedToTran.sam

 _This command uses the supplied .sam file that has already been aligned to the transcriptome. RseqFlow will implement the expression level estimation from the distinct best alignment for each read._
 &lt;br /&gt;

-### Branch 3: DE.sh de
-
+## Branch 3: Differentially Expressed Gene Identification (DE.sh)
 The "de" command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from ExpressionEstimation.sh. If both conditions have only one sample, the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input. Otherwise, if either condition has more than one sample, the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input. 

 &lt;h4&gt;Options for DE.sh de:&lt;/h4&gt;
@@ -305,12 +280,12 @@
 &lt;tr&gt;
   &lt;td&gt;-1/--f1&lt;/td&gt;
   &lt;td&gt; &lt;/td&gt;
-  &lt;td&gt;Comma separated file name list for condition1. The files should be the Gene/Exon expression file(s) for condition1*&lt;/td&gt;
+  &lt;td&gt;Comma separated file name list for condition1. The files should be the Gene/Exon expression file(s) for condition1&lt;sup&gt;*&lt;/sup&gt;&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;-2/--f2&lt;/td&gt;
   &lt;td&gt; &lt;/td&gt;
-  &lt;td&gt;Comma separated file name list for condition2. The files should be the Gene/Exon expression file(s) for condition2*&lt;/td&gt;
+  &lt;td&gt;Comma separated file name list for condition2. The files should be the Gene/Exon expression file(s) for condition2&lt;sup&gt;*&lt;/sup&gt;&lt;/td&gt;
 &lt;/tr&gt;
 &lt;tr&gt;
   &lt;td&gt;-o/--output-prefix&lt;/td&gt;
@@ -320,24 +295,23 @@
 

-\* If both conditions have only one sample each, use the ExonExpressionLevel_unique files from ExpressionEstimation.sh as input. Otherwise, if both conditions have more than one sample each, use the GeneExpressionLevel_unique files from ExpressionEstimation.sh as input. 
-
-#### Examples:
+&lt;sup&gt;*&lt;/sup&gt; If both conditions have only one sample each, use the ExonExpressionLevel_unique files from ExpressionEstimation.sh as input. Otherwise, if either condition has more than one sample, use the GeneExpressionLevel_unique files from ExpressionEstimation.sh as input. 
+
+&lt;h4&gt;Examples:&lt;/h4&gt;

 Examples for running DE.sh de:

     DE.sh de --c1 disease --c2 normal --f1 disease_S1_whole_GeneExpressionLevel_unique.txt,disease_S2_whole_GeneExpressionLevel_unique.txt --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease

 _In this case, there are two samples with the disease condition, so the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
-&lt;br /&gt;  
-  
+&lt;br /&gt;   
+
     DE.sh de --c1 Brain --c2 Colon --f1 Brain_whole_ExonExpressionLevel_unique.txt --f2 Colon_whole_ExonExpressionLevel_unique.txt -o TissueCompare

 _In this case, both conditions have only a single sample each, so the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
 &lt;br /&gt;

-### Branch 4: FileFormatConversion.sh
-
+## Branch 4: Alignment File Format Conversion (FileFormatConversion.sh)
 FileFormatConversion.sh will convert between alignment file formats for storage or visualization convenience. It can convert SAM to BAM, MRF and WIG, BAM to BED, and MRF to WIG. 

 &lt;h4&gt;Options:&lt;/h4&gt;
@@ -385,28 +359,27 @@
 

-#### Examples:
+&lt;h4&gt;Examples:&lt;/h4&gt;

 Examples for running FileFormatConversion.sh:

     FileFormatConversion.sh -i in.sam -o out -b -r ref.fa

 _This command will convert the input in.sam file to a bam file. If the sam file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required._
-&lt;br /&gt;    
-
+&lt;br /&gt;   
+ 
     FileFormatConversion.sh -i in.sam -o out -m

 _This command will convert the input in.sam file to an MRF file._
-&lt;br /&gt;
-    
+&lt;br /&gt; 
+   
     FileFormatConversion.sh -i in.bam -o out -d

 _This command will convert the input in.bam file to a BED file for visualization._
 &lt;br /&gt;

-## 3\. Input files and formats
-
-#### Possible input files for QC_SNP.sh (Depends on the selected options)
+# 3\. Input files and formats
+## Possible input files for QC_SNP.sh (Depends on the selected options):

   * Genome annotation GTF file 
   * Transcriptome reference sequences 
@@ -415,21 +388,20 @@
   * Alignment files in SAM format 
   * Reference sequences of Mitochondria 
   * Reference sequences of Ribosomal RNA 
-
-#### Possible input files for ExpressionEstimation.sh (Depends on the selected options)
+&lt;br /&gt;
+## Possible input files for ExpressionEstimation.sh (Depends on the selected options):

   * Genome annotation GTF file 
   * Transcriptome reference sequences 
   * RNA-Seq fastq or fastq.gz file 
   * Alignment files in SAM format 
-
-#### Possible input files for DE.sh
-
+&lt;br /&gt;
+## Possible input files for DE.sh:
   * Output files from ExpressionEstimation.sh: 
        * whole_GeneExpressionLevel_unique.txt
-       * whole_ExonExpressionLevel_unique.txt files. 
-
-#### Format specification of input files
+       * whole_ExonExpressionLevel_unique.txt files
+&lt;br /&gt;
+## Format specification of input files

 There should be three separate files: one for Transcriptome reference sequences, one for Genome reference sequences and one for the Genome Annotation file. RseqFlow will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the RseqFlow pipeline. 

@@ -462,20 +434,19 @@

 The Genome Annotation GTF file must be in format GTF3.0.
 &lt;br /&gt;
-
-## 4\. Sample datasets
-
-### 1. c. elegans
-
-##### Download the following datasets:
-
+&lt;br /&gt;
+
+# 4\. Sample datasets
+## c. elegans
+
+&lt;h5&gt;Download the following datasets:&lt;/h5&gt;
 [RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/SRR514378.fastq.gz/download)
 [Reference genome sequences file](http://sourceforge.net/projects/rseqflow/files/genome_c.elegans.fa.gz/download)
 [Reference annotation gtf file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_anno.gtf/download)
 [Reference transcriptome sequences file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_seq.fa/download)
 &lt;br /&gt;

-##### Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:
+&lt;h5&gt;Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:&lt;/h5&gt;

 Before running the pipeline, create a new directory for output results: 

@@ -503,10 +474,9 @@
      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.bam -d -o c.elegans_Output/QC_c.elegans_Bowtie2_genome

 &lt;br /&gt;
-### 2. Human
-
-##### Download the following datasets:
-
+## Human
+
+&lt;h5&gt;Download the following datasets:&lt;/h5&gt;
 [RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/ERR030893_2M.fastq.gz/download)
 [Reference genome sequences file](http://genomics.isi.edu/downloads/Human_genome_GRCh37_69.fa.gz)
 [Mitochondria sequence file](http://sourceforge.net/projects/rseqflow/files/Human_chrM.fa/download)
@@ -514,7 +484,7 @@
 [Reference transcriptome sequences file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_transcripts.fa.gz/download)
 &lt;br /&gt;

-##### Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:
+&lt;h5&gt;Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:&lt;/h5&gt;

 Before running the pipeline, create a new directory for output results: 

&lt;/tr&gt;&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Thu, 30 Jan 2014 00:32:04 -0000</pubDate><guid>https://sourceforge.netb13b3e73e45b555a34c5742101a533144d69a932</guid></item><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v4
+++ v5
@@ -322,19 +322,19 @@

 \* If both conditions have only one sample each, use the ExonExpressionLevel_unique files from ExpressionEstimation.sh as input. Otherwise, if either condition has more than one sample, use the GeneExpressionLevel_unique files from ExpressionEstimation.sh as input. 

-  * **Examples**
+#### Examples:

 Examples for running DE.sh de:

     DE.sh de --c1 disease --c2 normal --f1 disease_S1_whole_GeneExpressionLevel_unique.txt,disease_S2_whole_GeneExpressionLevel_unique.txt --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease

 _In this case, there are two samples with the disease condition, so the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
-  
+&lt;br /&gt;  

     DE.sh de --c1 Brain --c2 Colon --f1 Brain_whole_ExonExpressionLevel_unique.txt --f2 Colon_whole_ExonExpressionLevel_unique.txt -o TissueCompare

 _In this case, both conditions have only a single sample each, so the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
-
+&lt;br /&gt;

 ### Branch 4: FileFormatConversion.sh

@@ -385,23 +385,24 @@
 

-  * **Examples**
-    
+#### Examples:
+ 
 Examples for running FileFormatConversion.sh:

     FileFormatConversion.sh -i in.sam -o out -b -r ref.fa

 _This command will convert the input in.sam file to a bam file. If the sam file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required._
-    
+&lt;br /&gt;    

     FileFormatConversion.sh -i in.sam -o out -m

 _This command will convert the input in.sam file to an MRF file._
-
+&lt;br /&gt;

     FileFormatConversion.sh -i in.bam -o out -d

 _This command will convert the input in.bam file to a BED file for visualization._
+&lt;br /&gt;

 ## 3\. Input files and formats

@@ -424,7 +425,9 @@

 #### Possible input files for DE.sh

-  * Output files from ExpressionEstimation.sh: whole_GeneExpressionLevel_unique.txt or whole_ExonExpressionLevel_unique.txt files. 
+  * Output files from ExpressionEstimation.sh: 
+       * whole_GeneExpressionLevel_unique.txt
+       * whole_ExonExpressionLevel_unique.txt files. 

 #### Format specification of input files

@@ -440,7 +443,7 @@
 For example:

 &amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none
-
+&lt;br /&gt;

   * **Genome Reference Sequences**

@@ -453,39 +456,42 @@
 &amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF

 &amp;gt;chrM
-
+&lt;br /&gt;

   * **Genome Annotation**

 The Genome Annotation GTF file must be in format GTF3.0.
-
+&lt;br /&gt;

 ## 4\. Sample datasets

 ### 1. c. elegans

-* **Download the following datasets:**
+##### Download the following datasets:

 [RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/SRR514378.fastq.gz/download)
 [Reference genome sequences file](http://sourceforge.net/projects/rseqflow/files/genome_c.elegans.fa.gz/download)
 [Reference annotation gtf file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_anno.gtf/download)
 [Reference transcriptome sequences file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_seq.fa/download)
-
-&lt;br /&gt;
-* **Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:** 
+&lt;br /&gt;
+
+##### Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:

 Before running the pipeline, create a new directory for output results: 

     mkdir c.elegans_Output

+&lt;br /&gt;
 i) Branch 1: QC_SNP.sh 

      QC_SNP.sh -f SRR514378.fastq -g genome_c.elegans.fa -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/QC_c.elegans -p -q -s

+&lt;br /&gt;
 ii) Branch 2: ExpressionEstimation.sh

      ExpressionEstimation.sh -f SRR514378.fastq -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/ExpressionEstimation_c.elegans

+&lt;br /&gt;
 iii) Branch 4: FileFormatConversion.sh

      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -r genome_c.elegans.fa -b -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
@@ -496,23 +502,24 @@

      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.bam -d -o c.elegans_Output/QC_c.elegans_Bowtie2_genome

-
+&lt;br /&gt;
 ### 2. Human

-* **Download the following datasets:**
+##### Download the following datasets:

 [RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/ERR030893_2M.fastq.gz/download)
 [Reference genome sequences file](http://genomics.isi.edu/downloads/Human_genome_GRCh37_69.fa.gz)
 [Mitochondria sequence file](http://sourceforge.net/projects/rseqflow/files/Human_chrM.fa/download)
 [Reference annotation gtf file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_anno.gtf.gz/download)
 [Reference transcriptome sequences file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_transcripts.fa.gz/download)
-
-&lt;br /&gt;
-* **Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:** 
+&lt;br /&gt;
+
+##### Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:

 Before running the pipeline, create a new directory for output results: 

      mkdir Human_Output
+&lt;br /&gt;

 Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. 

@@ -520,14 +527,17 @@
      gunzip Human_gencodeV14_transcripts.fa.gz
      gunzip Human_genome_GRCh37.fa.gz

+&lt;br /&gt;
  i) Branch 1: QC_SNP.sh

      QC_SNP.sh -f ERR030893_2M.fastq.gz -g Human_genome_GRCh37.fa -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf --mito Human_chrM.fa -o Human_Output/QC_human -p -q -s

+&lt;br /&gt;
  ii) Branch 2: ExpressionEstimation.sh

      ExpressionEstimation.sh -f ERR030893_2M.fastq.gz -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf -o Human_Output/ExpressionEstimation_human

+&lt;br /&gt;
  iii) Branch 4: FileFormatConversion.sh

      FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.sam -r  Human_genome_GRCh37.fa -b -o Human_Output/QC_human_Bowtie2_genome
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Wed, 29 Jan 2014 22:34:44 -0000</pubDate><guid>https://sourceforge.net2ee37e86870c96b2550f6855675aa6c34070af08</guid></item><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v3
+++ v4
@@ -2,6 +2,7 @@

   * Manual of RseqFlow Unix Run Mode
     * 1\. Package installation and configuration
+&lt;br /&gt;
     * 2\. Branch usage
         * Branch 1: QC_SNP.sh
         * Branch 2: ExpressionEstimation.sh
@@ -18,51 +19,53 @@

 ## 1\. Package installation and configuration

-  * **Pre-install run environment**
+The following packages must be pre-installed:
     1. UNIX; 
     2. Python 2.7 and higher; 
     3. R 2.11 and higher; 
     4. GCC 

+### RseqFlow Installation and Configuration
+
 Step 1: Download the [source code](http://sourceforge.net/projects/rseqflow/files/rseqflow2-v2.1.tar.gz/download) to your directory, e.g. '/home/user/rseqflow'. 

 Step 2: Enter your specified directory: 

        bash-3.2$ cd /home/user/rseqflow
-
+&lt;br /&gt;

 Step 3: Extract the tar file: 

        bash-3.2$ tar -xvf RseqFlow_source.tar.gz 
-
+&lt;br /&gt;

 Step 4: Enter the directory: 

        bash-3.2$ cd /home/user/rseqflow/RseqFlow_source
-
+&lt;br /&gt;

 Step 5: Set up some tools: 

        bash-3.2$ ./make.sh 
-
+&lt;br /&gt;

 Step 6: Set up PATH so that the system can find the executable files. Running step 7 afterwards is recommended to make these changes permanent; otherwise you will need to run 'source ./configure.sh' each time you run RseqFlow from a new terminal window. 

        bash-3.2$ source ./configure.sh 
-
+&lt;br /&gt;

 Step 7: This step makes the changes to your PATH and PYTHONPATH variables from step 6 permanent. If you skip it, you will need to run configure.sh each time you run RseqFlow. To make the changes permanent, copy the commands in 'configure.sh' into your bash file (.bashrc in the example below), either manually or with the following command. **Make sure the command contains "&amp;gt;&amp;gt;", not "&amp;gt;"; otherwise you will overwrite your original bash file!**

        bash-3.2$ cat configure.sh &gt;&gt; /home/user/.bashrc
-
+&lt;br /&gt;
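
The warning in step 7 can be made concrete with a small guard. The sketch below is an assumption and not part of RseqFlow: the function name `append_once` and the marker comment are hypothetical. It appends one file to another with `>>` and skips the append when the marker is already present, so running it twice does not duplicate the lines.

```shell
# Hypothetical helper (not shipped with RseqFlow): append the contents of
# one file to another exactly once, using ">>" so existing lines are kept.
append_once() {
    src=$1
    dest=$2
    marker="# added from RseqFlow configure.sh"   # hypothetical marker text

    # If the marker is already in the destination, do nothing.
    if grep -qF "$marker" "$dest" 2>/dev/null; then
        return 0
    fi

    # ">>" appends; a single ">" here would replace the whole file.
    { printf '%s\n' "$marker"; cat "$src"; } >> "$dest"
}
```

With this in place, `append_once configure.sh /home/user/.bashrc` would have the same effect as the command above on the first call and no effect on later calls.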

 Step 8: If your system has multiple Python versions, make sure you use version 2.7 or higher. Run 'pythonCompilerSet.sh' to set the proper Python header in each of the Python scripts. The example below assumes the Python executable is '/home/user/python2.7/python'. 

        bash-3.2$ ./pythonCompilerSet.sh -p /home/user/python2.7/python
-
+&lt;br /&gt;
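
Before pointing pythonCompilerSet.sh at an interpreter, it can help to confirm that the interpreter meets the 2.7 requirement. The following is a minimal sketch under stated assumptions: the function name `version_at_least_2_7` is hypothetical, and it only compares the major and minor numbers of a version string that you pass in.

```shell
# Hypothetical helper: succeed when a version string such as "2.7.18"
# is 2.7 or higher. Only the major and minor numbers are compared.
version_at_least_2_7() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot, if any

    if [ "$major" -gt 2 ]; then
        return 0
    fi
    if [ "$major" -eq 2 ]; then
        if [ "$minor" -ge 7 ]; then
            return 0
        fi
    fi
    return 1
}
```

One way to obtain the version string is `/home/user/python2.7/python -c 'import sys; print("%d.%d" % sys.version_info[:2])'`, whose output can be passed to the function as its only argument.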

 Step 9: Now you can run the corresponding shell scripts for each branch. 
-
+&lt;br /&gt;
 ## 2\. Branch usage

 The pipeline consists of four branches: 
@@ -171,7 +174,7 @@
 
 

-  * **Examples**
+#### Examples:

 Examples for running QC_SNP.sh:

@@ -179,28 +182,28 @@
     QC_SNP.sh -f Reads.fastq -o OutputPrefix -p --cleanup

 _This command will perform only the pre-alignment quality analysis, based on the RNA-Seq dataset (Reads.fastq), and will delete temporary files._
-    
+&lt;br /&gt;    

     QC_SNP.sh -f Reads.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q --ribo rRNA.fa

 _This command will align Reads.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and ribosomal RNA (rRNA.fa) and perform the pre- and post-alignment quality control analyses. It will also output the ribosomal RNA alignment report._
-    
+&lt;br /&gt;    

     QC_SNP.sh -a Anno.gtf -o OutputPrefix -q --gSam AlignedToGenome.sam --tSam AlignedToTran.sam

 _This command will perform post-alignment quality analysis based on the user-specified SAM alignment files for the genome (AlignedToGenome.sam) and transcriptome (AlignedToTran.sam)._
-    
+&lt;br /&gt;    

     QC_SNP.sh -1 Read1.fastq -2 Read2.fastq -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q -s

 _This command will align the paired end RNA-Seq datasets Read1.fastq and Read2.fastq to the genome (RefGene.fa) and transcriptome (RefTran.fa) and perform the pre- and post-alignment quality control analyses and SNP calling._
-    
+&lt;br /&gt;    

     QC_SNP.sh -1 Read1.fastq.gz -2 Read2.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -s --mito chrM.fa

 _This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and mitochondrial chromosome (chrM.fa) and perform SNP calling based on the merged genome and transcriptome alignments._

-
+&lt;br /&gt;
 ### Branch 2: ExpressionEstimation.sh

 ExpressionEstimation.sh will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datasets. Beginning with version 2.1, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used, which may or may not have been unique. 
@@ -264,19 +267,19 @@
 

-  * **Examples**
+#### Examples:

 Examples for running ExpressionEstimation.sh:

     ExpressionEstimation.sh -f Reads.fastq -c RefTran.fa -a Anno.gtf -o OutputPrefix --cleanup

 _This command will align the single end Reads.fastq to the transcriptome reference sequence (RefTran.fa) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted._
-    
+&lt;br /&gt;    

     ExpressionEstimation.sh -a Anno.gtf -o OutputPrefix --tSam AlignedToTran.sam

 _This command uses the supplied .sam file that has already been aligned to the transcriptome. RseqFlow will implement the expression level estimation from the distinct best alignment for each read._
-
+&lt;br /&gt;

 ### Branch 3: DE.sh de

&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Wed, 29 Jan 2014 22:22:01 -0000</pubDate><guid>https://sourceforge.net0e5e6b13a1d47a00bbad7f1c6e497d1267299aa6</guid></item><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v2
+++ v3
@@ -1,4 +1,4 @@
-# Manual of `RseqFlow` Unix Run Mode
+# Manual of RseqFlow Unix Run Mode

   * Manual of RseqFlow Unix Run Mode
     * 1\. Package installation and configuration
@@ -8,13 +8,13 @@
         * Branch 3: DE.sh
         * Branch 4: FileFormatConversion.sh
     * 3\. Input files and formats
-        * (1) Possible input files for QC_SNP.sh (Depends on the selected options)
-        * (2) Possible input files for ExpressionEstimation.sh (Depends on the selected options)
-        * (3) Possible input files for DE.sh
-        * (4) Format specification of input files
+        * Possible input files for QC_SNP.sh (Depends on the selected options)
+        * Possible input files for ExpressionEstimation.sh (Depends on the selected options)
+        * Possible input files for DE.sh
+        * Format specification of input files
     * 4\. Sample datasets
-        * (1) c. elegans
-        * (2) Human
+        * c. elegans
+        * Human

 ## 1\. Package installation and configuration

@@ -46,12 +46,12 @@
        bash-3.2$ ./make.sh 

-Step 6: Set up PATH so that the system knows where to find the executable files. It is recommended to run step 7 after this to make these changes permanent, otherwise you will need to run ./configure.sh each time you run `RseqFlow` from a new terminal window. 
+Step 6: Set up PATH so that the system can find the executable files. Running step 7 afterwards is recommended to make these changes permanent; otherwise you will need to run 'source ./configure.sh' each time you run RseqFlow from a new terminal window. 

        bash-3.2$ source ./configure.sh 

-Step 7: This step will make the changes to your PATH and PYTHONPATH variables from step 6 permanent. If you choose not to do this step, you will need to run configure.sh each time you run `RseqFlow`. To make the PATH and PYTHONPATH changes permanent, copy the commands in 'configure.sh' into your bash file either manually or with the following command. **Please make sure the command contains "&amp;gt;&amp;gt;", not “&amp;gt;", otherwise, you will overwrite your original bash file! **
+Step 7: This step makes the changes to your PATH and PYTHONPATH variables from step 6 permanent. If you skip it, you will need to run configure.sh each time you run RseqFlow. To make the changes permanent, copy the commands in 'configure.sh' into your bash file (.bashrc in the example below), either manually or with the following command. **Make sure the command contains "&amp;gt;&amp;gt;", not "&amp;gt;"; otherwise you will overwrite your original bash file!**

        bash-3.2$ cat configure.sh &gt;&gt; /home/user/.bashrc

@@ -72,7 +72,7 @@
   * Branch 3: Differentially expressed gene identification 
   * Branch 4: Alignment file format conversion 

-### (1) Branch 1: QC_SNP.sh
+### Branch 1: QC_SNP.sh

 QC_SNP.sh will implement quality control and/or SNP calling analysis for single end and paired end RNA-Seq datasets. 

@@ -86,70 +86,96 @@

 If you already have RseqFlow alignment file(s) in SAM format for the genome and/or transcriptome, you may input the .sam files and the alignment step will be skipped. The post-alignment and/or SNP calling will be analyzed based on the merged result only. 

-  * **Options**
-
-Options 
-arguments 
-Meaning 
-
--f/--fastq 
-.fastq or .fastq.gz 
-RNA-Seq dataset with single end reads in FASTQ format (fastq or fastq.gz) 
-
--1/--read1 
-.fastq or .fastq.gz 
-The first reads file for paired end data in FASTQ format (fastq or fastq.gz) 
-
--2/--read2 
-.fastq or .fastq.gz 
-The second reads file for paired end data in FASTQ format (fastq or fastq.gz) 
-
--g/--genome 
-.fa 
-Genome reference sequences in FASTA format 
-
--c/-transcriptome 
-.fa 
-Transcriptome reference sequences in FASTA format 
-
--a/--annotation 
-.gtf 
-Reference annotation in GTF format 
-
--o/--output-prefix 
-prefix 
-Prefix of output files (default setting is “QC_SNP_output”) 
-
--p/--pre-QC 
-Generate pre-alignment QC reports based only on the RNA-Seq iput fastq file 
-
--q/--QC 
-Generate post-QC reports based on alignment 
-
--s/--SNP 
-Implement SNPs calling analysis 
-
-\--ribo 
-.fa 
-Reference sequences of ribosomal RNA in FASTA format 
-
-\--mito 
-.fa 
-Reference sequences of Mitochondrial chromosome in FASTA format 
-
-\--gSam 
-.sam 
-Alignment to genome in SAM format (This option is for cases where RNA-Seq dataset has already been aligned to genome) 
-
-\--tSam 
-.sam 
-Alignment to transcriptome in SAM format (This option is for cases where RNA-Seq dataset has already been aligned to transcriptome) 
-
-\--cleanup 
-Delete temporary files (default setting is to save all files) 
+
+&lt;h4&gt;Options:&lt;/h4&gt;
+&lt;table border="1"&gt;
+&lt;tr&gt;
+&lt;th&gt;Option&lt;/th&gt;
+&lt;th&gt;Argument&lt;/th&gt;
+&lt;th&gt;Description&lt;/th&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-f/--fastq&lt;/td&gt;
+  &lt;td&gt;.fastq or .fastq.gz&lt;/td&gt;
+  &lt;td&gt;RNA-Seq dataset with single end reads in FASTQ format (fastq or fastq.gz)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-1/--read1&lt;/td&gt;
+  &lt;td&gt;.fastq or .fastq.gz&lt;/td&gt;
+  &lt;td&gt;The first reads file for paired end data in FASTQ format (fastq or fastq.gz)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;  
+  &lt;td&gt;-2/--read2&lt;/td&gt;
+  &lt;td&gt;.fastq or .fastq.gz&lt;/td&gt;
+  &lt;td&gt;The second reads file for paired end data in FASTQ format (fastq or fastq.gz)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-g/--genome&lt;/td&gt;
+  &lt;td&gt;.fa&lt;/td&gt;
+  &lt;td&gt;Genome reference sequences in FASTA format&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-c/--transcriptome&lt;/td&gt;
+  &lt;td&gt;.fa&lt;/td&gt;
+  &lt;td&gt;Transcriptome reference sequences in FASTA format&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-a/--annotation&lt;/td&gt;
+  &lt;td&gt;.gtf&lt;/td&gt;
+  &lt;td&gt;Reference annotation in GTF format&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-o/--output-prefix&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Prefix of output files (default setting is “QC_SNP_output”)&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt; 
+  &lt;td&gt;-p/--pre-QC&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Generate pre-alignment QC reports based only on the RNA-Seq input fastq file&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt; 
+  &lt;td&gt;-q/--QC&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Generate post-alignment QC reports&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-s/--SNP&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Perform SNP calling analysis&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--ribo&lt;/td&gt;
+  &lt;td&gt;.fa&lt;/td&gt;
+  &lt;td&gt;Reference sequences of ribosomal RNA in FASTA format&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt; 
+  &lt;td&gt;--mito&lt;/td&gt;
+  &lt;td&gt;.fa&lt;/td&gt;
+  &lt;td&gt;Reference sequences of Mitochondrial chromosome in FASTA format&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--gSam&lt;/td&gt;
+  &lt;td&gt;.sam&lt;/td&gt;
+  &lt;td&gt;Alignment to genome in SAM format (This option is for cases where RNA-Seq dataset has already been aligned to genome)&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--tSam&lt;/td&gt;
+  &lt;td&gt;.sam&lt;/td&gt;
+  &lt;td&gt;Alignment to transcriptome in SAM format (This option is for cases where RNA-Seq dataset has already been aligned to transcriptome)&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--cleanup&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Delete temporary files (default setting is to save all files)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;/table&gt;

   * **Examples**
-    
+
+Examples for running QC_SNP.sh:
+
+
     QC_SNP.sh -f Reads.fastq -o OutputPrefix -p --cleanup

 _This command will perform only the pre-alignment quality analysis, based on the RNA-Seq dataset (Reads.fastq), and will delete temporary files._
@@ -175,7 +201,7 @@
 _This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and mitochondrial chromosome (chrM.fa) and perform SNP calling based on the merged genome and transcriptome alignments._

-### (2) Branch 2: ExpressionEstimation.sh
+### Branch 2: ExpressionEstimation.sh

 ExpressionEstimation.sh will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datasets. Beginning with version 2.1, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used, which may or may not have been unique. 

@@ -185,47 +211,63 @@

 If you already have an alignment file to the transcriptome, you may input the .sam file and the alignment step will be skipped. ExpressionEstimation.sh will implement expression level estimation. 

-Please note that ExpressionEstimation.sh is expecting a bowtie2 created samfile. If your pre-existing samfile was not created with bowtie2, it is strongly suggested that you run `ExpressionEstimation.sh` without a samfile to let RseqFlow create its own, otherwise there is no guarantee that `ExpressionEstimation` will use only unique alignments. 
-
-  * **Options**
-
-Options 
-arguments 
-Meaning 
-
--f/--fastq 
-.fastq or .fastq.gz 
-The reads file for single end data in FASTQ format (fastq or fastq.gz) 
-
--1/--read1 
-.fastq or .fastq.gz 
-The first reads file for paired end data in FASTQ format (fastq or fastq.gz) 
-
--2/--read2 
-.fastq or .fastq.gz 
-The second reads file for paired end data in FASTQ format (fastq or fastq.gz) 
-
--c/-transcriptome 
-.fa 
-Transcriptome reference sequences in FASTA format 
-
--a/--annotation 
-.gtf 
-Reference annotation in GTF format 
-
--o/--output-prefix 
-prefix 
-Prefix of output files (default is Expression_output) 
-
-\--tSam 
-.sam 
-Alignments to the transcriptome in SAM format 
-
-\--cleanup 
-Delete temporary files (default setting is to save all files) 
+Please note that ExpressionEstimation.sh expects a SAM file created by bowtie2. If your existing SAM file was not created with bowtie2, it is strongly suggested that you run ExpressionEstimation.sh without a SAM file and let RseqFlow create its own; otherwise there is no guarantee that ExpressionEstimation.sh will use only unique alignments. 
+
+
+&lt;h4&gt;Options:&lt;/h4&gt;
+&lt;table border="1"&gt;
+&lt;tr&gt;
+&lt;th&gt;Option&lt;/th&gt;
+&lt;th&gt;Argument&lt;/th&gt;
+&lt;th&gt;Description&lt;/th&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-f/--fastq&lt;/td&gt;
+  &lt;td&gt;.fastq or .fastq.gz&lt;/td&gt;
+  &lt;td&gt;The reads file for single end data in FASTQ format (fastq or fastq.gz)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-1/--read1&lt;/td&gt;
+  &lt;td&gt;.fastq or .fastq.gz&lt;/td&gt;
+  &lt;td&gt;The first reads file for paired end data in FASTQ format (fastq or fastq.gz)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;  
+  &lt;td&gt;-2/--read2&lt;/td&gt;
+  &lt;td&gt;.fastq or .fastq.gz&lt;/td&gt;
+  &lt;td&gt;The second reads file for paired end data in FASTQ format (fastq or fastq.gz)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-c/-transcriptome&lt;/td&gt;
+  &lt;td&gt;.fa&lt;/td&gt;
+  &lt;td&gt;Transcriptome reference sequences in FASTA format&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-a/--annotation&lt;/td&gt;
+  &lt;td&gt;.gtf&lt;/td&gt;
+  &lt;td&gt;Reference annotation in GTF format&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-o/--output-prefix&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Prefix of output files (default setting is “Expression_output”)&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--tSam&lt;/td&gt;
+  &lt;td&gt;.sam&lt;/td&gt;
+  &lt;td&gt;Alignment to transcriptome in SAM format (This option is for cases where RNA-Seq dataset has already been aligned to transcriptome)&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--cleanup&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Delete temporary files (default setting is to save all files)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;/table&gt;
+

   * **Examples**

+Examples for running ExpressionEstimation.sh:
+
     ExpressionEstimation.sh -f Reads.fastq -c RefTran.fa -a Anno.gtf -o OutputPrefix --cleanup

 _This command will align the single end Reads.fastq to the transcriptome reference sequence (RefTran.fa) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted._
@@ -236,40 +278,51 @@
 _This command uses the supplied .sam file that has already been aligned to the transcriptome. RseqFlow will implement the expression level estimation from the distinct best alignment for each read._

-### (3) Branch 3: DE.sh
-
-The de command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from ExpressionEstimation.sh. If both conditions have only one sample, the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input. Otherwise, if either condition has more than one sample, the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input. 
-
-  * **Options**
-
-1) DE.sh de [options]
-
-Options 
-arguments 
-Meaning 
-
-\--c1 
-Condition1_ID 
-ID for condition1 (e.g. Control) 
-
-\--c2 
-Condition2_ID 
-ID for condition2 (e.g. Case) 
-
--1/--f1 
-Comma separated file name list for condition1. The files should be the Gene/Exon expression file(s) for condition1`*`
-
--2/--f2 
-Comma separated file name list for condition1. The files should be the Gene/Exon expression file(s) for condition2`*`
-
--o/--output-prefix 
-prefix 
-Prefix of output files 
-
-* If both conditions have only one sample each, use the ExonExpressionLevel_unique files from ExpressionEstimation.sh as input. Otherwise, if both conditions have more than one sample each, use the GeneExpressionLevel_unique files from ExpressionEstimation.sh as input. 
+### Branch 3: DE.sh de
+
+The "de" command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from ExpressionEstimation.sh. If both conditions have only one sample, the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input. Otherwise, if either condition has more than one sample, the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input. 
+
+&lt;h4&gt;Options for DE.sh de:&lt;/h4&gt;
+&lt;table border="1"&gt;
+&lt;tr&gt;
+&lt;th&gt;Option&lt;/th&gt;
+&lt;th&gt;Argument&lt;/th&gt;
+&lt;th&gt;Description&lt;/th&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;--c1&lt;/td&gt;
+  &lt;td&gt;Condition1_ID&lt;/td&gt;
+  &lt;td&gt;ID for condition1 (e.g. Control)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;  
+  &lt;td&gt;--c2&lt;/td&gt;
+  &lt;td&gt;Condition2_ID&lt;/td&gt;
+  &lt;td&gt;ID for condition2 (e.g. Case)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-1/--f1&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Comma separated file name list for condition1. The files should be the Gene/Exon expression file(s) for condition1*&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-2/--f2&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Comma separated file name list for condition2. The files should be the Gene/Exon expression file(s) for condition2*&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-o/--output-prefix&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Prefix of output files&lt;/td&gt;
+&lt;/tr&gt;
+&lt;/table&gt;
+
+
+\* If both conditions have only one sample each, use the ExonExpressionLevel_unique files from ExpressionEstimation.sh as input. Otherwise, if either condition has more than one sample, use the GeneExpressionLevel_unique files from ExpressionEstimation.sh as input. 

   * **Examples**

+Examples for running DE.sh de:
+
     DE.sh de --c1 disease --c2 normal --f1 disease_S1_whole_GeneExpressionLevel_unique.txt,disease_S2_whole_GeneExpressionLevel_unique.txt --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease

 _In this case, there are two samples with the disease condition, so the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
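
Because the shell splits unquoted arguments on spaces, a comma separated file list has to reach DE.sh as a single word. The sketch below shows one way to build such a list; the function name `join_by_comma` is hypothetical and not part of RseqFlow.

```shell
# Hypothetical helper: join all arguments with commas and no spaces,
# producing a single word suitable for --f1 or --f2.
# The body runs in a subshell, so the IFS change does not leak out.
join_by_comma() (
    IFS=,
    # "$*" joins the arguments with the first character of IFS.
    printf '%s\n' "$*"
)
```

For the two disease samples above, `--f1 "$(join_by_comma disease_S1_whole_GeneExpressionLevel_unique.txt disease_S2_whole_GeneExpressionLevel_unique.txt)"` would pass both file names as one comma separated argument.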
@@ -280,42 +333,59 @@
 _In this case, both conditions have only a single sample each, so the ExonExpressionLevel files from ExpressionEstimation.sh are used as input._

-### (4) Branch 4: FileFormatConversion.sh
+### Branch 4: FileFormatConversion.sh

 FileFormatConversion.sh will convert between alignment file formats for storage or visualization convenience. It can convert SAM to BAM, MRF, or WIG; BAM to BED; and MRF to WIG. 

-  * **Options**
-
-Options 
-arguments 
-Meaning 
-
--i/--input 
-File to convert 
-Input file to convert, with the correct file suffix (.sam, .bam, or .mrf) 
-
--o/--output 
-prefix 
-Prefix of output files 
-
--r/--reference 
-Reference sequence file that was used to align the input file 
-If a file is being converted from SAM to BAM format, the .sam file should contain either the appropriate @SQ headers or you need to use the --reference option to supply the reference file that was used to create the .sam file. 
-
--b/--toBAM 
-Convert SAM to BAM 
-
--m/--toMRF 
-Convert SAM to MRF 
-
--d/--toBED 
-Convert BAM to BED 
-
--w/--toWIG 
-Convert SAM to WIG or MRF to WIG 
+&lt;h4&gt;Options:&lt;/h4&gt;
+&lt;table border="1"&gt;
+&lt;tr&gt;
+&lt;th&gt;Option&lt;/th&gt;
+&lt;th&gt;Argument&lt;/th&gt;
+&lt;th&gt;Description&lt;/th&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-i/--input&lt;/td&gt;
+  &lt;td&gt;.sam, .bam, .mrf&lt;/td&gt;
+  &lt;td&gt;Input file to convert, with the correct file suffix (.sam, .bam, or .mrf)&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-o/--output-prefix&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Prefix of output files&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;  
+  &lt;td&gt;-r/--reference&lt;/td&gt;
+  &lt;td&gt;.fa&lt;/td&gt;
+  &lt;td&gt;Reference sequence file in FASTA format that was used to align the input file. When converting from SAM to BAM, either the .sam file must contain the appropriate @SQ headers or you must use the --reference option to supply the reference file that was used to create the .sam file.&lt;/td&gt; 
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-b/--toBAM&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Convert SAM to BAM&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-m/--toMRF&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Convert SAM to MRF&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-d/--toBED&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Convert BAM to BED&lt;/td&gt;
+&lt;/tr&gt;
+&lt;tr&gt;
+  &lt;td&gt;-w/--toWIG&lt;/td&gt;
+  &lt;td&gt; &lt;/td&gt;
+  &lt;td&gt;Convert SAM to WIG or MRF to WIG&lt;/td&gt;
+&lt;/tr&gt;
+&lt;/table&gt;
+

   * **Examples**

+Examples for running FileFormatConversion.sh:
+
     FileFormatConversion.sh -i in.sam -o out -b -r ref.fa

 _This command will convert the input in.sam file to a bam file. If the sam file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required._
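
The choice between running with and without the reference file can be sketched as a simple check for "@SQ" lines. This is only an illustration under stated assumptions: the function name `sam_to_bam_command` is hypothetical, it prints the command instead of running it, and it tests only whether "@SQ" lines are present, not whether they are the appropriate ones.

```shell
# Hypothetical helper: print the SAM-to-BAM command for a given input,
# adding -r only when the SAM file has no @SQ header lines.
sam_to_bam_command() {
    sam=$1      # input .sam file
    prefix=$2   # output prefix
    ref=$3      # reference FASTA used for the alignment

    if grep -q '^@SQ' "$sam"; then
        printf 'FileFormatConversion.sh -i %s -o %s -b\n' "$sam" "$prefix"
    else
        printf 'FileFormatConversion.sh -i %s -o %s -b -r %s\n' "$sam" "$prefix" "$ref"
    fi
}
```

Calling `sam_to_bam_command in.sam out ref.fa` prints the command line to use, which can then be reviewed and run by hand.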
@@ -332,7 +402,7 @@

 ## 3\. Input files and formats

-### (1) Possible input files for QC_SNP.sh (Depends on the selected options)
+#### Possible input files for QC_SNP.sh (Depends on the selected options)

   * Genome annotation GTF file 
   * Transcriptome reference sequences 
@@ -342,64 +412,64 @@
   * Reference sequences of Mitochondria 
   * Reference sequences of Ribosomal RNA 

-### (2) Possible input files for ExpressionEstimation.sh (Depends on the selected options)
+#### Possible input files for ExpressionEstimation.sh (Depends on the selected options)

   * Genome annotation GTF file 
   * Transcriptome reference sequences 
   * RNA-Seq fastq or fastq.gz file 
   * Alignment files in SAM format 

-### (3) Possible input files for DE.sh
-
-  * Output files from `ExpressionEstimation.sh`: whole_GeneExpressionLevel_unique.txt or whole_ExonExpressionLevel_unique.txt files. 
-
-### (4) Format specification of input files
-
-The following annotations and references should be in separate files. RseqFlow will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the RseqFlow pipeline. 
-
-  * **Genome Annotation GTF file**
-
-Format from GTF 2.0 to GTF3.0 is required; 
+#### Possible input files for DE.sh
+
+  * Output files from ExpressionEstimation.sh: whole_GeneExpressionLevel_unique.txt or whole_ExonExpressionLevel_unique.txt files. 
+
+#### Format specification of input files
+
+There should be three separate files: one for Transcriptome reference sequences, one for Genome reference sequences and one for the Genome Annotation file. RseqFlow will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the RseqFlow pipeline. 
+

   * **Transcriptome Reference Sequences**

 The transcript names must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome end location as long as there is a space separating the chromosome end from the extra info. 

-“&gt;$GenomeName_$AnnotationSource_$TranscriptsID=$Chromosome:$Start-$End [extra info]”
-
-For example, 
-
-“&amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none” 
+&amp;gt;GenomeName_AnnotationSource_TranscriptsID=Chromosome:Start-End [extra info]
+
+For example:
+
+&amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none
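
A quick way to spot transcript names that do not follow this layout is to count the header lines that fail a pattern match. The sketch below is an assumption, not an RseqFlow tool: the regular expression is one reading of the format described above, and the names `header_pattern` and `count_bad_headers` are hypothetical.

```shell
# Assumed pattern for the header layout described above:
# >GenomeName_AnnotationSource_TranscriptsID=Chromosome:Start-End [extra info]
header_pattern='^>[^= ]+_[^= ]+_[^= ]+=[^: ]+:[0-9]+-[0-9]+( .*)?$'

# Hypothetical helper: print the number of header lines in a FASTA file
# that do not match the pattern (0 means every header matched).
count_bad_headers() {
    grep '^>' "$1" | grep -Evc "$header_pattern"
}
```

Running `count_bad_headers RefTran.fa` prints the number of non-conforming headers. Note that grep returns a non-zero exit status when the count is 0, so rely on the printed number and not on the exit status.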
+

   * **Genome Reference Sequences**

 The chromosome name must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome as long as there is a space separating the chromosome from the extra info. 

-“&amp;gt;$chromsome” 
-
-For example, 
-
-“&amp;gt;chr1 dna:chromosome” 
-
-"&amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF"
-
-“&amp;gt;chrM” 
+For example: 
+
+&amp;gt;chr1 dna:chromosome 
+
+&amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF
+
+&amp;gt;chrM
+
+
+  * **Genome Annotation**
+
+The Genome Annotation GTF file must be in GTF 3.0 format.
+

 ## 4\. Sample datasets

-### (1) c. elegans
-
-We offer the sample datasets for c. elegans: 
-
-[(1) RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/SRR514378.fastq.gz/download)
-
-[(2) Reference genome sequences file](http://sourceforge.net/projects/rseqflow/files/genome_c.elegans.fa.gz/download)
-
-[(3) Reference annotation gtf file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_anno.gtf/download)
-
-[(4) Reference transcriptome sequences file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_seq.fa/download)
-
-(5) Commands for running three of the branches: 
+### 1. c. elegans
+
+* **Download the following datasets:**
+
+[RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/SRR514378.fastq.gz/download)
+[Reference genome sequences file](http://sourceforge.net/projects/rseqflow/files/genome_c.elegans.fa.gz/download)
+[Reference annotation gtf file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_anno.gtf/download)
+[Reference transcriptome sequences file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_seq.fa/download)
+
+&lt;br /&gt;
+* **Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:** 

 Before running the pipeline, create a new directory for output results: 

@@ -409,11 +479,11 @@

      QC_SNP.sh -f SRR514378.fastq -g genome_c.elegans.fa -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/QC_c.elegans -p -q -s

-ii) Branch 2: `ExpressionEstimation.sh`
+ii) Branch 2: ExpressionEstimation.sh

      ExpressionEstimation.sh -f SRR514378.fastq -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/ExpressionEstimation_c.elegans

-iii) Branch 4: `FileFormatConversion.sh`
+iii) Branch 4: FileFormatConversion.sh

      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -r genome_c.elegans.fa -b -o c.elegans_Output/QC_c.elegans_Bowtie2_genome

@@ -423,25 +493,25 @@

      FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.bam -d -o c.elegans_Output/QC_c.elegans_Bowtie2_genome

-### (2) Human
-
-[(1) RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/ERR030893_2M.fastq.gz/download)
-
-[(2) Reference genome sequences file](http://genomics.isi.edu/downloads/Human_genome_GRCh37_69.fa.gz)
-
-[(3) Mitochondria sequence file](http://sourceforge.net/projects/rseqflow/files/Human_chrM.fa/download)
-
-[(4) Reference annotation gtf file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_anno.gtf.gz/download)
-
-[(5) Reference transcriptome sequences file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_transcripts.fa.gz/download)
-
-(6) Commands for running branches #1,#2, and #4: 
-
- Before running the pipeline, create a new directory for output results: 
+
+### 2. Human
+
+* **Download the following datasets:**
+
+[RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/ERR030893_2M.fastq.gz/download)
+[Reference genome sequences file](http://genomics.isi.edu/downloads/Human_genome_GRCh37_69.fa.gz)
+[Mitochondria sequence file](http://sourceforge.net/projects/rseqflow/files/Human_chrM.fa/download)
+[Reference annotation gtf file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_anno.gtf.gz/download)
+[Reference transcriptome sequences file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_transcripts.fa.gz/download)
+
+&lt;br /&gt;
+* **Commands for running QC_SNP, ExpressionEstimation, and FileFormatConversion:** 
+
+Before running the pipeline, create a new directory for output results: 

      mkdir Human_Output

- Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. 
+Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. 

      gunzip Human_gencodeV14_anno.gtf.gz
      gunzip Human_gencodeV14_transcripts.fa.gz
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Wed, 29 Jan 2014 21:50:24 -0000</pubDate><guid>https://sourceforge.net60205ecbf3a7b9fcfeba3aa8b755682eec2f4d93</guid></item><item><title>UnixModeManual modified by J.Herstein</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;pre&gt;--- v1
+++ v2
@@ -2,19 +2,19 @@

   * Manual of RseqFlow Unix Run Mode
     * 1\. Package installation and configuration
-    * 2\. Usage of each branch
-      * (1) Branch 1: QC_SNP.sh
-      * (2) Branch 2: ExpressionEstimation.sh
-      * (3) Branch 3: DE.sh
-      * (4) Branch 4: FileFormatConversion.sh
+    * 2\. Branch usage
+        * Branch 1: QC_SNP.sh
+        * Branch 2: ExpressionEstimation.sh
+        * Branch 3: DE.sh
+        * Branch 4: FileFormatConversion.sh
     * 3\. Input files and formats
-      * (1) Possible input files for QC_SNP.sh (Depends on the selected options)
-      * (2) Possible input files for ExpressionEstimation.sh (Depends on the selected options)
-      * (3) Possible input files for DE.sh
-      * (4) Format specification of input files
+        * (1) Possible input files for QC_SNP.sh (Depends on the selected options)
+        * (2) Possible input files for ExpressionEstimation.sh (Depends on the selected options)
+        * (3) Possible input files for DE.sh
+        * (4) Format specification of input files
     * 4\. Sample datasets
-      * (1) c. elegans
-      * (2) Human
+        * (1) c. elegans
+        * (2) Human

 ## 1\. Package installation and configuration

@@ -24,39 +24,46 @@
     3. R 2.11 and higher; 
     4. GCC 

-&gt; Step 1: Download the [source code](http://code.google.com/p/rseqflow/downloads/detail?name=RseqFlow_source.tar.gz&amp;amp;can=2&amp;amp;q=) to your directory, e.g '/home/user/rseqflow'. 
-
-&gt; Step 2: Enter your specified directory: 
-&gt;     
-&gt;       bash-3.2$ cd /home/user/rseqflow
-
-&gt; Step 3: Extract the tar file: 
-&gt;     
-&gt;       bash-3.2$ tar -xvf RseqFlow_source.tar.gz 
-
-&gt; Step 4: Enter the directory: 
-&gt;     
-&gt;       bash-3.2$ cd /home/user/reseqflow/RseqFlow_source
-
-&gt; Step 5: Set up some tools: 
-&gt;     
-&gt;       bash-3.2$ ./make.sh 
-
-&gt; Step 6: Set up PATH so that the system knows where to find the executable files. It is recommended to run step 7 after this to make these changes permanent, otherwise you will need to run ./configure.sh each time you run `RseqFlow` from a new terminal window. 
-&gt;     
-&gt;       bash-3.2$ source ./configure.sh 
-
-&gt; Step 7: This step will make the changes to your PATH and PYTHONPATH variables from step 6 permanent. If you choose not to do this step, you will need to run configure.sh each time you run `RseqFlow`. To make the PATH and PYTHONPATH changes permanent, copy the commands in 'configure.sh' into your bash file either manually or with the following command. **Please make sure the command contains "&amp;gt;&amp;gt;", not “&amp;gt;", otherwise, you will overwrite your original bash file! **
-&gt;     
-&gt;       bash-3.2$ cat configure.sh &gt;&gt; /home/user/.bashrc
-
-&gt; Step 8: If your system has multiple python versions, make sure you use version 2.7 or higher. Run 'pythonCompilerSet.sh' to set the proper python header in each of the python scripts. The example below assumes the path of the python executable is '/home/user/python2.7'. 
-&gt;     
-&gt;       bash-3.2$ ./pythonCompilerSet.sh -p /home/user/python2.7/python
-
-&gt; Step 9: Now you can run the corresponding shell scripts for each branch. 
-
-## 2\. Usage of each branch
+Step 1: Download the [source code](http://sourceforge.net/projects/rseqflow/files/rseqflow2-v2.1.tar.gz/download) to your directory, e.g '/home/user/rseqflow'. 
+
+Step 2: Enter your specified directory: 
+     
+       bash-3.2$ cd /home/user/rseqflow
+
+
+Step 3: Extract the tar file: 
+     
+       bash-3.2$ tar -xvf RseqFlow_source.tar.gz 
+
+
+Step 4: Enter the directory: 
+     
+       bash-3.2$ cd /home/user/rseqflow/RseqFlow_source
+
+
+Step 5: Set up some tools: 
+     
+       bash-3.2$ ./make.sh 
+
+
+Step 6: Set up PATH so that the system knows where to find the executable files. It is recommended to run step 7 after this to make these changes permanent, otherwise you will need to run ./configure.sh each time you run `RseqFlow` from a new terminal window. 
+     
+       bash-3.2$ source ./configure.sh 
+
+
+Step 7: This step makes the PATH and PYTHONPATH changes from step 6 permanent. If you skip it, you will need to run configure.sh each time you run `RseqFlow`. To make the changes permanent, copy the commands in 'configure.sh' into your bash startup file, either manually or with the following command. **Please make sure the command contains "&amp;gt;&amp;gt;", not "&amp;gt;"; otherwise, you will overwrite your original bash file!**
+     
+       bash-3.2$ cat configure.sh &gt;&gt; /home/user/.bashrc
+
+
+Step 8: If your system has multiple Python versions, make sure you use version 2.7 or higher. Run 'pythonCompilerSet.sh' to set the proper Python header in each of the Python scripts. The example below assumes the path of the Python executable is '/home/user/python2.7/python'. 
+     
+       bash-3.2$ ./pythonCompilerSet.sh -p /home/user/python2.7/python
+
+
+Step 9: Now you can run the corresponding shell scripts for each branch. 
+
+## 2\. Branch usage

 The pipeline consists of four branches: 

@@ -65,19 +72,19 @@
   * Branch 3: Differentially expressed gene identification 
   * Branch 4: Alignment file format conversion 

-### (1) Branch 1: `QC_SNP.sh`
-
-&gt; QC_SNP.sh will implement quality control and/or SNP calling analysis for single end and paired end RNA-Seq datasets. 
-
-&gt; For pre-alignment quality control analysis only, a fastq(.gz) file is required as input, either a single fastq for single end data or Read1.fastq and Read2.fastq for paired end data. 
-
-&gt; For post-alignment quality control analysis, the fastq files(s), genome reference sequences, transcriptome reference sequences, and annotation file must be input. QC_SNP.sh will align the RNA-Seq dataset to the genome and transcriptome, then merge the alignment results for quality control and SNP calling. 
-
-&gt; You may input ribosomal RNA reference sequences and/or mitochondrial reference sequences if desired. 
-
-&gt; If bowtie2 indexes already exist in the same directory as the genome and/or transcriptome reference files, `RseqFlow` will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference files for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the corresponding genome/transcriptome reference. This will bypass index creation for future runs and reduce run times. 
-
-&gt; If you already have alignment file(s) in SAM format for the genome and/or transcriptome, you may input the .sam files and the alignment step will be skipped. The post-alignment and/or SNP calling will be analyzed based on the merged result only. 
+### (1) Branch 1: QC_SNP.sh
+
+QC_SNP.sh will implement quality control and/or SNP calling analysis for single end and paired end RNA-Seq datasets. 
+
+For pre-alignment quality control analysis only, a fastq(.gz) file is required as input, either a single fastq for single end data or Read1.fastq and Read2.fastq for paired end data. 
+
+For post-alignment quality control analysis, the fastq file(s), genome reference sequences, transcriptome reference sequences, and annotation file must be supplied as input. QC_SNP.sh will align the RNA-Seq dataset to the genome and transcriptome, then merge the alignment results for quality control and SNP calling. 
+
+You may input ribosomal RNA reference sequences and/or mitochondrial reference sequences if desired. 
+
+If bowtie2 indexes already exist in the same directory as the genome and/or transcriptome reference files, RseqFlow will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference files for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the corresponding genome/transcriptome reference. This will bypass index creation for future runs and reduce run times. 
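The index reuse described above can be sketched as a small shell helper. This is a minimal illustration, not part of RseqFlow: the function name `reuse_bt2_indexes` and both directory arguments are hypothetical, and it only assumes that the index files end in .bt2, as stated above.

```shell
# Hypothetical helper (not shipped with RseqFlow): move bowtie2 index files
# (*.bt2) from an output directory into the directory holding the reference,
# so that later runs can find them next to the reference file.
reuse_bt2_indexes() {
    out_dir=$1
    ref_dir=$2
    moved=0
    for idx in "$out_dir"/*.bt2; do
        # Skip the unexpanded pattern when no .bt2 files exist.
        if [ -e "$idx" ]; then
            mv "$idx" "$ref_dir"/
            moved=$((moved + 1))
        fi
    done
    # Report how many index files were moved.
    echo "$moved"
}
```

For example, `reuse_bt2_indexes c.elegans_Output /path/to/references` would move any .bt2 files out of the sample output directory; the reference path here is a placeholder.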
+
+If you already have RseqFlow alignment file(s) in SAM format for the genome and/or transcriptome, you may input the .sam files and the alignment step will be skipped. The post-alignment and/or SNP calling will be analyzed based on the merged result only. 

   * **Options**

@@ -145,35 +152,40 @@

     QC_SNP.sh -f Reads.fastq -o OutputPrefix -p --cleanup

-&gt; _This command will give pre-alignment quality analysis only, based on the RNA-Seq dataset (Reads.fq) and will delete temporary files._
-    
+_This command will give pre-alignment quality analysis only, based on the RNA-Seq dataset (Reads.fastq), and will delete temporary files._
+    
+
     QC_SNP.sh -f Reads.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q --ribo rRNA.fa

-&gt; _This command will align the Reads.fastq.gz to the genome (`RefGene.fa`), transcriptome (`RefTran.fa`) and ribosomal RNA (rRNA.fa) and implement the pre and post alignment quality control analyses. It will also output the ribosomal RNA alignment report._
-    
+_This command will align the Reads.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and ribosomal RNA (rRNA.fa) and implement the pre and post alignment quality control analyses. It will also output the ribosomal RNA alignment report._
+    
+
     QC_SNP.sh -a Anno.gtf -o OutputPrefix -q --gSam AlignedToGenome.sam --tSam AlignedToTran.sam

-&gt; _This command will give post-alignment quality analysis based on the user specified alignment SAM files to the genome (`AlignedToGenome.sam`) and transcriptome(`AlignedToTran.sam`)._
-    
+_This command will give post-alignment quality analysis based on the user-specified SAM alignment files for the genome (AlignedToGenome.sam) and transcriptome (AlignedToTran.sam)._
+    
+
     QC_SNP.sh -1 Read1.fastq -2 Read2.fastq -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix -p -q -s

-&gt; _This command will align the paired end RNA-Seq datasets Read1.fastq and Read2.fastq to the genome (`RefGene.fa`) and transcriptome (`RefTran.fa`) and implement the pre and post alignment quality control analyses and SNP calling._
-    
+_This command will align the paired end RNA-Seq datasets Read1.fastq and Read2.fastq to the genome (RefGene.fa) and transcriptome (RefTran.fa) and implement the pre and post alignment quality control analyses and SNP calling._
+    
+
     QC_SNP.sh -1 Read1.fastq.gz -2 Read2.fastq.gz -g RefGene.fa -c RefTran.fa -a Anno.gtf -o OutputPrefix –s --mito chrM.fa

-&gt; _This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (`RefGene.fa`), transcriptome (`RefTran.fa`) and Mitochondrial chromosome (chrM.fa) and implement SNP calling base on the merged genome and transcriptome alignments._
-
-### (2) Branch 2: `ExpressionEstimation.sh`
-
-&gt; `ExpressionEstimation.sh` will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datsets. Beginning in version 2.1 and later, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used which may or may not have been unique. 
-
-&gt; If bowtie2 indexes already exist in the same directory as the transcriptome reference file, `RseqFlow` will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference file for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the transcriptome reference. This will bypass index creation for future runs and reduce run times. 
-
-&gt; If a fastq file and the reference sequences of the transcriptome are supplied as input, `ExpressionEstimation.sh` will align the RNA-Seq dataset to the transcriptome and then implement expression level estimation. 
-
-&gt; If you already have an alignment file to the transcriptome, you may input the .sam file and the alignment step will be skipped. `ExpressionEstimation.sh` will implement expression level estimation. 
-
-&gt; Please note that `ExpressionEstimation.sh` is expecting a bowtie2 created samfile. If your pre-existing samfile was not created with bowtie2, it is strongly suggested that you run `ExpressionEstimation.sh` without a samfile to let `RseqFlow` create its own, otherwise there is no guarantee that `ExpressionEstimation` will use only unique alignments. 
+_This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (RefGene.fa), transcriptome (RefTran.fa) and mitochondrial chromosome (chrM.fa) and implement SNP calling based on the merged genome and transcriptome alignments._
+
+
+### (2) Branch 2: ExpressionEstimation.sh
+
+ExpressionEstimation.sh will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datasets. Beginning in version 2.1, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used, which may or may not have been unique. 
+
+If bowtie2 indexes already exist in the same directory as the transcriptome reference file, RseqFlow will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference file for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the transcriptome reference. This will bypass index creation for future runs and reduce run times. 
+
+If a fastq file and the reference sequences of the transcriptome are supplied as input, ExpressionEstimation.sh will align the RNA-Seq dataset to the transcriptome and then implement expression level estimation. 
+
+If you already have an alignment file to the transcriptome, you may input the .sam file and the alignment step will be skipped. ExpressionEstimation.sh will implement expression level estimation. 
+
+Please note that ExpressionEstimation.sh expects a SAM file created by bowtie2. If your existing SAM file was not created with bowtie2, it is strongly suggested that you run `ExpressionEstimation.sh` without a SAM file and let RseqFlow create its own; otherwise there is no guarantee that `ExpressionEstimation.sh` will use only unique alignments. 

   * **Options**

@@ -216,19 +228,21 @@

     ExpressionEstimation.sh -f Reads.fastq -c RefTran.fa -a Anno.gtf -o OutputPrefix --cleanup

-&gt; _This command will align the single end Reads.fastq to the transcriptome reference sequence (`RefTran.fa`) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted._
-    
+_This command will align the single end Reads.fastq to the transcriptome reference sequence (RefTran.fa) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted._
+    
+
     ExpressionEstimation.sh -a Anno.gtf -o OutputPrefix --tSam AlignedToTran.sam

-&gt; _This command uses the supplied .sam file that has already been aligned to the transcriptome. `RseqFlow` will implement the expression level estimation from the distinct best alignment for each read._
-
-### (3) Branch 3: `DE.sh`
-
-&gt; The de command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from `ExpressionEstimation.sh`. If both conditions have only one sample, the `ExonExpressionLevel_unique` files from `ExpressionEstimation.sh` are used as input. Otherwise, if either condition has more than one sample, the `GeneExpressionLevel_unique` files from `ExpressionEstimation.sh` are used as input. 
+_This command uses the supplied .sam file that has already been aligned to the transcriptome. RseqFlow will implement the expression level estimation from the distinct best alignment for each read._
+
+
+### (3) Branch 3: DE.sh
+
+The de command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from ExpressionEstimation.sh. If both conditions have only one sample, the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input. Otherwise, if either condition has more than one sample, the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input. 

   * **Options**

-1) DE.sh de `[options]`
+1) DE.sh de [options]

 Options 
 arguments 
@@ -252,21 +266,23 @@
 prefix 
 Prefix of output files 

-`*` If both conditions have only one sample each, use the `ExonExpressionLevel_unique` files from `ExpressionEstimation.sh` as input. Otherwise, if both conditions have more than one sample each, use the `GeneExpressionLevel_unique` files from `ExpressionEstimation.sh` as input. 
+* If both conditions have only one sample each, use the ExonExpressionLevel_unique files from ExpressionEstimation.sh as input. Otherwise, if either condition has more than one sample, use the GeneExpressionLevel_unique files from ExpressionEstimation.sh as input. 
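The input-selection rule for DE.sh can be sketched as a tiny shell function. This is only an illustration: `de_input_kind` is a hypothetical name, not an RseqFlow command, and it assumes the reading given in the Branch 3 description, where the gene-level files are used as soon as either condition has more than one sample.

```shell
# Hypothetical helper (not part of RseqFlow): given the number of samples in
# condition 1 and condition 2, print which ExpressionEstimation.sh output
# type should be passed to DE.sh.
de_input_kind() {
    case "$1,$2" in
        # Exactly one sample per condition: exon-level files.
        1,1) echo "ExonExpressionLevel_unique" ;;
        # Any other combination: gene-level files.
        *)   echo "GeneExpressionLevel_unique" ;;
    esac
}
```

With two disease samples and one normal sample, `de_input_kind 2 1` selects the gene-level files.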

   * **Examples**

     DE.sh de --c1 disease --c2 normal --f1 disease_S1_whole_GeneExpressionLevel_unique.txt, disease_S2_whole_GeneExpressionLevel_unique.txt --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease

-&gt; _In this case, there are two samples with the disease condition, so the `GeneExpressionLevel_unique` files from `ExpressionEstimation.sh` are used as input._
-    
+_In this case, there are two samples with the disease condition, so the GeneExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
+  
+  
     DE.sh de --c1 Brain --c2 Colon --f1 Brain_whole_ExonExpressionLevel_unique.txt --f2 Colon_whole_ExonExpressionLevel_unique.txt -o TissueCompare

-&gt; _In this case, both conditions have only a single sample each, so the `ExonExpressionLevel` files from `ExpressionEstimation.sh` are used as input._
-
-### (4) Branch 4: `FileFormatConversion.sh`
-
-&gt; `FileFormatConversion_version_1.sh` will convert between alignment file formats for storage or visualization convenience. It can convert SAM to BAM, MRF and WIG, BAM to BED, and MRF to WIG. 
+_In this case, both conditions have only a single sample each, so the ExonExpressionLevel_unique files from ExpressionEstimation.sh are used as input._
+
+
+### (4) Branch 4: FileFormatConversion.sh
+
+FileFormatConversion.sh will convert between alignment file formats for storage or visualization convenience. It can convert SAM to BAM, MRF and WIG, BAM to BED, and MRF to WIG. 
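As a quick reference, the conversions can be mapped to the flags that appear in this manual's example commands. The sketch below is a hypothetical lookup, not part of RseqFlow; it covers only the flags shown in those examples (-b, -m, -w, -d), and the MRF to WIG conversion is left unmapped because no example shows its flag.

```shell
# Hypothetical lookup (not part of RseqFlow): print the FileFormatConversion.sh
# flag used in this manual's examples for an input/output format pair.
conversion_flag() {
    case "$1:$2" in
        sam:bam) echo "-b" ;;   # may also need -r ref.fa if @SQ headers are missing
        sam:mrf) echo "-m" ;;
        sam:wig) echo "-w" ;;
        bam:bed) echo "-d" ;;
        # Pairs without an example in this manual are reported as unknown.
        *)       echo "unknown"; return 1 ;;
    esac
}
```

For instance, `conversion_flag sam bam` prints the flag used in the SAM to BAM example.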

   * **Options**

@@ -302,15 +318,17 @@

     FileFormatConversion.sh -i in.sam -o out -b -r ref.fa

-&gt; _This command will convert the input in.sam file to a bam file. If the sam file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required._
-    
+_This command will convert the input in.sam file to a bam file. If the sam file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required._
+    
+
     FileFormatConversion.sh -i in.sam -o out -m

-&gt; _This command will convert the input in.sam file to a mrf file._
+_This command will convert the input in.sam file to a mrf file._
+

     FileFormatConversion.sh -i in.bam -o out -d

-&gt; _This command will convert the input in.bam file to a BED file for visualization._
+_This command will convert the input in.bam file to a BED file for visualization._

 ## 3\. Input files and formats

@@ -324,7 +342,7 @@
   * Reference sequences of Mitochondria 
   * Reference sequences of Ribosomal RNA 

-### (2) Possible input files for `ExpressionEstimation.sh` (Depends on the selected options)
+### (2) Possible input files for ExpressionEstimation.sh (Depends on the selected options)

   * Genome annotation GTF file 
   * Transcriptome reference sequences 
@@ -333,39 +351,39 @@

 ### (3) Possible input files for DE.sh

-  * Output files from `ExpressionEstimation.sh`: `whole_GeneExpressionLevel_unique.txt` or `whole_ExonExpressionLevel_unique.txt` files. 
+  * Output files from `ExpressionEstimation.sh`: whole_GeneExpressionLevel_unique.txt or whole_ExonExpressionLevel_unique.txt files. 

 ### (4) Format specification of input files

-&gt; The following annotations and references should be in separate files. `RseqFlow` will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the `RseqFlow` pipeline. 
+The following annotations and references should be in separate files. RseqFlow will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the RseqFlow pipeline. 

   * **Genome Annotation GTF file**

-&gt; Format from GTF 2.0 to GTF3.0 is required; 
+The file must be in GTF format, version 2.0 to 3.0. 

   * **Transcriptome Reference Sequences**

-&gt; The transcript names must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome end location as long as there is a space separating the chromosome end from the extra info. 
-
-&gt; `“&gt;$GenomeName_$AnnotationSource_$TranscriptsID=$Chromosome:$Start-$End [extra info]”`
-
-&gt; For example, 
-
-&gt; “&amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none” 
+The transcript names must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome end location as long as there is a space separating the chromosome end from the extra info. 
+
+“&gt;$GenomeName_$AnnotationSource_$TranscriptsID=$Chromosome:$Start-$End [extra info]”
+
+For example, 
+
+“&amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none” 

   * **Genome Reference Sequences**

-&gt; The chromosome name must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome as long as there is a space separating the chromosome from the extra info. 
-
-&gt; “&amp;gt;$chromsome” 
-
-&gt; For example, 
-
-&gt; “&amp;gt;chr1 dna:chromosome” 
-
-&gt; "&amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF"
-
-&gt; “&amp;gt;chrM” 
+The chromosome name must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome as long as there is a space separating the chromosome from the extra info. 
+
+“&amp;gt;$chromosome” 
+
+For example, 
+
+“&amp;gt;chr1 dna:chromosome” 
+
+"&amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF"
+
+“&amp;gt;chrM” 

 ## 4\. Sample datasets

@@ -373,76 +391,76 @@

 We offer the sample datasets for c. elegans: 

-[(1) RNA-Seq dataset](http://code.google.com/p/rseqflow/downloads/detail?name=SRR514378.fastq.gz&amp;amp;can=2&amp;amp;q=)
-
-[(2) Reference genome sequences file](https://code.google.com/p/rseqflow/downloads/detail?name=genome_c.elegans.fa.gz&amp;amp;can=2&amp;amp;q=)
-
-[(3) Reference annotation gtf file](https://code.google.com/p/rseqflow/downloads/detail?name=c.elegans_refseq_anno.gtf&amp;amp;can=2&amp;amp;q=)
-
-[(4) Reference transcriptome sequences file](https://code.google.com/p/rseqflow/downloads/detail?name=c.elegans_refseq_seq.fa&amp;amp;can=2&amp;amp;q=)
+[(1) RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/SRR514378.fastq.gz/download)
+
+[(2) Reference genome sequences file](http://sourceforge.net/projects/rseqflow/files/genome_c.elegans.fa.gz/download)
+
+[(3) Reference annotation gtf file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_anno.gtf/download)
+
+[(4) Reference transcriptome sequences file](http://sourceforge.net/projects/rseqflow/files/c.elegans_refseq_seq.fa/download)

 (5) Commands for running three of the branches: 

-&gt; Before running the pipeline, create a new directory for output results: 
-&gt;     
-&gt;     mkdir c.elegans_Output
-
-&gt; i) Branch 1: `QC_SNP.sh`
-&gt;     
-&gt;     QC_SNP.sh -f SRR514378.fastq -g genome_c.elegans.fa -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/QC_c.elegans -p -q -s
-
-&gt; ii) Branch 2: `ExpressionEstimation.sh`
-&gt;     
-&gt;     ExpressionEstimation.sh -f SRR514378.fastq -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/ExpressionEstimation_c.elegans
-
-&gt; iii) Branch 4: `FileFormatConversion.sh`
-&gt;     
-&gt;     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -r genome_c.elegans.fa -b -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
-&gt;     
-&gt;     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_transcriptome.sam -m -o c.elegans_Output/QC_c.elegans_Bowtie2_transcriptome
-&gt;     
-&gt;     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -w -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
-&gt;     
-&gt;     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.bam -d -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
+Before running the pipeline, create a new directory for output results: 
+    
+    mkdir c.elegans_Output
+
+i) Branch 1: QC_SNP.sh 
+    
+     QC_SNP.sh -f SRR514378.fastq -g genome_c.elegans.fa -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/QC_c.elegans -p -q -s
+
+ii) Branch 2: `ExpressionEstimation.sh`
+    
+     ExpressionEstimation.sh -f SRR514378.fastq -c c.elegans_refseq_seq.fa -a c.elegans_refseq_anno.gtf -o c.elegans_Output/ExpressionEstimation_c.elegans
+
+iii) Branch 4: `FileFormatConversion.sh`
+     
+     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -r genome_c.elegans.fa -b -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
+    
+     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_transcriptome.sam -m -o c.elegans_Output/QC_c.elegans_Bowtie2_transcriptome
+     
+     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.sam -w -o c.elegans_Output/QC_c.elegans_Bowtie2_genome
+     
+     FileFormatConversion.sh -i c.elegans_Output/QC_c.elegans_Bowtie2_genome.bam -d -o c.elegans_Output/QC_c.elegans_Bowtie2_genome

 ### (2) Human

-[(1) RNA-Seq dataset](http://code.google.com/p/rseqflow/downloads/detail?name=ERR030893_2M.fastq.gz&amp;amp;can=2&amp;amp;q=)
+[(1) RNA-Seq dataset](http://sourceforge.net/projects/rseqflow/files/ERR030893_2M.fastq.gz/download)

 [(2) Reference genome sequences file](http://genomics.isi.edu/downloads/Human_genome_GRCh37_69.fa.gz)

-[(3) Mitochondria sequence file](http://code.google.com/p/rseqflow/downloads/detail?name=Human_chrM.fa&amp;amp;can=2&amp;amp;q=)
-
-[(4) Reference annotation gtf file (gencodeV14)](http://code.google.com/p/rseqflow/downloads/detail?name=Human_gencodeV14_anno.gtf.gz&amp;amp;can=2&amp;amp;q=)
-
-[(5) Reference transcriptome sequences file (gencodeV14)](http://code.google.com/p/rseqflow/downloads/detail?name=Human_gencodeV14_transcripts.fa.gz&amp;amp;can=2&amp;amp;q=)
-
-(6) Commands for running three of the branches: 
-
-&gt; Before running the pipeline, create a new directory for output results: 
-&gt;     
-&gt;     mkdir Human_Output
-&gt; 
-&gt; Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. 
-&gt;     
-&gt;     gunzip Human_gencodeV14_anno.gtf.gz
-&gt;     gunzip Human_gencodeV14_transcripts.fa.gz
-&gt;     gunzip Human_genome_GRCh37.fa.gz
-
-&gt; i) Branch 1: `QC_SNP.sh`
-&gt;     
-&gt;     QC_SNP.sh -f ERR030893_2M.fastq.gz -g Human_genome_GRCh37.fa -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf --mito Human_chrM -o Human_Output/QC_human -p -q -s
-
-&gt; ii) Branch 2: `ExpressionEstimation.sh`
-&gt;     
-&gt;     ExpressionEstimation.sh -f ERR030893_2M.fastq.gz -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf -o Human_Output/ExpressionEstimation_human
-
-&gt; iii) Branch 4: `FileFormatConversion.sh`
-&gt;     
-&gt;     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.sam -r  Human_genome_GRCh37.fa -b -o Human_Output/QC_human_Bowtie2_genome
-&gt;     
-&gt;     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_transcriptome.sam -m -o Human_Output/QC_human_Bowtie2_transcriptome
-&gt;     
-&gt;     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.sam -w -o Human_Output/QC_human_Bowtie2_genome
-&gt;     
-&gt;     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.bam -d -o Human_Output/QC_human_Bowtie2_genome
+[(3) Mitochondria sequence file](http://sourceforge.net/projects/rseqflow/files/Human_chrM.fa/download)
+
+[(4) Reference annotation gtf file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_anno.gtf.gz/download)
+
+[(5) Reference transcriptome sequences file (gencodeV14)](http://sourceforge.net/projects/rseqflow/files/Human_gencodeV14_transcripts.fa.gz/download)
+
+(6) Commands for running branches #1,#2, and #4: 
+
+ Before running the pipeline, create a new directory for output results: 
+     
+     mkdir Human_Output
+ 
+ Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. 
+     
+     gunzip Human_gencodeV14_anno.gtf.gz
+     gunzip Human_gencodeV14_transcripts.fa.gz
+     gunzip Human_genome_GRCh37.fa.gz
+
+ i) Branch 1: QC_SNP.sh
+     
+     QC_SNP.sh -f ERR030893_2M.fastq.gz -g Human_genome_GRCh37.fa -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf --mito Human_chrM.fa -o Human_Output/QC_human -p -q -s
+
+ ii) Branch 2: ExpressionEstimation.sh
+     
+     ExpressionEstimation.sh -f ERR030893_2M.fastq.gz -c Human_gencodeV14_transcripts.fa -a Human_gencodeV14_anno.gtf -o Human_Output/ExpressionEstimation_human
+
+ iii) Branch 4: FileFormatConversion.sh
+     
+     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.sam -r  Human_genome_GRCh37.fa -b -o Human_Output/QC_human_Bowtie2_genome
+     
+     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_transcriptome.sam -m -o Human_Output/QC_human_Bowtie2_transcriptome
+     
+     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.sam -w -o Human_Output/QC_human_Bowtie2_genome
+     
+     FileFormatConversion.sh -i Human_Output/QC_human_Bowtie2_genome.bam -d -o Human_Output/QC_human_Bowtie2_genome
&lt;/pre&gt;
&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">J.Herstein</dc:creator><pubDate>Wed, 29 Jan 2014 20:12:21 -0000</pubDate><guid>https://sourceforge.net732f3ba609540b09bed92162beb3975177514cb4</guid></item><item><title>UnixModeManual modified by Anonymous</title><link>https://sourceforge.net/p/rseqflow/wiki/UnixModeManual/</link><description>&lt;div class="markdown_content"&gt;&lt;h1 id="manual-of-rseqflow-unix-run-mode"&gt;Manual of &lt;code&gt;RseqFlow&lt;/code&gt; Unix Run Mode&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;Manual of RseqFlow Unix Run Mode&lt;ul&gt;
&lt;li&gt;1. Package installation and configuration&lt;/li&gt;
&lt;li&gt;2. Usage of each branch&lt;/li&gt;
&lt;li&gt;(1) Branch 1: QC_SNP.sh&lt;/li&gt;
&lt;li&gt;(2) Branch 2: ExpressionEstimation.sh&lt;/li&gt;
&lt;li&gt;(3) Branch 3: DE.sh&lt;/li&gt;
&lt;li&gt;(4) Branch 4: FileFormatConversion.sh&lt;/li&gt;
&lt;li&gt;3. Input files and formats&lt;/li&gt;
&lt;li&gt;(1) Possible input files for QC_SNP.sh (Depends on the selected options)&lt;/li&gt;
&lt;li&gt;(2) Possible input files for ExpressionEstimation.sh (Depends on the selected options)&lt;/li&gt;
&lt;li&gt;(3) Possible input files for DE.sh&lt;/li&gt;
&lt;li&gt;(4) Format specification of input files&lt;/li&gt;
&lt;li&gt;4. Sample datasets&lt;/li&gt;
&lt;li&gt;(1) c. elegans&lt;/li&gt;
&lt;li&gt;(2) Human&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="146-package-installation-and-configuration"&gt;1. Package installation and configuration&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Pre-install run environment&lt;/strong&gt;&lt;ol&gt;
&lt;li&gt;UNIX; &lt;/li&gt;
&lt;li&gt;Python 2.7 and higher; &lt;/li&gt;
&lt;li&gt;R 2.11 and higher; &lt;/li&gt;
&lt;li&gt;GCC &lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Step 1: Download the &lt;a class="" href="http://code.google.com/p/rseqflow/downloads/detail?name=RseqFlow_source.tar.gz&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;source code&lt;/a&gt; to your directory, e.g. '/home/user/rseqflow'. &lt;/p&gt;
&lt;p&gt;Step 2: Enter your specified directory: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;cd&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;home&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;rseqflow&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 3: Extract the tar file: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;tar&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;xvf&lt;/span&gt; &lt;span class="n"&gt;RseqFlow_source&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 4: Enter the directory: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;cd&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;home&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;rseqflow&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;RseqFlow_source&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 5: Set up some tools: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;make&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 6: Set up PATH so that the system knows where to find the executable files. It is recommended to run step 7 after this to make these changes permanent; otherwise, you will need to source ./configure.sh each time you run &lt;code&gt;RseqFlow&lt;/code&gt; from a new terminal window. &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 7: This step makes the changes to your PATH and PYTHONPATH variables from step 6 permanent. If you skip it, you will need to source configure.sh each time you run &lt;code&gt;RseqFlow&lt;/code&gt;. To make the changes permanent, copy the commands in 'configure.sh' into your bash startup file (e.g. /home/user/.bashrc), either manually or with the following command. &lt;strong&gt;Please make sure the command contains "&amp;gt;&amp;gt;", not "&amp;gt;"; otherwise, you will overwrite your original bash file! &lt;/strong&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="n"&gt;configure&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;home&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bashrc&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 8: If your system has multiple python versions, make sure you use version 2.7 or higher. Run 'pythonCompilerSet.sh' to set the proper python header in each of the python scripts. The example below assumes the path of the python executable is '/home/user/python2.7'. &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;  &lt;span class="n"&gt;bash&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;3.2&lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;pythonCompilerSet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;home&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;python2&lt;/span&gt;&lt;span class="mf"&gt;.7&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;python&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Step 9: Now you can run the corresponding shell scripts for each branch. &lt;/p&gt;
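&lt;p&gt;A quick way to confirm that the PATH changes from steps 6 and 7 took effect is to look up each branch script by name. The sketch below is illustrative only and is not part of RseqFlow: the function name check_rseqflow_path is hypothetical, and it assumes that the four branch scripts named in this manual are what configure.sh makes available on your PATH.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;
```shell
# Hypothetical helper (not shipped with RseqFlow): look up each branch
# script on the current PATH and report where it was found.
check_rseqflow_path() {
    missing=0
    for script in QC_SNP.sh ExpressionEstimation.sh DE.sh FileFormatConversion.sh; do
        # command -v prints the resolved location, or nothing if not found.
        location=$(command -v "$script" || true)
        if [ -n "$location" ]; then
            echo "found $script at $location"
        else
            echo "missing $script"
            missing=1
        fi
    done
    # Exit status 0 only when every script was found.
    return "$missing"
}

check_rseqflow_path || echo "Some scripts were not found; try: source ./configure.sh"
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If any script is reported as missing, sourcing configure.sh again from the RseqFlow_source directory is the first thing to try.&lt;/p&gt;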
&lt;/blockquote&gt;
&lt;h2 id="246-usage-of-each-branch"&gt;2. Usage of each branch&lt;/h2&gt;
&lt;p&gt;The pipeline consists of four branches: &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Branch 1: Quality Control and SNP calling &lt;/li&gt;
&lt;li&gt;Branch 2: Expression level estimation &lt;/li&gt;
&lt;li&gt;Branch 3: Differentially expressed gene identification &lt;/li&gt;
&lt;li&gt;Branch 4: Alignment file format conversion &lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="1-branch-1-qc_snpsh"&gt;(1) Branch 1: &lt;code&gt;QC_SNP.sh&lt;/code&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;QC_SNP.sh will implement quality control and/or SNP calling analysis for single end and paired end RNA-Seq datasets. &lt;/p&gt;
&lt;p&gt;For pre-alignment quality control analysis only, a fastq(.gz) file is required as input, either a single fastq for single end data or Read1.fastq and Read2.fastq for paired end data. &lt;/p&gt;
&lt;p&gt;For post-alignment quality control analysis, the fastq file(s), genome reference sequences, transcriptome reference sequences, and annotation file must be supplied. QC_SNP.sh will align the RNA-Seq dataset to the genome and transcriptome, then merge the alignment results for quality control and SNP calling. &lt;/p&gt;
&lt;p&gt;You may input ribosomal RNA reference sequences and/or mitochondrial reference sequences if desired. &lt;/p&gt;
&lt;p&gt;If bowtie2 indexes already exist in the same directory as the genome and/or transcriptome reference files, &lt;code&gt;RseqFlow&lt;/code&gt; will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference files for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the corresponding genome/transcriptome reference. This will bypass index creation for future runs and reduce run times. &lt;/p&gt;
&lt;p&gt;If you already have alignment file(s) in SAM format for the genome and/or transcriptome, you may input the .sam files and the alignment step will be skipped. The post-alignment quality control and/or SNP calling analysis will then be based on the merged result only. &lt;/p&gt;
&lt;/blockquote&gt;
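&lt;p&gt;The index reuse described above amounts to moving the .bt2 files next to the reference they were built from. The following is a minimal sketch under that reading, not an RseqFlow command: move_bt2_indexes is a hypothetical helper, and both directory arguments are placeholders for your own output and reference directories.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;
```shell
# Hypothetical helper (not shipped with RseqFlow): move bowtie2 index
# files (*.bt2) from an output directory into the directory that holds
# the corresponding reference FASTA file.
move_bt2_indexes() {
    output_dir=$1
    reference_dir=$2
    for index_file in "$output_dir"/*.bt2; do
        # Skip the unexpanded pattern when no index files exist.
        [ -e "$index_file" ] || continue
        mv "$index_file" "$reference_dir"/
        echo "moved $(basename "$index_file")"
    done
}

# Example call with placeholder directories:
# move_bt2_indexes Human_Output /path/to/references
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Only files ending in .bt2 are moved; alignment results and reports stay in the output directory.&lt;/p&gt;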
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Options&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Options &lt;br /&gt;
arguments &lt;br /&gt;
Meaning &lt;/p&gt;
&lt;p&gt;-f/--fastq &lt;br /&gt;
.fastq or .fastq.gz &lt;br /&gt;
RNA-Seq dataset with single end reads in FASTQ format (fastq or fastq.gz) &lt;/p&gt;
&lt;p&gt;-1/--read1 &lt;br /&gt;
.fastq or .fastq.gz &lt;br /&gt;
The first reads file for paired end data in FASTQ format (fastq or fastq.gz) &lt;/p&gt;
&lt;p&gt;-2/--read2 &lt;br /&gt;
.fastq or .fastq.gz &lt;br /&gt;
The second reads file for paired end data in FASTQ format (fastq or fastq.gz) &lt;/p&gt;
&lt;p&gt;-g/--genome &lt;br /&gt;
.fa &lt;br /&gt;
Genome reference sequences in FASTA format &lt;/p&gt;
&lt;p&gt;-c/--transcriptome &lt;br /&gt;
.fa &lt;br /&gt;
Transcriptome reference sequences in FASTA format &lt;/p&gt;
&lt;p&gt;-a/--annotation &lt;br /&gt;
.gtf &lt;br /&gt;
Reference annotation in GTF format &lt;/p&gt;
&lt;p&gt;-o/--output-prefix &lt;br /&gt;
prefix &lt;br /&gt;
Prefix of output files (default setting is “QC_SNP_output”) &lt;/p&gt;
&lt;p&gt;-p/--pre-QC &lt;br /&gt;
Generate pre-alignment QC reports based only on the RNA-Seq input fastq file &lt;/p&gt;
&lt;p&gt;-q/--QC &lt;br /&gt;
Generate post-alignment QC reports &lt;/p&gt;
&lt;p&gt;-s/--SNP &lt;br /&gt;
Implement SNP calling analysis &lt;/p&gt;
&lt;p&gt;--ribo &lt;br /&gt;
.fa &lt;br /&gt;
Reference sequences of ribosomal RNA in FASTA format &lt;/p&gt;
&lt;p&gt;--mito &lt;br /&gt;
.fa &lt;br /&gt;
Reference sequences of Mitochondrial chromosome in FASTA format &lt;/p&gt;
&lt;p&gt;--gSam &lt;br /&gt;
.sam &lt;br /&gt;
Alignment to the genome in SAM format (this option is for cases where the RNA-Seq dataset has already been aligned to the genome) &lt;/p&gt;
&lt;p&gt;--tSam &lt;br /&gt;
.sam &lt;br /&gt;
Alignment to the transcriptome in SAM format (this option is for cases where the RNA-Seq dataset has already been aligned to the transcriptome) &lt;/p&gt;
&lt;p&gt;--cleanup &lt;br /&gt;
Delete temporary files (default setting is to save all files) &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;QC_SNP.sh -f Reads.fastq -o OutputPrefix -p --cleanup&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will give pre-alignment quality analysis only, based on the RNA-Seq dataset (Reads.fastq), and will delete temporary files.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;QC_SNP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;Reads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="n"&gt;RefGene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;RefTran&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;OutputPrefix&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;ribo&lt;/span&gt; &lt;span class="n"&gt;rRNA&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will align Reads.fastq.gz to the genome (&lt;code&gt;RefGene.fa&lt;/code&gt;), transcriptome (&lt;code&gt;RefTran.fa&lt;/code&gt;) and ribosomal RNA (rRNA.fa) and implement the pre- and post-alignment quality control analyses. It will also output the ribosomal RNA alignment report.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;QC_SNP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;OutputPrefix&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;gSam&lt;/span&gt; &lt;span class="n"&gt;AlignedToGenome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;tSam&lt;/span&gt; &lt;span class="n"&gt;AlignedToTran&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will give post-alignment quality analysis based on the user specified alignment SAM files to the genome (&lt;code&gt;AlignedToGenome.sam&lt;/code&gt;) and transcriptome(&lt;code&gt;AlignedToTran.sam&lt;/code&gt;).&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;QC_SNP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;Read1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;Read2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="n"&gt;RefGene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;RefTran&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;OutputPrefix&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will align the paired end RNA-Seq datasets Read1.fastq and Read2.fastq to the genome (&lt;code&gt;RefGene.fa&lt;/code&gt;) and transcriptome (&lt;code&gt;RefTran.fa&lt;/code&gt;) and implement the pre- and post-alignment quality control analyses and SNP calling.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;QC_SNP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="n"&gt;Read1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="n"&gt;Read2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="n"&gt;RefGene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;RefTran&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;OutputPrefix&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;mito&lt;/span&gt; &lt;span class="n"&gt;chrM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will align the paired end Read1.fastq.gz and Read2.fastq.gz to the genome (&lt;code&gt;RefGene.fa&lt;/code&gt;), transcriptome (&lt;code&gt;RefTran.fa&lt;/code&gt;) and mitochondrial chromosome (chrM.fa) and implement SNP calling based on the merged genome and transcriptome alignments.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="2-branch-2-expressionestimationsh"&gt;(2) Branch 2: &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; will implement expression level estimation for gene/exon/splice junctions based on the alignment to the transcriptome for single end or paired end RNA-Seq datasets. Beginning in version 2.1, only unique alignments are used in expression estimation. Prior to version 2.1, the best alignment for each read was used, which may or may not have been unique. &lt;/p&gt;
&lt;p&gt;If bowtie2 indexes already exist in the same directory as the transcriptome reference file, &lt;code&gt;RseqFlow&lt;/code&gt; will use those existing indexes. If it cannot find precomputed indexes, it will create the indexes in the output directory. If you plan on using the same reference file for future runs, you may wish to move the files ending in .bt2 from the output directory into the directory of the transcriptome reference. This will bypass index creation for future runs and reduce run times. &lt;/p&gt;
&lt;p&gt;If a fastq file and the reference sequences of the transcriptome are supplied as input, &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; will align the RNA-Seq dataset to the transcriptome and then implement expression level estimation. &lt;/p&gt;
&lt;p&gt;If you already have an alignment file to the transcriptome, you may input the .sam file and the alignment step will be skipped. &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; will implement expression level estimation. &lt;/p&gt;
&lt;p&gt;Please note that &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; expects a SAM file created by bowtie2. If your pre-existing SAM file was not created with bowtie2, it is strongly suggested that you run &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; without a SAM file and let &lt;code&gt;RseqFlow&lt;/code&gt; create its own; otherwise, there is no guarantee that &lt;code&gt;ExpressionEstimation&lt;/code&gt; will use only unique alignments. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Options&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Options &lt;br /&gt;
arguments &lt;br /&gt;
Meaning &lt;/p&gt;
&lt;p&gt;-f/--fastq &lt;br /&gt;
.fastq or .fastq.gz &lt;br /&gt;
The reads file for single end data in FASTQ format (fastq or fastq.gz) &lt;/p&gt;
&lt;p&gt;-1/--read1 &lt;br /&gt;
.fastq or .fastq.gz &lt;br /&gt;
The first reads file for paired end data in FASTQ format (fastq or fastq.gz) &lt;/p&gt;
&lt;p&gt;-2/--read2 &lt;br /&gt;
.fastq or .fastq.gz &lt;br /&gt;
The second reads file for paired end data in FASTQ format (fastq or fastq.gz) &lt;/p&gt;
&lt;p&gt;-c/--transcriptome &lt;br /&gt;
.fa &lt;br /&gt;
Transcriptome reference sequences in FASTA format &lt;/p&gt;
&lt;p&gt;-a/--annotation &lt;br /&gt;
.gtf &lt;br /&gt;
Reference annotation in GTF format &lt;/p&gt;
&lt;p&gt;-o/--output-prefix &lt;br /&gt;
prefix &lt;br /&gt;
Prefix of output files (default is Expression_output) &lt;/p&gt;
&lt;p&gt;--tSam &lt;br /&gt;
.sam &lt;br /&gt;
Alignments to the transcriptome in SAM format &lt;/p&gt;
&lt;p&gt;--cleanup &lt;br /&gt;
Delete temporary files (default setting is to save all files) &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;ExpressionEstimation.sh -f Reads.fastq -c RefTran.fa -a Anno.gtf -o OutputPrefix --cleanup&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will align the single end Reads.fastq to the transcriptome reference sequence (&lt;code&gt;RefTran.fa&lt;/code&gt;) and implement the expression level estimation using the distinct best alignment for each read. Temporary files will be deleted.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;ExpressionEstimation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;OutputPrefix&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;tSam&lt;/span&gt; &lt;span class="n"&gt;AlignedToTran&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command uses the supplied .sam file that has already been aligned to the transcriptome. &lt;code&gt;RseqFlow&lt;/code&gt; will implement the expression level estimation from the distinct best alignment for each read.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3 id="3-branch-3-desh"&gt;(3) Branch 3: &lt;code&gt;DE.sh&lt;/code&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;The de command will identify the differentially expressed genes for two conditions (e.g. case/control) using the output files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt;. If both conditions have only one sample, the &lt;code&gt;ExonExpressionLevel_unique&lt;/code&gt; files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; are used as input. Otherwise, if either condition has more than one sample, the &lt;code&gt;GeneExpressionLevel_unique&lt;/code&gt; files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; are used as input. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Options&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;1) DE.sh de &lt;code&gt;[options]&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Options &lt;br /&gt;
arguments &lt;br /&gt;
Meaning &lt;/p&gt;
&lt;p&gt;--c1 &lt;br /&gt;
Condition1_ID &lt;br /&gt;
ID for condition1 (e.g. Control) &lt;/p&gt;
&lt;p&gt;--c2 &lt;br /&gt;
Condition2_ID &lt;br /&gt;
ID for condition2 (e.g. Case) &lt;/p&gt;
&lt;p&gt;-1/--f1 &lt;br /&gt;
Comma-separated file name list for condition1. The files should be the Gene/Exon expression file(s) for condition1&lt;code&gt;*&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;-2/--f2 &lt;br /&gt;
Comma-separated file name list for condition2. The files should be the Gene/Exon expression file(s) for condition2&lt;code&gt;*&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;-o/--output-prefix &lt;br /&gt;
prefix &lt;br /&gt;
Prefix of output files &lt;/p&gt;
&lt;p&gt;&lt;code&gt;*&lt;/code&gt; If both conditions have only one sample each, use the &lt;code&gt;ExonExpressionLevel_unique&lt;/code&gt; files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; as input. Otherwise, if either condition has more than one sample, use the &lt;code&gt;GeneExpressionLevel_unique&lt;/code&gt; files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; as input. &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;DE.sh de --c1 disease --c2 normal --f1 disease_S1_whole_GeneExpressionLevel_unique.txt,disease_S2_whole_GeneExpressionLevel_unique.txt --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;In this case, there are two samples with the disease condition, so the &lt;code&gt;GeneExpressionLevel_unique&lt;/code&gt; files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; are used as input.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;DE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="n"&gt;de&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;c1&lt;/span&gt; &lt;span class="n"&gt;Brain&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;c2&lt;/span&gt; &lt;span class="n"&gt;Colon&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;f1&lt;/span&gt; &lt;span class="n"&gt;Brain_whole_ExonExpressionLevel_unique&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;f2&lt;/span&gt; &lt;span class="n"&gt;Colon_whole_ExonExpressionLevel_unique&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;txt&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;TissueCompare&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;In this case, both conditions have only a single sample each, so the &lt;code&gt;ExonExpressionLevel_unique&lt;/code&gt; files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; are used as input.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
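&lt;p&gt;Because --f1 and --f2 each take a single comma-separated value, the list must not contain spaces. The sketch below shows one way to build such a list from separate file names; join_by_comma is a hypothetical helper and is not part of RseqFlow. The DE.sh command is only printed here, not executed.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;
```shell
# Hypothetical helper (not shipped with RseqFlow): join all arguments
# into one comma-separated string with no spaces.
join_by_comma() {
    joined=""
    for name in "$@"; do
        if [ -z "$joined" ]; then
            joined=$name
        else
            joined="$joined,$name"
        fi
    done
    printf '%s\n' "$joined"
}

# File names taken from the first DE.sh example above.
disease_files=$(join_by_comma disease_S1_whole_GeneExpressionLevel_unique.txt disease_S2_whole_GeneExpressionLevel_unique.txt)

# Print the resulting command for inspection instead of running it.
echo "DE.sh de --c1 disease --c2 normal --f1 $disease_files --f2 normal_S1_whole_GeneExpressionLevel_unique.txt -o HeartDisease"
```
&lt;/pre&gt;&lt;/div&gt;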
&lt;h3 id="4-branch-4-fileformatconversionsh"&gt;(4) Branch 4: &lt;code&gt;FileFormatConversion.sh&lt;/code&gt;&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;FileFormatConversion.sh&lt;/code&gt; will convert between alignment file formats for storage or visualization convenience. It can convert SAM to BAM, MRF and WIG, BAM to BED, and MRF to WIG. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Options&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Options &lt;br /&gt;
arguments &lt;br /&gt;
Meaning &lt;/p&gt;
&lt;p&gt;-i/--input &lt;br /&gt;
File to convert &lt;br /&gt;
Input file to convert, with the correct file suffix (.sam, .bam, or .mrf) &lt;/p&gt;
&lt;p&gt;-o/--output &lt;br /&gt;
prefix &lt;br /&gt;
Prefix of output files &lt;/p&gt;
&lt;p&gt;-r/--reference &lt;br /&gt;
Reference sequence file that was used to align the input file &lt;br /&gt;
If a file is being converted from SAM to BAM format, the .sam file should either contain the appropriate @SQ headers, or you need to use the --reference option to supply the reference file that was used to create the .sam file. &lt;/p&gt;
&lt;p&gt;-b/--toBAM &lt;br /&gt;
Convert SAM to BAM &lt;/p&gt;
&lt;p&gt;-m/--toMRF &lt;br /&gt;
Convert SAM to MRF &lt;/p&gt;
&lt;p&gt;-d/--toBED &lt;br /&gt;
Convert BAM to BED &lt;/p&gt;
&lt;p&gt;-w/--toWIG &lt;br /&gt;
Convert SAM to WIG or MRF to WIG &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;FileFormatConversion.sh -i in.sam -o out -b -r ref.fa&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will convert the input in.sam file to a BAM file. If the SAM file does not contain the appropriate "@SQ" header lines, the reference sequences file (ref.fa) is required.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will convert the input in.sam file to an MRF file.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This command will convert the input in.bam file to a BED file for visualization.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
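&lt;p&gt;The pairings between input suffix and conversion flag listed in the options above can be checked before a conversion is started. This is a minimal sketch, not part of RseqFlow: conversion_allowed is a hypothetical helper that takes an input file name and one of the short flags, and it only encodes the pairings stated in this manual.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;
```shell
# Hypothetical helper (not shipped with RseqFlow): succeed only when the
# input suffix matches a conversion that the manual lists for the flag.
#   -b  SAM to BAM      -m  SAM to MRF
#   -d  BAM to BED      -w  SAM to WIG or MRF to WIG
conversion_allowed() {
    input=$1
    flag=$2
    case "$flag:$input" in
        -b:*.sam|-m:*.sam|-w:*.sam|-w:*.mrf|-d:*.bam) return 0 ;;
        *) return 1 ;;
    esac
}

if conversion_allowed in.bam -d; then
    echo "in.bam can be converted with -d"
fi
if ! conversion_allowed in.bam -b; then
    echo "in.bam cannot be converted with -b"
fi
```
&lt;/pre&gt;&lt;/div&gt;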
&lt;h2 id="346-input-files-and-formats"&gt;3. Input files and formats&lt;/h2&gt;
&lt;h3 id="1-possible-input-files-for-qc_snpsh-depends-on-the-selected-options"&gt;(1) Possible input files for QC_SNP.sh (Depends on the selected options)&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Genome annotation GTF file &lt;/li&gt;
&lt;li&gt;Transcriptome reference sequences &lt;/li&gt;
&lt;li&gt;Genome reference sequences &lt;/li&gt;
&lt;li&gt;RNA-Seq fastq or fastq.gz file &lt;/li&gt;
&lt;li&gt;Alignment files in SAM format &lt;/li&gt;
&lt;li&gt;Reference sequences of Mitochondria &lt;/li&gt;
&lt;li&gt;Reference sequences of Ribosomal RNA &lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="2-possible-input-files-for-expressionestimationsh-depends-on-the-selected-options"&gt;(2) Possible input files for &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt; (Depends on the selected options)&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Genome annotation GTF file &lt;/li&gt;
&lt;li&gt;Transcriptome reference sequences &lt;/li&gt;
&lt;li&gt;RNA-Seq fastq or fastq.gz file &lt;/li&gt;
&lt;li&gt;Alignment files in SAM format &lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="3-possible-input-files-for-desh"&gt;(3) Possible input files for &lt;code&gt;DE.sh&lt;/code&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Output files from &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt;: &lt;code&gt;whole_GeneExpressionLevel_unique.txt&lt;/code&gt; or &lt;code&gt;whole_ExonExpressionLevel_unique.txt&lt;/code&gt; files. &lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="4-format-specification-of-input-files"&gt;(4) Format specification of input files&lt;/h3&gt;
&lt;blockquote&gt;
&lt;p&gt;The following annotations and references should be in separate files. &lt;code&gt;RseqFlow&lt;/code&gt; will automatically split the files during processing, if necessary. All eukaryotic species with files in the required formats can be analyzed in the &lt;code&gt;RseqFlow&lt;/code&gt; pipeline. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Genome Annotation GTF file&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;The file must be in GTF format, version 2.0 through 3.0. &lt;/p&gt;
&lt;/blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Transcriptome Reference Sequences&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;The transcript names must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome end location as long as there is a space separating the chromosome end from the extra info. &lt;/p&gt;
&lt;p&gt;&lt;code&gt;“&amp;gt;$GenomeName_$AnnotationSource_$TranscriptsID=$Chromosome:$Start-$End [extra info]”&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;For example, &lt;/p&gt;
&lt;p&gt;“&amp;gt;hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 5'pad=0 3'pad=0 strand=- repeatMasking=none” &lt;/p&gt;
&lt;/blockquote&gt;
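&lt;p&gt;A transcript header can be checked against this layout before running the pipeline. The Python sketch below is illustrative only and is not taken from RseqFlow: the names &lt;code&gt;HEADER_RE&lt;/code&gt; and &lt;code&gt;parse_transcript_header&lt;/code&gt; are hypothetical, and the pattern assumes that the genome name and the annotation source contain no underscores (the transcript ID may).&lt;/p&gt;

```python
import re

# Assumption: genome name and annotation source contain no underscores, so the
# first two underscores separate the three name fields. The transcript ID may
# itself contain underscores. Anything after the end coordinate is extra info.
HEADER_RE = re.compile(
    r"^>([^_\s]+)_([^_\s]+)_([^=\s]+)=([^:\s]+):(\d+)-(\d+)(?:\s.*)?$"
)


def parse_transcript_header(header):
    """Return (genome, source, transcript_id, chromosome, start, end) or None."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    genome, source, transcript_id, chromosome, start, end = match.groups()
    return genome, source, transcript_id, chromosome, int(start), int(end)


# The example header from this manual.
example = (">hg19_wgEncodeGencodeManualV4_ENST00000480075=chr7:19757-35457 "
           "5'pad=0 3'pad=0 strand=- repeatMasking=none")
print(parse_transcript_header(example))
```

&lt;p&gt;A return value of &lt;code&gt;None&lt;/code&gt; indicates a header that does not follow the layout, for instance one with no coordinates after the transcript ID.&lt;/p&gt;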
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Genome Reference Sequences&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;The chromosome name must begin with “&amp;gt;” and should meet the format requirements below. Extra information may follow the chromosome as long as there is a space separating the chromosome from the extra info. &lt;/p&gt;
&lt;p&gt;“&amp;gt;$chromosome” &lt;/p&gt;
&lt;p&gt;For example, &lt;/p&gt;
&lt;p&gt;“&amp;gt;chr1 dna:chromosome” &lt;/p&gt;
&lt;p&gt;"&amp;gt;chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF"&lt;/p&gt;
&lt;p&gt;“&amp;gt;chrM” &lt;/p&gt;
&lt;/blockquote&gt;
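&lt;p&gt;Under this rule, the chromosome name is the text between the leading marker and the first space. A small Python sketch of that extraction follows; the function name &lt;code&gt;chromosome_name&lt;/code&gt; is hypothetical and the code is only an illustration of the rule stated above, not RseqFlow's own parser.&lt;/p&gt;

```python
def chromosome_name(header):
    """Return the chromosome name from a genome FASTA header line.

    The name must directly follow the leading marker; any extra information
    must be separated from it by whitespace. Hypothetical helper.
    """
    body = header.rstrip("\n")
    if not body.startswith(">") or len(body) == 1 or body[1].isspace():
        raise ValueError("not a valid chromosome header: %r" % header)
    # Keep only the first whitespace-delimited token after the marker.
    return body[1:].split(None, 1)[0]


# The three example headers from this manual.
for line in (">chr1 dna:chromosome",
             ">chr21 dna:chromosome chromosome:GRCh37:21:1:48129895:1 REF",
             ">chrM"):
    print(chromosome_name(line))
```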
&lt;h2 id="446-sample-datasets"&gt;4. Sample datasets&lt;/h2&gt;
&lt;h3 id="1-c-elegans"&gt;(1) C. elegans&lt;/h3&gt;
&lt;p&gt;The following sample datasets are available for C. elegans: &lt;/p&gt;
&lt;p&gt;&lt;a class="" href="http://code.google.com/p/rseqflow/downloads/detail?name=SRR514378.fastq.gz&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(1) RNA-Seq dataset&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="https://code.google.com/p/rseqflow/downloads/detail?name=genome_c.elegans.fa.gz&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(2) Reference genome sequences file&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="https://code.google.com/p/rseqflow/downloads/detail?name=c.elegans_refseq_anno.gtf&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(3) Reference annotation gtf file&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="https://code.google.com/p/rseqflow/downloads/detail?name=c.elegans_refseq_seq.fa&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(4) Reference transcriptome sequences file&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(5) Commands for running three of the branches: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Before running the pipeline, create a new directory for output results: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;mkdir&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;i) Branch 1: &lt;code&gt;QC_SNP.sh&lt;/code&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;QC_SNP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;SRR514378&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="n"&gt;genome_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_refseq_seq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_refseq_anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;ii) Branch 2: &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;ExpressionEstimation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;SRR514378&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_refseq_seq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_refseq_anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;ExpressionEstimation_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;iii) Branch 4: &lt;code&gt;FileFormatConversion.sh&lt;/code&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_genome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="n"&gt;genome_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_genome&lt;/span&gt;

&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_transcriptome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_transcriptome&lt;/span&gt;

&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_genome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_genome&lt;/span&gt;

&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_genome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elegans_Bowtie2_genome&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/blockquote&gt;
&lt;h3 id="2-human"&gt;(2) Human&lt;/h3&gt;
&lt;p&gt;&lt;a class="" href="http://code.google.com/p/rseqflow/downloads/detail?name=ERR030893_2M.fastq.gz&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(1) RNA-Seq dataset&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="http://genomics.isi.edu/downloads/Human_genome_GRCh37_69.fa.gz" rel="nofollow"&gt;(2) Reference genome sequences file&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="http://code.google.com/p/rseqflow/downloads/detail?name=Human_chrM.fa&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(3) Mitochondria sequence file&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="http://code.google.com/p/rseqflow/downloads/detail?name=Human_gencodeV14_anno.gtf.gz&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(4) Reference annotation gtf file (gencodeV14)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="" href="http://code.google.com/p/rseqflow/downloads/detail?name=Human_gencodeV14_transcripts.fa.gz&amp;amp;can=2&amp;amp;q=" rel="nofollow"&gt;(5) Reference transcriptome sequences file (gencodeV14)&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(6) Commands for running three of the branches: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Before running the pipeline, create a new directory for output results: &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;mkdir&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Unzip the reference files. You may wish to create a separate directory so that you can use these references for future runs. &lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;gunzip&lt;/span&gt; &lt;span class="n"&gt;Human_gencodeV14_anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt;
&lt;span class="n"&gt;gunzip&lt;/span&gt; &lt;span class="n"&gt;Human_gencodeV14_transcripts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt;
&lt;span class="n"&gt;gunzip&lt;/span&gt; &lt;span class="n"&gt;Human_genome_GRCh37&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;i) Branch 1: &lt;code&gt;QC_SNP.sh&lt;/code&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;QC_SNP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;ERR030893_2M&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="n"&gt;Human_genome_GRCh37&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;Human_gencodeV14_transcripts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Human_gencodeV14_anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="n"&gt;mito&lt;/span&gt; &lt;span class="n"&gt;Human_chrM&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;ii) Branch 2: &lt;code&gt;ExpressionEstimation.sh&lt;/code&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;ExpressionEstimation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="n"&gt;ERR030893_2M&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fastq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gz&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="n"&gt;Human_gencodeV14_transcripts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Human_gencodeV14_anno&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;gtf&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;ExpressionEstimation_human&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;iii) Branch 4: &lt;code&gt;FileFormatConversion.sh&lt;/code&gt;&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_genome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="n"&gt;Human_genome_GRCh37&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fa&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_genome&lt;/span&gt;

&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_transcriptome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_transcriptome&lt;/span&gt;

&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_genome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_genome&lt;/span&gt;

&lt;span class="n"&gt;FileFormatConversion&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_genome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bam&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="n"&gt;Human_Output&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;QC_human_Bowtie2_genome&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/blockquote&gt;&lt;/div&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anonymous</dc:creator><pubDate>Wed, 29 Jan 2014 18:26:42 -0000</pubDate><guid>https://sourceforge.nete8144cf92730ea0b9f755db95ea160d65b75a12c</guid></item></channel></rss>