Name                          Modified     Size        Downloads / Week
readme.txt                    2014-06-29   2.2 kB
how_to_use_CollapseFPKM.mp4   2014-05-18   3.5 MB
temp3_genes.fpkm_tracking     2014-05-18   470 Bytes
temp1_genes.fpkm_tracking     2014-05-18   470 Bytes
temp2_genes.fpkm_tracking     2014-05-18   470 Bytes
CollapseFPKM.pl               2014-05-18   4.6 kB
CollapseFPKM.sh               2014-05-18   2.3 kB
Totals: 7 Items                            3.5 MB      0
####################
Website: https://sourceforge.net/projects/collapsefpkm/
Updated on: June 29, 2014


This code collapses duplicate rows for the same gene into a single row by summing their FPKMs.

Problem/Issue:
In the Cufflinks output files *_genes.expr (which report gene-level coordinates and expression values; named genes.fpkm_tracking in the Cufflinks v2.0.2 output this tool was tested on), the same gene sometimes appears in more than one row. In these cases the FPKM values of the transcripts are not summed into one gene-level value, even though the transcripts are assigned to the same gene.

Reasons and Solution:
The multiple FPKM problem occurs when a gene has transcripts that do not overlap with any other transcripts in the gene. For example, this occurs in the ENSG00000125388 gene from ENSEMBL/hg19. We are aware of this issue and will eventually change the behavior, but for now a simple solution is to sum the FPKMs, since the gene FPKM is just the sum of the transcript FPKMs anyway.
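
The summing idea can be sketched with awk. This is only an illustration, not the logic of CollapseFPKM.pl: it assumes a tab-separated file whose header row contains columns named "tracking_id" and "FPKM", and the sample rows (GENE_A, GENE_B) are made up.

```shell
#!/bin/sh
# Illustrative sketch only; NOT the code in CollapseFPKM.pl.
# Assumption: tab-separated input with a header row that names
# a "tracking_id" column and an "FPKM" column. Sample data is invented.
tmp=$(mktemp -d)
printf 'tracking_id\tlocus\tFPKM\n'    >  "$tmp/genes.fpkm_tracking"
printf 'GENE_A\tchr1:100-200\t1.5\n'   >> "$tmp/genes.fpkm_tracking"
printf 'GENE_B\tchr2:100-200\t4\n'     >> "$tmp/genes.fpkm_tracking"
printf 'GENE_A\tchr1:900-950\t2.5\n'   >> "$tmp/genes.fpkm_tracking"

collapsed=$(awk -F'\t' '
    # Header row: locate the id and FPKM columns by name.
    NR == 1 {
        for (i = 1; i <= NF; i++) {
            if ($i == "tracking_id") id = i
            if ($i == "FPKM")        f  = i
        }
        next
    }
    # Data rows: remember first-seen order, accumulate FPKM per id.
    {
        if (!($id in sum)) order[++n] = $id
        sum[$id] += $f
    }
    # Print one line per id with the summed FPKM.
    END {
        for (k = 1; k <= n; k++) printf "%s\t%s\n", order[k], sum[order[k]]
    }
' "$tmp/genes.fpkm_tracking")

printf '%s\n' "$collapsed"
rm -rf "$tmp"
```

The sketch keeps only the id and the summed FPKM; other columns (locus, confidence bounds, status) would need their own merge rule, which is not specified here.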

The details are discussed at http://seqanswers.com/forums/showthread.php?t=5224


How to check for duplicates: run the two commands below. If they print different numbers, the first column contains duplicate entries.
 awk '{print $1}' ./xxx/genes.fpkm_tracking | wc -l 
 awk '{print $1}' ./xxx/genes.fpkm_tracking | sort | uniq | wc -l
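
To see which entries are duplicated, not just whether the counts differ, `uniq -d` can be used in the same pipeline. A minimal sketch on a made-up sample file (the GENE_A/GENE_B rows and the temporary path are assumptions for illustration):

```shell
#!/bin/sh
# Illustrative sketch on invented sample data.
tmp=$(mktemp -d)
file="$tmp/genes.fpkm_tracking"
printf 'tracking_id\tFPKM\nGENE_A\t1.5\nGENE_B\t4\nGENE_A\t2.5\n' > "$file"

# Same two counts as above; tr strips padding that some wc versions add.
total=$(awk '{print $1}' "$file" | wc -l | tr -d ' ')
unique=$(awk '{print $1}' "$file" | sort | uniq | wc -l | tr -d ' ')

# uniq -d prints each value that occurs more than once (input must be sorted).
dups=$(awk '{print $1}' "$file" | sort | uniq -d)

if [ "$total" -ne "$unique" ]; then
    echo "duplicates found:"
    printf '%s\n' "$dups"
fi
rm -rf "$tmp"
```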

####################
# Original version (named collapse.pl, collapse.sh) was coded by Madelaine Gogol 
#
#    collapse.pl - for a genes.expr file from cufflinks... 
#         collapse genes with multiple lines into one, summing the FPKMs.
#    By Madelaine Gogol 
#    9/2010
#    Reference:  http://seqanswers.com/forums/showthread.php?t=5224
#    Original version:  http://research.stowers.org/mcm/collapse.tar.gz
# 
####################
# New version (named CollapseFPKM.pl, CollapseFPKM.sh ) was coded by Aimin Li
#
#    Version v1.0
#    May 2014
#    Tested on output files of Cufflinks v2.0.2 
#    liaiminmail@gmail.com
#    New version:  https://sourceforge.net/p/collapsefpkm/
#
####################
# HOW TO USE
#
#    step 1: download CollapseFPKM.pl, CollapseFPKM.sh
#    step 2: chmod u+x CollapseFPKM.sh
#    step 3: ./CollapseFPKM.sh folder_of_[genes.fpkm_tracking]_files
#    step 4: view outputs: *.collapse
#
####################
Source: readme.txt, updated 2014-06-29