Hi, Andrew:
Thank you for developing deFuse. I am a bioinformatics student working
on fusion detection using RNA-seq. I built a very naive fusion simulator
to estimate the sensitivity and FDR of existing methods. However, when I
compared the deFuse results to my simulated fusion breakpoints, deFuse
never found the breakpoint exactly: even when the partner was identified
correctly, the closest predicted breakpoint was at least 30 bp away from
the true one.
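The comparison described above can be sketched roughly as follows. This is
only an illustration: the function name `evaluate`, the tuple layout
(5' gene, 3' gene, 5' breakpoint, 3' breakpoint), and the tolerance
parameter `tol` are assumptions made for the example, not part of deFuse
or of the actual evaluation scripts.

```python
def evaluate(truth, calls, tol=0):
    """Return (sensitivity, FDR) for predicted fusions against simulated ones.

    truth, calls: lists of (gene5, gene3, pos5, pos3) tuples (hypothetical layout).
    A call is a true positive if the gene pair matches a simulated fusion
    and both breakpoints lie within `tol` bp of the simulated positions.
    """
    def offset(t, c):
        # Largest breakpoint error over the two sides of the fusion.
        return max(abs(t[2] - c[2]), abs(t[3] - c[3]))

    found = set()      # indices of simulated fusions that were recovered
    true_calls = 0     # number of calls that match some simulated fusion
    for c in calls:
        hits = [i for i, t in enumerate(truth)
                if t[:2] == c[:2] and offset(t, c) <= tol]
        if hits:
            true_calls += 1
            found.update(hits)

    sensitivity = len(found) / len(truth) if truth else 0.0
    fdr = 1 - true_calls / len(calls) if calls else 0.0
    return sensitivity, fdr
```

With an exact-match criterion (`tol=0`), a call that is 30 bp off counts as
a miss and as a false positive at the same time, so the choice of tolerance
strongly affects both numbers.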
I call it naive because I simply pick two random exons, use their exon
boundaries as the breakpoint, and randomly generate fusion-supporting
reads (spanning and split) from the reference sequence. I ran BLAT on
those reads to check their reliability, and BLAT finds the breakpoint
correctly. Finally, I merged the simulated reads with all properly
paired aligned reads as background to build the dataset for running
deFuse.
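The procedure described above can be sketched roughly as follows. This is
an illustrative sketch, not the actual simulator: the function names, the
read length, fragment length, and minimum anchor are all assumed values,
and the sketch ignores sequencing errors, quality scores, and strand.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def simulate_fusion_reads(exon_a, exon_b, read_len=50, frag_len=200,
                          n_split=10, n_span=10, min_anchor=10, seed=0):
    """Join two exons at their boundaries and sample supporting reads.

    Returns (breakpoint, split_reads, span_pairs), where breakpoint is the
    index of the first base of exon_b in the fusion sequence.
    """
    rng = random.Random(seed)
    fusion = exon_a + exon_b
    bp = len(exon_a)

    # Split reads: single reads crossing the breakpoint, with at least
    # min_anchor bases on each side.
    lo = max(0, bp - read_len + min_anchor)
    hi = min(bp - min_anchor, len(fusion) - read_len)
    if lo > hi:
        raise ValueError("exons too short for split reads")
    split_reads = []
    for _ in range(n_split):
        start = rng.randint(lo, hi)
        split_reads.append(fusion[start:start + read_len])

    # Spanning pairs: fragment covers the breakpoint, read 1 lies entirely
    # in exon_a and read 2 (reverse complemented) entirely in exon_b.
    lo = max(0, bp - frag_len + read_len)
    hi = min(bp - read_len, len(fusion) - frag_len)
    if lo > hi:
        raise ValueError("exons too short for spanning pairs")
    span_pairs = []
    for _ in range(n_span):
        start = rng.randint(lo, hi)
        frag = fusion[start:start + frag_len]
        span_pairs.append((frag[:read_len], revcomp(frag[-read_len:])))

    return bp, split_reads, span_pairs
```

Because both exons are joined exactly at their boundaries, the true
breakpoint is known by construction, which is what makes the comparison
with the predicted breakpoints possible.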
So I would like to ask your opinion. Is this caused by important factors
that deFuse takes into account but my simulator is missing, so that I
need to add them to my simulator? Or is it just a small bug in deFuse? In
my opinion, if it is really a fusion, the detection algorithm should be
able to find it.
I can send you my simulated reads, but the file is 35M and too big to be
sent by email.
Thank you
Yuxiang Tan
Last edit: yuxiang tan 2014-02-08