Dear Andrew,
Thank you for the tool.
I have been using it for fusion finding, and I have a question about this line from the manual:
splitr_span_pvalue: p-value, lower values are evidence the prediction is a false positive
What does it really mean? Since it states that lower values indicate a false positive, should I choose a cut-off above 0.05?
Also, for some fusion genes the read counts exceed the total number of reads in the sample (I mean more than the number of reads in the FASTA file). I assume this is due to multimapping, but I am not clear how.
I am using results.filtered.tsv as my output file and extracting the results (fusion genes) from there. My cut-offs for now are the span and split counts.
It would be nice if you could explain the p-value here and say a little about multimapping (how it is handled).
Thank you
/A
For a more in-depth explanation of those p-values, please refer to this section of the deFuse supplementary methods:
4.7.2 Split position p-value and minimum split anchor p-value
and this section of the deFuse paper's main text:
Corroborating spanning read and split read evidence.
OK, thank you. I cannot find the manual (the deFuse supplementary methods) on the usual site. Could you please paste the link here?
Thank you
See Supplementary Text S1 on the PLoS Computational Biology site:
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1001138#s5
Also, I have a question related to this one. Supplementary S1 has a column named classification, which I cannot find in any of the results.*.tsv files in the deFuse output directory.
Where can I find the AdaBoost classification of fusions as TRUE or FALSE?
That information may be out of date. The classification is just a threshold on the classifier probability and can be easily regenerated based on the threshold of your choosing.
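As a rough illustration of regenerating that column, here is a minimal Python sketch. It assumes the results file is tab-separated and that the classifier probability is stored in a column named probability; that column name, the cluster_id column, the helper name classify_fusions, and the 0.5 threshold are all assumptions made for the example, so check them against the header of your own results.*.tsv files.

```python
import csv
import io


def classify_fusions(tsv_text, threshold=0.5, prob_column="probability"):
    """Add a True/False 'classification' to each row by thresholding the
    classifier probability. Column names and threshold are assumptions."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    labelled = []
    for row in reader:
        row = dict(row)
        # A prediction is called True when its probability reaches the threshold.
        row["classification"] = float(row[prob_column]) >= threshold
        labelled.append(row)
    return labelled


# Made-up rows for illustration only; this is not real deFuse output.
example = (
    "cluster_id\tprobability\n"
    "1\t0.93\n"
    "2\t0.12\n"
)

for row in classify_fusions(example, threshold=0.5):
    print(row["cluster_id"], row["classification"])
```

To run it on a real file, read the file contents into a string and pass that in place of example, with whichever threshold suits your analysis.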