Menu

#2 mirpro_feature_pro bug

1.0
closed
None
2015-09-30
2015-09-20
liangc_mu
No

A potential bug for mirpro_feature_pro. Here is the reason and my reasoning:

I have one sam file named "9.gsnap.sam", then, I used samtools to get its bam files: "9.gsnap.bam" and "9.gsnap.sorted.bam".

Here is what I did:

(1)
htseq-count -f bam -s no -a 10 9.gsnap.sorted.bam ../ensembl/release-81/Gallus_gallus.Galgal4.81.gtf > 9.ht_count

(2)
mirpro_feature_pro -i 9.gsnap.sam -f sam -s no -a 10 -t ../ensembl/release-81/Gallus_gallus.Galgal4.81.gtf -o 9.mirpro_count

(3)
mirpro_feature_pro -i 9.gsnap.bam -f bam -s no -a 10 -t ../ensembl/release-81/Gallus_gallus.Galgal4.81.gtf -o 9.mirpro_bam_count

(4)
mirpro_feature_pro -i 9.gsnap.sorted.bam -f bam -s no -a 10 -t ../ensembl/release-81/Gallus_gallus.Galgal4.81.gtf -o 9.mirpro_sortedbam_count

What is interesting is that the file I got from (1) is identical to all three files with the name "htseq_count_output" in all of these folders. However, the result.individual.csv and result.summary.csv are slight different. So, I think mipro_feature_pro has a bug somewhere. But, I am not sure where is the bug.

Discussion

  • liangc_mu

    liangc_mu - 2015-09-30
    • status: open --> closed
     
  • liangc_mu

    liangc_mu - 2015-09-30

    We found that some aligners (e.g., GSNAP) will tag a read with "NH:i:1" even this read has a transplicing mapping (e.g., half of the read is mapped to one locus whereas the second half is mapped to another locus). In mirpro_feature_pro of mirPRo, we will treat this read as aligned, but not uniquely mapped read. Based on HTSeq package, mirpro_feature_pro can be used for mapped read cataloging for both RNA-Seq data and small RNA-Seq data. We believe this is more accurate way for read cataloging.

     

Log in to post a comment.

MongoDB Logo MongoDB
Gen AI apps are built with MongoDB Atlas
Atlas offers built-in vector search and global availability across 125+ regions. Start building AI apps faster, all in one place.