Dear Simon Anders,
I am working with a non-model organism and have used Trinity to assemble the de novo transcriptome (essentially a flat FASTA file with ~620,000 transcripts in this case). Now I need the counts for downstream analyses. Trinity does generate a count matrix using RSEM. However, this is not a matrix of integers, which edgeR and DESeq require (if I understand this correctly).
So I would like to generate the count matrix from the experiment myself, and would like to ask whether this sounds like a problem that htseq-count can easily solve, given the files that I have: 1) the FASTA 'reference' transcriptome and 2) the paired-end (PE) read files.
Thank you.
Regards
If you have aligned to a transcriptome, then each gene is a different reference sequence, right? So you only need to look at the RNAME field in the SAM file and can ignore the POS field altogether. This does not look like a problem for htseq-count. You can, of course, use HTSeq to write your own counting script, but maybe some script-fu already does the job.
BTW, I hope you have thought about how to resolve ambiguities due to isoforms.
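To illustrate the RNAME-only counting described above, here is a minimal sketch in plain Python (no HTSeq needed). It assumes the reads were aligned to the Trinity FASTA and written as SAM text; the function name `count_by_rname` and the example transcript IDs are made up for illustration. It counts each paired-end fragment once via its first mate and skips unmapped, secondary and supplementary records. It does nothing clever about reads that fit several isoforms, so the ambiguity mentioned above remains unresolved here.

```python
import collections
import io


def count_by_rname(sam_lines):
    """Count aligned fragments per reference sequence (SAM column 3, RNAME)."""
    counts = collections.Counter()
    for line in sam_lines:
        if line.startswith("@"):  # header line, not an alignment record
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        flag = int(fields[1])
        rname = fields[2]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800 or rname == "*":
            continue
        # Paired-end data: count the fragment once, through its first mate (0x40).
        # Simplification: a fragment whose first mate is unmapped is not counted.
        if flag & 0x1 and not flag & 0x40:
            continue
        counts[rname] += 1
    return counts


# Tiny made-up SAM input; transcript IDs are hypothetical.
example_sam = "\n".join([
    "@SQ\tSN:comp1_c0_seq1\tLN:1000",
    "@SQ\tSN:comp2_c0_seq1\tLN:800",
    "r1\t99\tcomp1_c0_seq1\t10\t60\t50M\t=\t200\t240\t*\t*",
    "r1\t147\tcomp1_c0_seq1\t200\t60\t50M\t=\t10\t-240\t*\t*",
    "r2\t99\tcomp2_c0_seq1\t5\t60\t50M\t=\t150\t195\t*\t*",
    "r2\t147\tcomp2_c0_seq1\t150\t60\t50M\t=\t5\t-195\t*\t*",
    "r3\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r3\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t99\tcomp1_c0_seq1\t300\t60\t50M\t=\t500\t250\t*\t*",
    "r4\t147\tcomp1_c0_seq1\t500\t60\t50M\t=\t300\t-250\t*\t*",
    "r4\t355\tcomp2_c0_seq1\t20\t0\t50M\t=\t220\t250\t*\t*",
])

counts = count_by_rname(io.StringIO(example_sam))
for name in sorted(counts):
    print(name, counts[name], sep="\t")
```

For a real run, one would pass an open SAM file handle instead of the `io.StringIO` example and write one such column per sample to build the integer matrix.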
Dear Simon,
If I interpret you correctly, then this is not a problem that HTSeq, as it is now, would solve.
I have been made aware that the Corset project has built a tool for my case.
Anyway, thanks for the feedback.
jahn
From: Simon Anders [mailto:sanders_muc@users.sf.net]
Sent: 5 January 2014 00:05
To: [htseq:support-requests]
Subject: [htseq:support-requests] #29 making the count matrix from a fasta reference
[support-requests:#29] http://sourceforge.net/p/htseq/support-requests/29/ making the count matrix from a fasta reference
Status: open
Created: Sat Dec 21, 2013 06:16 AM UTC by JahnDavik
Last Updated: Sat Dec 21, 2013 06:16 AM UTC
Owner: nobody