transdecoder-users Mailing List for TranscriptDecoder (Page 2)
Extracting likely coding regions from transcript sequences
Brought to you by:
bhaas
From: Jon L. <jon...@gm...> - 2014-08-20 09:29:40
Hi Brian,

Actually, apologies: it looks like TransDecoder is working fine. A trans-spliced gene had caused some problems in the transcriptome build, generating a very large erroneous transcript sequence.

Thanks and best wishes,
Jon
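An oversized, erroneous transcript like the one described above can often be spotted by listing the longest records in the input FASTA before running anything else. The following is a minimal sketch, not part of TransDecoder; the function names and the `top_n` parameter are hypothetical and chosen only for illustration.

```python
def sequence_lengths(fasta_lines):
    """Yield (record_id, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Record ID is the first whitespace-delimited token after '>'.
            name, length = (line[1:].split() or [""])[0], 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def longest_records(fasta_lines, top_n=5):
    """Return the top_n (record_id, length) pairs, longest first."""
    ranked = sorted(sequence_lengths(fasta_lines), key=lambda rec: rec[1], reverse=True)
    return ranked[:top_n]


# Toy input (made up for illustration); in practice pass an open file handle.
toy = [">tx1 desc", "ACGT", "AC", ">tx2", "A" * 50, ">tx3", "ACG"]
for name, length in longest_records(toy, top_n=2):
    print(f"{length}\t{name}")
```

A record that is far longer than the rest of the assembly is a candidate for closer inspection; what counts as "far longer" depends on the organism and the assembler.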
From: Brian H. <bh...@br...> - 2014-08-19 11:48:44
OK - I look forward to learning more.

thx,
~brian
From: Jon L. <jon...@gm...> - 2014-08-19 11:45:49
Hi Brian,

Yes, I just tried that, and it looks like the documentation is up to date, but it's silently crashing at some point.

I've just started trying to debug to find out at which point it stops running. I will let you know if I find anything.

Thanks and best wishes,
Jon
From: Brian H. <bh...@br...> - 2014-08-19 11:35:42
Hi Jon,

The documentation could be out of date. Can you try running the sample data set through and see if it generates the expected output files?

cd sample_data/
./runMe.sh

best,
~brian
From: Jon L. <jon...@gm...> - 2014-08-19 10:27:54
Hi,

I've run TransDecoder a couple of times now. However, it only generates the temporary folder with the three files:

longest_orfs.pep : all ORFs meeting the minimum length criteria, regardless of coding potential.
longest_orfs.gff3 : positions of all ORFs as found in the target transcripts
longest_orfs.cds : the nucleotide coding sequence for all detected ORFs

No other files are generated, e.g. "longest_orfs.cds.top_500_longest", or the final output files in the current working directory, e.g. "transcripts.fasta.transdecoder.pep".

Is the documentation (http://transdecoder.sourceforge.net/) out of date, or is TransDecoder failing silently? I couldn't see any issues with memory usage etc.

Thanks,
Jon
From: Brian H. <bh...@br...> - 2014-08-18 14:14:24
Hi Dr. Jon,

The 'complete' designation indicates that both a start and stop codon are found. If it's 3' partial, then it's missing the 3' end (lacking a stop codon).

There might be other competing methods, but I'm not sure what they are. Others might comment here.

best,
~brian
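To make the header fields discussed above easier to work with, here is a minimal sketch that splits a peptide FASTA header into its parts. The regular expression is inferred only from the example headers quoted in this thread, so treat the layout as an assumption; `parse_header` and `has_stop_codon` are hypothetical helper names, and only the two `type:` values explained above are mapped.

```python
import re

# Layout assumed from the headers shown in this thread:
# >ORF_ID GENE_ID type:TYPE len:N gc:CODE TRANSCRIPT:START-END(STRAND)
HEADER_RE = re.compile(
    r"^>?(?P<orf_id>\S+)\s+(?P<gene_id>\S+)\s+type:(?P<type>\S+)"
    r"\s+len:(?P<len>\d+)\s+gc:(?P<gc>\S+)"
    r"\s+(?P<transcript>\S+):(?P<start>\d+)-(?P<end>\d+)\((?P<strand>[+-])\)$"
)

# Per the explanation above: 'complete' has start and stop codons,
# '3prime_partial' lacks the stop codon. Other types are left unmapped.
STOP_CODON_BY_TYPE = {"complete": True, "3prime_partial": False}


def parse_header(line):
    """Return the header fields as a dict, with len/start/end as integers."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized header layout: {line!r}")
    fields = match.groupdict()
    for key in ("len", "start", "end"):
        fields[key] = int(fields[key])
    return fields


def has_stop_codon(orf_type):
    """True/False for the types explained in the thread, None otherwise."""
    return STOP_CODON_BY_TYPE.get(orf_type)


header = (">CG43321-RA.A|m.2 CG43321-RA.A|g.2 type:3prime_partial "
          "len:122 gc:universal CG43321-RA.A:363-1(-)")
record = parse_header(header)
print(record["type"], record["strand"], has_stop_codon(record["type"]))
```

In the example header the coordinates run 363-1 on the (-) strand, so the start value is larger than the end value; the sketch keeps them in the order written rather than sorting them.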
From: Jon L. <jon...@gm...> - 2014-08-18 13:48:16
Dear TransDecoder Team,

I have started using TransDecoder and have an output peptide file.

I was interested in understanding the header output. For example, what does type:3prime_partial mean compared to type:complete?

>CG43321-RA.A|m.2 CG43321-RA.A|g.2 type:3prime_partial len:122 gc:universal CG43321-RA.A:363-1(-)
MHKFAPNSFGSPCRSSICAMVVILIIFRDIGRIKIKYFEELGQFLAHKSAQYNAGYNCDENSDVLVFIVLLSNFIYFFKVLEVVSTKWHFCVFISVTDFSVCSEMEYDIILIEISRLSPAK
>CG43321.a|m.3 CG43321.a|g.3 type:3prime_partial len:103 gc:universal CG43321.a:306-1(-)
MVVILIIFRDIGRIKIKYFEELGQFLAHKSAQYNAGYNCDENSDVLVFIVLLSNFIYFFKVLEVVSTKWHFCVFISVTDFSVCSEMEYDIILIEISRLSPAK
>RluA-1.d|m.4 RluA-1.d|g.4 type:complete len:294 gc:universal RluA-1.d:292-1173(+)
MEQQIRNRQVEKEYICRVEGVFPDGIVECKEPIEVVSYKIGVCKVSAKGKDCTTTFQKLSQNGTTSVVLCKPLTGRMHQIRVHLQYLGYPILNDPLYNHEVFGPLKGRSGDIGGKSDEELIRDLINIHNAENWLGIDCDSDISMFKSTKDEADRESLSSEHT
SVVHHSDDDGCVNSRETTPPCNEPQQPENSVKLLETTNAVKEYQVAAQKSSSEICPAPLDAVESPLNGGGCNVGKVTVDEHCYECKVHYRDPKSKDLIMYLHAWKYKGPGWEYETELPNWARNDWDHLDSA*

Are there any other methods that are competitive with TransDecoder in terms of performance?

Thanks,
Dr Jon Lees
From: Brian H. <bh...@br...> - 2014-07-29 13:13:16
If you use the -S parameter, then only the top strand will be examined. If you know ahead of time that the transcript sequence is reverse-complemented, then you should reverse-complement the sequence before providing it to transdecoder with the -S option.

best,
~brian

On Tue, Jul 29, 2014 at 2:33 AM, 130...@pk... <130...@pk...> wrote:
> So, if a transcript is labeled "-" and we know the sense strand is negative, then Transdecoder will find the CDS in "-" strand or "+" strand with the option "-S" ?
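The reverse-complement step mentioned above can be sketched in a few lines. This is an illustration only, not a TransDecoder utility; `reverse_complement` and `orient` are hypothetical names, and the strand labels "+" and "-" are assumed to come from whatever annotation accompanies the transcripts.

```python
# Complement table for DNA (and U from RNA input); N stays N.
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient(seq, strand):
    """Return seq in sense orientation given its strand label ('+' or '-')."""
    if strand == "+":
        return seq
    if strand == "-":
        return reverse_complement(seq)
    raise ValueError(f"unknown strand label: {strand!r}")


print(orient("ATGC", "-"))  # GCAT
```

Sequences oriented this way would all read in the sense direction, which is the situation the -S option expects according to the reply above.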
From: Trakhtenberg, F. <Eph...@ch...> - 2014-07-27 18:25:44
Hi Brian,

Thank you again for the feedback. If Trinity can reconstruct transcripts with polyA, I may need to learn how to use it. On Wednesday I posted a question on the Tophat/Cufflinks board regarding this issue, but so far there have been no responses: https://groups.google.com/forum/#!topic/tuxedo-tools-users/Ntv98xmHT7c. PASA sounds like an excellent resource for this as well, thank you.

Best regards,
Ephraim
From: Brian H. <bh...@br...> - 2014-07-24 11:50:52
Hi,

The top strand is the sense strand in the case that you have all the input transcripts already oriented in their transcribed orientation (i.e., from using strand-specific RNA-Seq assembly).

We don't have utilities included to select the single 'best' ORF per transcript. All likely coding regions are reported.

best,
~brian
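Since no utility for picking a single ORF per transcript is included, one possible post-processing step is to keep the longest reported peptide for each transcript. The sketch below assumes the header layout seen elsewhere in this list (last field of the form TRANSCRIPT:START-END(STRAND)); the function names are hypothetical, and "longest" is only a stand-in criterion, not necessarily the best ORF.

```python
def read_fasta(lines):
    """Yield (header_without_'>', sequence) pairs from an iterable of lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def transcript_id(header):
    """Extract the transcript ID from a last field like 'tx1:292-1173(+)'."""
    return header.split()[-1].rsplit(":", 1)[0]


def longest_orf_per_transcript(lines):
    """Map transcript ID -> (header, peptide) of its longest peptide."""
    best = {}
    for header, seq in read_fasta(lines):
        tid = transcript_id(header)
        length = len(seq.rstrip("*"))  # ignore a trailing stop symbol
        if tid not in best or length > len(best[tid][1].rstrip("*")):
            best[tid] = (header, seq)
    return best


# Toy records (made up for illustration).
toy = [
    ">tx1|m.1 tx1|g.1 type:complete len:5 gc:universal tx1:1-15(+)",
    "MKKK*",
    ">tx1|m.2 tx1|g.2 type:3prime_partial len:7 gc:universal tx1:40-20(-)",
    "MAAAAAA",
]
for tid, (header, seq) in longest_orf_per_transcript(toy).items():
    print(tid, header.split()[0], len(seq.rstrip("*")))
```

Ties are resolved in favor of the record that appears first in the file; a different tie-break, or a criterion based on coding score or homology, may be more appropriate for a given data set.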
From: <130...@pk...> - 2014-07-24 03:32:44
Hi, I am glad to write to you!
I want to use TransDecoder to find the ORFs, but I have two questions:
1. What does the "top strand" mean? "If the transcripts are oriented according to the sense strand, then include the -S flag to examine only the top strand."
2. If I only want to find the longest ORF in the DNA sequence, what should I do?

Thanks!

130...@pk...
From: Brian H. <bh...@br...> - 2014-07-23 01:20:33
It's definitely an interesting research question. We've noticed anecdotally that many of our Trinity-reconstructed transcripts appear to have a stretch of polyA at their 3' end (or in non-strand-specific RNA-Seq assembly, a poly T at 5' end corresponding to the rev-comp). It would be useful to know how many polyA's we can detect, from the reads and from our rna-seq assemblies.

Note, we have a polyA-detection system that we used in PASA (http://pasa.sf.net), and published a couple of papers on several years ago, based on EST and cDNA data, but I haven't revisited it with more modern data types.

It's certainly an area that deserves a lot more attention.

best,
~b
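The observation above (a polyA stretch at the 3' end, or a polyT stretch at the 5' end for reverse-complemented assemblies) can be checked with a simple run-length test. This is a rough sketch and not the PASA polyA-detection system; the function names are hypothetical and the `min_run=10` threshold is an arbitrary assumption that would need tuning on real data.

```python
def trailing_run(seq, base):
    """Length of the uninterrupted run of `base` at the 3' end of seq."""
    count = 0
    for ch in reversed(seq.upper()):
        if ch != base:
            break
        count += 1
    return count


def leading_run(seq, base):
    """Length of the uninterrupted run of `base` at the 5' end of seq."""
    count = 0
    for ch in seq.upper():
        if ch != base:
            break
        count += 1
    return count


def polya_signal(seq, min_run=10):
    """Classify a transcript by terminal homopolymer runs (threshold assumed)."""
    if trailing_run(seq, "A") >= min_run:
        return "polyA_3prime"
    if leading_run(seq, "T") >= min_run:
        return "polyT_5prime"
    return None


print(polya_signal("ACGT" + "A" * 12))   # polyA_3prime
print(polya_signal("T" * 11 + "ACGG"))   # polyT_5prime
```

A strict run like this ignores sequencing errors inside the tail, so it will undercount real polyA stretches that contain a mismatch.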
From: Trakhtenberg, F. <Eph...@ch...> - 2014-07-23 01:02:12
This may actually be interesting to investigate: what percentage of the transcripts with a TranscriptDecoder-predicted ORF have a polyA site predicted in an appropriate location, and how does this correlate with actual polyA addition? We have polyA-selected RNAseq and total RNAseq (after ribo depletion), which we could use for addressing this question. We are still trying to figure out whether we could also detect actual polyA sequences on 3' RNA fragments in the raw reads. I am also interested in polyA-lncRNAs, which have no ORF but are exported to the cytoplasm because of polyA. If these questions might be of interest to you too, or if this has been done already, please let us know.

Thank you,
Ephraim

________________________________
From: Brian Haas [bh...@br...]
Sent: Tuesday, July 22, 2014 1:17 PM
To: Trakhtenberg, Feliks
Cc: tra...@li...; rc...@li...
Subject: Re: [Transdecoder-users] Does TranscriptDecoder also predict polyA?

Thanks for the info!

I've been thinking about this some more, and I'm not sure how useful it will be in the context of transdecoder after all... This is because, although these tools can be quite useful for identifying or predicting sites of polyadenylation, not finding such a site might not be a robust indicator that you don't have a coding transcript, particularly if you find a nice long ORF with good coding potential. For now, I think we'll rely on coding metrics and homology data, and eventually we'll leverage a lincRNA and other ncRNA classification tools to facilitate transcriptome annotations.

best,
~brian
From: Trakhtenberg, F. <Eph...@ch...> - 2014-07-22 17:30:16
|
Hi Brian, CC'ed: Lingsheng Dong from our bioinformatics support team. The 3 links below seem to offer free software for mouse polyA prediction using transcripts sequences input, which could be taken directly from Cufflinks de novo transcripts output. I wonder if one of them could be a good fit for using along with the TranscriptDecoder. For example, the polyA site that occurs on 3' past the predicted stop codon would be valid, but if there is no polyA site, or more than 10 bp before the stop codon, such transcript would probably not produce a protein even if ORF is predicted. The 3' ends are cleaved off about 10 bp downstream of the polyA site prior to polyA addition, that is why I thought polyA site could not be more than 10 bp upstream of the stop codon. Please let me know what you think about this. http://exon.umdnj.edu/polya_svm_server/index.html http://mlkd.csd.auth.gr/PolyA/datasets.html http://genes.mit.edu/GENSCAN.html Thank you, Ephraim P.S. Links below seem to be either only for Human or predict polyA site directly from genome DNA rather than from spliced transcripts sequences: http://grail.lsd.ornl.gov/grailexp/ http://linux1.softberry.com/berry.phtml?topic=products http://www.imtech.res.in/raghava/polyapred/help.html http://cbrc.kaust.edu.sa/dps/ http://dnafsminer.bic.nus.edu.sg/PolyA.html http://cub.comsats.edu.pk/polyapredict.htm |
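The stop-codon rule proposed in this message (a polyA site is plausible only if it lies downstream of the predicted stop codon, or at most about 10 bp upstream of it) could be sketched roughly as below. This is only an illustration under stated assumptions, not part of TransDecoder and not how the linked predictors work: it searches for the two most common polyadenylation signal hexamers (AATAAA and ATTAAA) instead of running an SVM-based tool, and the function names, the 0-based `stop_start` coordinate, and the `max_upstream` parameter are all hypothetical.

```python
import re

# The two most common polyadenylation signal hexamers. Dedicated predictors
# use much richer models; a plain motif search is only a rough stand-in.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(transcript):
    """Return sorted 0-based start positions of candidate polyA signals."""
    seq = transcript.upper().replace("U", "T")
    positions = []
    for hexamer in POLYA_SIGNALS:
        # A lookahead pattern reports overlapping matches as well.
        for match in re.finditer("(?=%s)" % hexamer, seq):
            positions.append(match.start())
    return sorted(positions)


def orf_has_plausible_polya(transcript, stop_start, max_upstream=10):
    """Apply the rule from the message above.

    stop_start   -- hypothetical input: 0-based index of the first base of
                    the stop codon of the predicted ORF
    max_upstream -- how far upstream of the stop codon a signal may start
                    and still count (10 bp in the message)
    """
    earliest_allowed = stop_start - max_upstream
    return any(pos >= earliest_allowed
               for pos in find_polya_signals(transcript))
```

As Brian notes in his reply, failing such a check is not a robust indicator that a transcript is non-coding, so a filter like this would at most flag candidates for a closer look.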
From: Brian H. <bh...@br...> - 2014-07-22 17:17:25
|
Thanks for the info! I've been thinking about this some more, and I'm not sure how useful it will be in the context of transdecoder after all... This is because, although these tools can be quite useful for identifying or predicting sites of polyadenylation, not finding such a site might not be a robust indicator that you don't have a coding transcript, particularly if you find a nice long ORF with good coding potential. For now, I think we'll rely on coding metrics and homology data, and eventually we'll leverage lincRNA and other ncRNA classification tools to facilitate transcriptome annotations. best, ~brian -- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas |
From: Brian H. <bh...@br...> - 2014-07-19 22:33:28
|
I actually haven't researched that in a while. Please let me know if you find something really useful. best, ~brian -- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas |
From: Trakhtenberg, F. <Eph...@ch...> - 2014-07-19 22:00:14
|
Thank you for the clarification. Any recommendations on which tool I could use for predicting polyA? |
From: Brian H. <bh...@br...> - 2014-07-19 21:54:04
|
Hi, TransDecoder does not take into account whether a transcript is polyadenylated or not. It strictly looks at open reading frames and coding potential. best, ~brian -- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas |
From: Trakhtenberg, F. <Eph...@ch...> - 2014-07-19 19:00:16
|
Hello, I am new to TranscriptDecoder. Does it consider whether a novel transcript with a predicted ORF is also predicted to be polyadenylated? I assumed if no polyA can be predicted, then even if ORF is predicted, such transcript may not produce a protein. My input is a group of novel transcripts predicted by Cufflinks analysis. Would appreciate feedback. thank you |
From: Brian H. <bh...@br...> - 2014-07-16 15:18:08
|
Hi Adam, We don't have that functionality currently built in to transdecoder. With bioperl, biopython, or other, you could probably script something out to do it fairly easily, though. We'll consider adding such functionality in a future release. best, ~brian -- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas |
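A minimal sketch of the kind of script suggested above, keeping only the longest ORF per contig, might look like the following. It uses plain Python rather than Biopython so that it has no dependencies. The key assumption is the accession format: it treats everything before the first `|` in each FASTA header (for example `contig1|m.1`) as the contig name, which may not match every TransDecoder release, so check the headers in your own output first. The function names are made up for this example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def transcript_id(header):
    """Contig name for an ORF record.

    ASSUMPTION: the accession is the first whitespace-delimited token and
    looks like "<contig>|m.<number>"; adjust if your headers differ.
    """
    accession = (header.split() or [""])[0]
    return accession.split("|")[0]


def longest_orf_per_transcript(lines):
    """Map each contig name to the (header, sequence) of its longest ORF.

    Ties keep the record that appears first in the input.
    """
    best = {}
    for header, seq in read_fasta(lines):
        key = transcript_id(header)
        if key not in best or len(seq) > len(best[key][1]):
            best[key] = (header, seq)
    return best
```

To use it, open the `.pep` (or `.cds`) file, pass the file handle to `longest_orf_per_transcript`, and write each returned header and sequence back out as FASTA.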
From: Ciezarek, A. <a.c...@im...> - 2014-07-16 14:27:12
|
Dear sir or madam, I am using transdecoder to find the longest ORF within contigs, generated from RNA-seq data. I am only interested in analysing the longest ORF from each contig, but Transdecoder returns multiple for many contigs. Is there any way I can set it to just return one, or quickly remove all the shortest sequences? Thanks, Adam Ciezarek |
From: Brian H. <bh...@br...> - 2014-07-04 14:01:51
|
Hi Martin, responses below

On Fri, Jul 4, 2014 at 9:50 AM, Martin MOKREJŠ <mmo...@gm...> wrote:

> Sure, take your time. Also, the packaging is an issue due to many different LICENSEs used in all the bundled tools. Just split it all into sub-packages, that's the only way.

Right... that's definitely a cause for concern, which is why we keep the third-party code isolated wherever possible, and should have a note in there somewhere to follow the different licenses for the plugins. We only leverage code that has very lenient licensing, as we do for our code.

> So where does trinityrnaseq_r20140413p1/trinity-plugins/jellyfish-1.1.11 come from? It doesn't seem to be from http://www.genome.umd.edu/jellyfish.html or ftp://ftp.genome.umd.edu/pub/jellyfish/jellyfish-2.1.3.tar.gz

It should have come from the jellyfish website. In the upcoming Trinity release, we're just going to bundle in the .tar.gz files for the code, and have the makefile do the unpack/build as part of the Trinity build. This will go for RSEM-2.15 as well. One of the key issues here is that there are some versions of the tools where the usage has changed significantly (both the latest rsem and jellyfish), so that our new version of Trinity will only be readily compatible with those specific versions - which is why they get bundled in, and Trinity will look in the plugins area to find what it needs. After we get the next Trinity, Trinotate, and PASA releases out, we'll rethink our bundling strategy.

best, ~b

-- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas |
From: Martin M. <mmo...@gm...> - 2014-07-04 13:51:23
|
Sure, take your time. Also, the packaging is an issue due to many different LICENSEs used in all the bundled tools. Just split it all into sub-packages, that's the only way. So where does trinityrnaseq_r20140413p1/trinity-plugins/jellyfish-1.1.11 come from? It doesn't seem to be from http://www.genome.umd.edu/jellyfish.html or ftp://ftp.genome.umd.edu/pub/jellyfish/jellyfish-2.1.3.tar.gz Martin |
From: Brian H. <bh...@br...> - 2014-07-04 13:00:51
|
Thanks, Martin. I've CC'd the trinity-developers list. We'll take all your comments into consideration. It'll be some time before it all meets your expectations (if that's even an option for us). For now, our (perhaps my) goals have been quite simple: keep everything self-contained and minimize dependencies. I entirely agree about that evil wget (which is one of the reasons why I put the 'make simple' in there). This version will be incorporated as a plugin in the upcoming Trinity, along with Jellyfish-2 (even if currently to your dismay), and Trinity will do the 'make simple' to avoid pulling down pfam via wget. cheers, ~brian -- -- Brian J. Haas The Broad Institute http://broad.mit.edu/~bhaas |
From: Martin M. <mmo...@gm...> - 2014-07-04 12:47:15
|
Brian Haas wrote:
> Greetings all,
>
> The latest release of TransDecoder is now available:
>
> http://sourceforge.net/projects/transdecoder/files/TransDecoder_r20140704.tar.gz/download
>
> including minor changes from the previous release to ensure better compatibility with other projects, including Trinity, PASA, and Trinotate
>
> Release notes:
>
> -added 'make simple' to build just the essential components involving parafly and cdhit
>
> -removed the 'cds.' prefix from the pep and cds sequence accessions.

Hi Brian,
I just tested the new release and have some comments:

1. In the past the files were tar.bz2 instead of tar.gz as they are now. It helps distro maintainers if the URLs and filenames remain stable. It is also a common convention that unpacking MyApp-2.4c.tar.gz extracts into a MyApp-2.4c/ subdirectory. Although it seems today's archive file complies with this, I think trinity does not, and I am not sure how long it will last. ;)

2. It is evil that the "make compile" step runs wget to download the 1.4GB PFAM file. Please put it under a different "target" in your Makefiles. I already have the files on my system, and I certainly do not want to waste my bandwidth.

3. I would like to add this to Gentoo Linux, but that won't ever be allowed if the package is a huge glue of other tools. For example, I already have cd-hit installed, and installing TransDecoder would try to overwrite existing files and will be denied. If you would like to get the package accepted into Linux distros and save developers the time of resolving the knotted layout, please introduce some configure- or Makefile-based checks for the external tools and bail out if they are not installed. You can keep the crazy layout/setup as an alternative for users who think this is the right way to go (while it is not).

4. I wanted to post to the trinityrnaseq-users list about this but ... it is confusing that trinity and transdecoder place overlapping 3rd-party stuff under their own source trees. The http://trinityrnaseq.sourceforge.net/#installation page is totally quiet about all these hidden obstacles. I recommend that you put together a simple listing of required/optional tools, their versions and URLs. If possible, drop them from TransDecoder_r20140704/3rd_party and also from the plugins subdirectory somewhere under trinity*.

5. In the current setup, both transdecoder and trinity are unmanageable for a Linux distro. One cannot enforce version dependencies, the tools download what they want on their own instead of just running a compiler ... and they overwrite other applications' files.

6. BTW, I realized the quorum package looks for jellyfish-1.11 while on the web I found only jellyfish-2.x. Incidentally, I see jellyfish-1.11 under trinity*. Huh. Would you please tell me: does trinity use an "old" version of the same jellyfish package, or is it only incidentally the same name? Why can't trinity use the jellyfish-2.x already installed on the system?

I hope this helps you and other devs clean up this interesting package, though I have not managed to install it yet.

Thank you,
Martin |
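The "check and bail out" idea in point 3 could look roughly like the sketch below: instead of building bundled copies, look for tools already on the system and stop with a clear message if any are missing. This is only an illustration in Python, not something TransDecoder or Trinity ships; the executable names in `REQUIRED_TOOLS` are assumptions chosen for the example, and the function names are hypothetical.

```python
import shutil

# ASSUMPTION: illustrative executable names only; replace them with whatever
# the build really needs on your system.
REQUIRED_TOOLS = ("cd-hit-est", "ParaFly")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` can be swapped out for testing; by default it is the standard
    library's shutil.which, which returns None for a missing executable.
    """
    return [name for name in tools if which(name) is None]


def check_or_bail(tools=REQUIRED_TOOLS, which=shutil.which):
    """Exit with an error message if any required tool is missing."""
    missing = missing_tools(tools, which)
    if missing:
        raise SystemExit(
            "Missing required tools on PATH: " + ", ".join(missing)
        )
```

A build wrapper could call `check_or_bail()` before compiling anything, so that a system-wide cd-hit is reused and never overwritten.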