[Transdecoder-users] missing_CDS/peps in TransDecoder output
Extracting likely coding regions from transcript sequences
From: Irantzu A. <ira...@gm...> - 2014-12-16 11:12:52
Hi all,

I've recently started using TransDecoder and have a general question; sorry if it is very basic.

In the final TransDecoder output, the .pep file has a total of 200 peptides, while my initial transcript.fasta file has 375 transcripts. I assume this is because the remaining 175 transcripts have no ORFs, or they have ORFs that do not reach the minimum open reading frame length.

*Questions:*

*1)* What is the minimum ORF length?

*2)* Is there any other reason why a transcript would get no CDS/peptide? Some of the transcripts in the transcripts.fasta file are short, 100-200 bp; could this be affecting the result?

*3)* On the other hand, to produce the transcripts.fasta file I used gffread with a GTF file. Is this OK? I ask because the TransDecoder webpage says "Starting from a genome-based transcript structure GTF file (eg. cufflinks)" and then describes converting the GTF to GFF3, etc. But the transcript FASTA sequences should be the same whether they are derived from GTF or GFF3, shouldn't they?

Thanks in advance,
Irantzu

--
*Irantzu Anzar*
M.Sc. in Bioinformatics, Autonomous University of Barcelona, Spain
B.Sc. in Biotechnology, University of León, Spain
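Regarding question 2, one rough way to see whether short transcripts explain the missing peptides is to measure the longest ORF in each transcript and compare it with the length cutoff. The sketch below is only an illustration and not TransDecoder's own algorithm: it counts complete ATG-to-stop ORFs in all six frames, whereas TransDecoder may also report partial ORFs. The function names and the `min_aa` parameter are hypothetical; set `min_aa` to whatever cutoff your TransDecoder version documents.

```python
from typing import Iterable, Iterator, List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_codons(seq: str) -> int:
    """Length in codons (stop excluded) of the longest complete
    ATG..stop ORF over all six reading frames."""
    best = 0
    forward = seq.upper()
    for strand in (forward, reverse_complement(forward)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    best = max(best, (i - start) // 3)
                    start = None
    return best


def below_cutoff(records: Iterable[Tuple[str, str]], min_aa: int) -> List[str]:
    """Names of transcripts whose longest complete ORF is shorter than min_aa.
    min_aa is a hypothetical parameter, not a value taken from TransDecoder."""
    return [name for name, seq in records if longest_orf_codons(seq) < min_aa]


# Usage (assumed file name):
# with open("transcripts.fasta") as handle:
#     print(below_cutoff(read_fasta(handle), min_aa=50))
```

As a sanity check on the arithmetic: a 200 bp transcript holds at most 66 complete codons, so it cannot yield a peptide longer than that no matter which frame is read.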