--train <string> file format question

Extracting likely coding regions from transcript sequences

Brought to you by: bhaas

Forum: General Discussion

Created: 2014-07-23

Updated: 2014-07-23

David - 2014-07-23

Hi,

I'd like to provide my own training file to TransDecoder using the --train option. At the top of the file it says the input should be in FASTA format, but I still have a couple of questions:

For each FASTA sequence in the training file, can I include the UTR regions of the transcript, or does it have to be just the CDS?

Do I need to input any kind of coordinates, say for example in the sequence description? (e.g. >seq_id left_pos right_pos)

Thanks in advance
David