[Transdecoder-users] File for Training
From: 卢 汉斌 <lh...@gm...> - 2014-05-02 09:37:45
Hello,

I am trying to find coding regions within transcripts using TransDecoder. I want to use a closely related species (with a relatively detailed genome annotation) to train the Markov model for protein identification. I don't quite understand what kind of file I should pass to the "--train" option: the annotated protein FASTA file or the annotated CDS FASTA file of the related species?

Also, the annotated proteins of this related species still include a portion of low-confidence genes. Should I filter out those low-confidence genes, or pick only some high-confidence ORFs for training the Markov model?

Thank you for your advice.

Best,
David