I'm trying to use assign-confidence to compute FDRs from my own scoring function, and I've made a tab-delimited file with a list of PSMs for assign-confidence to score. An example is attached. When I run assign-confidence as follows:
/net/noble/vol2/home/kfattila/proj/crux-developer-rellin/src/crux_release assign-confidence --overwrite T --peptide-level T --score "exact p-value" --smaller-is-better F --decoy-prefix D_ test_input.txt
I get the following error:
INFO: Beginning assign-confidence.
INFO: Writing results to output directory 'crux-output'.
INFO: CPU: ornithine.gs.washington.edu
INFO: Tue Jul 7 15:31:44 PDT 2015
WARNING: empty protein id string in tab delimited file. searching database to find proteins to match peptide sequence
FATAL: failed to create a database_peptide_iterator,no proteins in database
I talked to Attila about this, and he says that it's a problem with the parsing of the file and not particular to assign-confidence. He mentioned that you wrote the parsing code. Do you know what's going on? Thanks a lot Sean!
UPDATE 07/14/15:
The input was faulty in that the column name was "protein ID" instead of "protein id". Fixing this resolves this particular bug. The program should be changed to check that the column "protein id" exists and to report a descriptive error if it does not.
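To illustrate the kind of check I have in mind, here is a minimal sketch in Python. It is not Crux's actual parsing code (which is C++); the function name `check_required_columns`, the `REQUIRED_COLUMNS` tuple, and the other column names in the example header are hypothetical. It assumes column names are matched case-sensitively, as this ticket suggests.

```python
import csv
import io

# Assumption: "protein id" is the only required column checked here.
REQUIRED_COLUMNS = ("protein id",)


def check_required_columns(handle, required=REQUIRED_COLUMNS):
    """Read the header row of a tab-delimited file and raise a
    descriptive error if a required column is missing."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        raise ValueError("input file is empty; expected a tab-delimited header row")

    present = [name.strip() for name in header]
    for column in required:
        if column in present:
            continue
        # Look for a near miss that differs only by letter case.
        near = [name for name in present if name.lower() == column.lower()]
        hint = ""
        if near:
            hint = f" (found '{near[0]}'; column names are case-sensitive)"
        raise ValueError(f"required column '{column}' not found in header{hint}")
    return present


# Example with a hypothetical header that reproduces the mistake in this ticket.
bad_header = io.StringIO("scan\tsequence\tprotein ID\texact p-value\n")
try:
    check_required_columns(bad_header)
except ValueError as err:
    print(f"FATAL: {err}")
```

With a check like this, the user would see which column is missing and that a differently cased column was found, instead of the misleading "no proteins in database" failure.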
I resolved this issue! The problem was that my input file contained a column name "protein ID" when it should actually be "protein id".
I am going to close this ticket and then open a new one that addresses a more general problem, of which this represents a special case.