Jimmy, we use Comet in a clinical proteomics pipeline at Mayo. We need deterministic results to meet the quality standards that patient care requires. We would be happy to submit a pull request for this if you are too busy. Can you provide some hints on the affected variables / threads?
update to GA4 tag
fix link to percolator.ms
addressed in branch: https://sourceforge.net/p/comet-ms/code/HEAD/tree/branches/release_2020010_cyclic/
support searching cyclic peptides
Search light/heavy SILAC as mixed spectrum
addressed in branch: https://sourceforge.net/p/comet-ms/code/HEAD/tree/branches/release_2019015_silacpair/
allow specifying two sequence databases for a search
addressed with GH commit 75d837
Annotate decoy sequences in Percolator pin output when not using Comet's internal decoys
clip N-term methionine and PEFF
addressed with git commit 450b6a
Add mzIdentML output support
Initial mzIdentML support in release 2020.01.0 released 2020/11/09
allow specifying two sequence databases for a search
update website content
propagate r1620 fix to SILAC pair branch so that paired peaks can be scored even if base fragment ion peak is not matched.
simple change to the version string for this variant of the tool
fix to allow add_fragment_masses to match when base fragment ion is not present
add_fragment_masses should now be functional with these updates
Initial code changes for applying user defined fragment mass additions/offsets via the "add_fragment_masses" parameter. Extends the 4th field of uiBinnedIonMasses to allow up to 9 fragment mass additions. Not supported for indexed searches.
add image
delete unused branch
create iqPIR branch from silacpair code
delete; wrong location
website files
create iqPIR branch from silacpair code
Add in code to calculate xcorr & E-values for base peptide fragments and paired only peptide fragments.
Grab the SILAC pair mass difference from the first variable mod entry with a lysine residue. This replaces the hardcoded 8.014199.
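The lookup described in the entry above can be sketched roughly as follows. This is a minimal illustration, not Comet's actual code: the VarMod struct, its field names, and the GetSilacPairMass() function are hypothetical, and treating a 0.0 mass as an unused entry is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one variable mod entry; not Comet's real struct.
struct VarMod
{
   double      dMassDiff;   // mass difference of the modification
   std::string sResidues;   // residues the modification applies to
};

// Return true and set dPairMass from the first variable mod entry that
// lists a lysine residue. Returns false if no such entry exists.
// Assumption: an entry with a 0.0 mass is unused and is skipped.
bool GetSilacPairMass(const std::vector<VarMod> &vMods, double &dPairMass)
{
   for (const VarMod &mod : vMods)
   {
      if (mod.dMassDiff != 0.0 && mod.sResidues.find('K') != std::string::npos)
      {
         dPairMass = mod.dMassDiff;
         return true;
      }
   }
   return false;
}
```

With entries for M, K and R in that order, the K entry's mass is returned, so the pair mass follows the params file rather than a constant compiled into the code.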
fix the total ions calculation
add lnExpectPair column to Percolator pin output
track and store xcorr and E-value for contributions from just the paired fragments
little mstoolkit readme update
use the correct variable name in total ion count calculation
mzIdentML: change "DBSequence_ref" to "dBSequence_ref" thx to note by A. Collins.
Add "cyclic_NL_ms_level" parameter to control applying cyclic neutral loss to MS2, MS3, or both. Add support to consider/analyze higher charged fragments. Calculate total ions correctly for linear vs. cyclic peptides. Still need to address Sp score, matched ions, and E-value calculation.
Correct the # of reported total ions. Previously the number was calculated based on fragment charge states through (precursor_charge - 1) but this did not take into account the max_fragment_charge parameter entry whose value is often less than (precursor_charge - 1).
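A rough sketch of the correction described above: the highest fragment charge considered is the smaller of max_fragment_charge and (precursor_charge - 1). The function names, the floor of one charge state, and the (length - 1) fragments per ion series are illustrative assumptions, not Comet's actual implementation.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: highest fragment charge state that is scored.
// Assumption: at least fragment charge 1 is always considered.
int MaxFragmentChargeUsed(int iPrecursorCharge, int iMaxFragmentCharge)
{
   assert(iPrecursorCharge >= 1 && iMaxFragmentCharge >= 1);
   return std::max(1, std::min(iMaxFragmentCharge, iPrecursorCharge - 1));
}

// Hypothetical helper: total theoretical fragment ions for a linear peptide,
// assuming (length - 1) fragments per ion series per fragment charge state.
int TotalIons(int iPeptideLength, int iNumIonSeries,
              int iPrecursorCharge, int iMaxFragmentCharge)
{
   assert(iPeptideLength >= 2 && iNumIonSeries >= 1);
   return (iPeptideLength - 1) * iNumIonSeries
        * MaxFragmentChargeUsed(iPrecursorCharge, iMaxFragmentCharge);
}
```

Under these assumptions, a 10-residue peptide with two ion series, precursor charge 4 and a max fragment charge of 2 gives 9 * 2 * 2 = 36 total ions, where counting through (precursor_charge - 1) = 3 would have given 54.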
Issue is current implementation checks szProteinSeq (which has start methionine clipped off) against the original dbe->strSeq within StorePeptide(). With a PEFF, szProteinSeq will now also contain a substitution so the strcmp() fails. Need to do a strcmp after substituting back in the original sequence unless I implement a more direct way around this round-about check like simply setting a new variable.
merge from branches/release_2021010 encapsulating r1586 thru r1598
Issue is current implementation checks szProteinSeq (which has start methionine clipped off) against the original dbe->strSeq. With a PEFF, szProteinSeq will now also contain a substitution so the strcmp() fails. Need to do a strcmp after substituting back in the original sequence unless I implement a more direct way around this round-about check like simply setting a new variable.
tagging release 2021.01 rev. 0
delete tagged release to fix sequence pre-load error
remove errant comment on line limiting pre-loading to 500 sequences
tagging release 2021.01 rev. 0
delete 2021.01.0 tagged release
add a cast to allow VS2019 win32 compile thx to fix by D. Shteynberg
tagging release 2021.01 rev. 0
cleanup unused declarations in header file
bandaid workaround until there's a fix for precursor NL peaks bug where _uiBinnedPrecursorNL values are < 0
remove print line
Add in safety check and error message in XcorrScore() prompted by large negative "bin" value when "precursor_NL_ions = 1" is set.
In XcorrScore, the precursorNL "bin" has intermittently been observed to be a large negative value; haven't tracked down why though.
add no digestion entry to comet.params
update the README to document using v142 build tools in VS2019
update CometUI.pfx temporary key
since no parameters change, allow the use of 2020.01 params files also
create 2021.01.0 branch
Not sure if I can replicate this anymore.
add preliminary 2021.01.0 release docs; nothing updated yet
Extend protein sorting function ProteinEntryCmp() to consider peptide start position in order to get the first set of flanking residues when a peptide is present in a protein multiple times.
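The ordering described above could look something like the sketch below. The ProteinEntry struct, its field names, and the comparator are hypothetical stand-ins rather than the real ProteinEntryCmp(); the only point illustrated is the tie-break on peptide start position.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a stored protein match; field names are assumptions.
struct ProteinEntry
{
   long lWhichProtein;   // identifies the protein (e.g. a file offset)
   int  iStartResidue;   // start position of the peptide within that protein
};

// Order by protein first; for the same protein, order by peptide start
// position so the earliest occurrence sorts first.
bool ProteinEntryLess(const ProteinEntry &a, const ProteinEntry &b)
{
   if (a.lWhichProtein != b.lWhichProtein)
      return a.lWhichProtein < b.lWhichProtein;
   return a.iStartResidue < b.iStartResidue;
}

void SortProteinEntries(std::vector<ProteinEntry> &vEntries)
{
   std::sort(vEntries.begin(), vEntries.end(), ProteinEntryLess);
}
```

After sorting, the first entry for a given protein is its earliest occurrence of the peptide, so the flanking residues can be taken from that entry.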
In CalculateSP(), evaluate code below. double dNewMass = dBion + g_staticParams.massUtility.pdAAMassFragment[(int)pOutput[i].szPeptide[iLenMinus1]] + (pOutput[i].piVarModSites[iLenMinus1] != 0 ? pOutput[i].pdVarModSites[iLenMinus1] : 0.0) + g_staticParams.staticModifications.dAddCterminusProtein + Oxygen_Mono + Hydrogen_Mono + Hydrogen_Mono;
investigate ThreadPool
David implemented new ThreadPool beginning with r1541
David implemented new ThreadPool.
add "old_mods_encoding" parameter entry to write modified peptide strings using the old mod character encodings for .sqt output
update version string; use wait_on_threads() instead of sleep() in CometSearch()
merge from branches/20210524-ThreadPoolUpdate
usleep() is deprecated; use nanosleep()
remove CheckForUpdates from Makefile; use portable sleep
given new ThreadPool implementation, add logic to not load all protein sequences at once to better match previous implementation behavior
remove unused variable
minor change of performance debugging reporting
Have CometSearchManager allocate the ThreadPool only once per execution of comet...
Remove thread pool from DoSingleSpectrumSearch() which entails overloading RunSearch().
Update to Windows line terminations.
Code clean up; remove unused min/max threads arguments from RunSearch(), PostAnalysis(), etc.
Revert for loop range for E-value calculation from r1555 check in.
fix formatting to conform to project coding style
Skip thread pool initialization for DoSingleSpectrumSearch(). Upgrade projects to VS2019 v142.
fix logic for y-ion silac fragment pairs only
add option to use y-ion silac fragment pairs only by setting "silac_pair_fragments = 2"
The spectrum batch loading was exiting prematurely because it compared the total scans analyzed against the spectrum batch size; removing this check restores proper behavior.
revert the for-loop bounds change
remove histograms from text output and fix for-loop bounds in E-value calculation
remove return value for function of type void
Search progress percentage should be updated when the thread finishes, so the percentage completed is meaningful ...
New ThreadPool uses the exact number of threads; the main thread also runs the jobs. Tested with 1 to maximum threads; appears fractionally faster than trunk code and will finish even on giant mzML files like those that come out of timsTOF DISCo output ...
replace usleep() and sched_yield() with standard counterparts
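For reference, the C++11 standard library counterparts are std::this_thread::sleep_for() and std::this_thread::yield(). The wrapper names below are hypothetical and only illustrate the substitution; they are not taken from the Comet source.

```cpp
#include <chrono>
#include <thread>

// Hypothetical wrapper: portable replacement for usleep(lMicroseconds).
inline void PortableSleepMicroseconds(long lMicroseconds)
{
   std::this_thread::sleep_for(std::chrono::microseconds(lMicroseconds));
}

// Hypothetical wrapper: portable replacement for sched_yield().
inline void PortableYield()
{
   std::this_thread::yield();
}
```

Both calls need a C++11 compiler (for example "-std=c++11" with g++) and build on Linux and Windows without platform-specific headers.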
add a wait to the spectra loading to limit the number of spectra loaded to a number near the spectrum_batch_size as with the previous implementation
Makefile fix
add "-std=c++11" to Makefiles to allow linux compile
address the new E-value calculation inconsistency across replicate runs
remove -g compile option from MSToolkit's Makefile
revert trunk to r1543 prior to ThreadPool changes
create branch for David's ThreadPool update