I am running pacBioToCA to correct reads using an Illumina HiSeq dataset.
Unfortunately, I have run into some trouble. The gatekeeper error log ("asm.gkpStore.err") contains the following error messages:
Processing SINGLE-ENDED SANGER QV encoding reads from:
GKP finished with 578612 alerts or errors:
540303 # ILL Error: not a sequence start line.
38309 # ILL Error: not a quality start line.
The only information I can find about this error is a fairly recent post on SeqAnswers: http://seqanswers.com/forums/showthread.php?t=24916
Here the original poster states that "The ILL errors are thrown because of read lengths above 2047 bps. A bug that's supposed to be fixed since wgs-7.0."
I (re)installed the latest stable Celera Assembler release on our computer cluster, but this did not solve the problem. Gatekeeper does not crash, however, and the assembly continues once gatekeeper is done. Still, I am concerned about this error. Does it imply that any (PacBio) read longer than ~2 kb will not be loaded? How can I solve this issue?
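To narrow this down, a quick check along the following lines could show whether the input FASTQ is itself malformed or simply contains long reads. This is only a rough sketch, not anything taken from gatekeeper: the function name `check_fastq` is made up, it assumes a strict four-lines-per-record FASTQ, it assumes the two messages refer to a header line not starting with "@" and a separator line not starting with "+", and the 2047 bp threshold is just the figure quoted in the SeqAnswers post.

```python
from collections import Counter

# Threshold quoted in the SeqAnswers post; treated here as an assumption.
MAX_LEN = 2047


def check_fastq(handle, max_len=MAX_LEN):
    """Count structural problems in a strict 4-line-per-record FASTQ stream."""
    counts = Counter()
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\r\n")

        counts["records"] += 1
        if not header.startswith("@"):
            counts["bad_sequence_start"] += 1
        if not plus.startswith("+"):
            counts["bad_quality_start"] += 1
        if len(seq) != len(qual):
            counts["length_mismatch"] += 1
        if len(seq) > max_len:
            counts["longer_than_max"] += 1
    return counts
```

It could be run as `with open("reads.fastq") as fh: print(check_fastq(fh))`, where `reads.fastq` is a placeholder file name. If `longer_than_max` is non-zero while the two "bad start" counts are zero, that would at least be consistent with the read-length explanation. Non-zero "bad start" counts would instead point to a formatting problem in the file, for example sequences wrapped over several lines.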
Any help is greatly appreciated.