Celera Assembler 6.1 does not preserve the read UID found in the fastq file of Illumina reads. That is an understandable design choice, since the read UID can be very long. However, the current behavior makes it nearly impossible to compare fastq and asm files. Suppose we'd like to compare the clear range of a trimmed read (AFG message in the ASM file) to the original read in the fastq file. This is hard or impossible because CA assigns an arbitrary UID to each read. Furthermore, it assigns the same UID to both reads of a pair. (Aside: for unpaired Illumina reads, are there DST messages to link each read to its library?)
Here are some suggestions. Generate a file mapping UID to IID during gatekeeper and copy it to the 9-terminator directory. Alternatively, preserve the Illumina read ID in the gkpStore, even though it is long. Alternatively, extract the variable portion of the Illumina read ID and use that; don't store the portion of the read name that encodes the run ID.
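To make the last suggestion concrete, here is a rough sketch in Python. It is not Celera Assembler code, and the function names are hypothetical. It assumes 4-line fastq records and colon-delimited Illumina read names, and it treats whatever leading fields are identical across all reads as the run-encoding portion to drop. The read names in the usage note are made up for illustration.

```python
import os


def read_names(fastq_lines):
    """Yield the read name (header minus '@' and any trailing comment)
    from each record. Assumes exactly 4 lines per record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 0:
            yield line.rstrip("\n")[1:].split()[0]


def shared_prefix(names):
    """Longest prefix common to all names, cut back to a ':' boundary
    so that a field is never split in the middle."""
    prefix = os.path.commonprefix(names)
    return prefix[: prefix.rfind(":") + 1]


def short_names(names):
    """Map each full read name to its variable portion."""
    cut = len(shared_prefix(names))
    return {name: name[cut:] for name in names}
```

For example, given the names HWUSI-EAS100R:6:73:941:1973#0/1 and HWUSI-EAS100R:6:74:5:1200#0/1, the shared prefix is HWUSI-EAS100R:6: and the short names are 73:941:1973#0/1 and 74:5:1200#0/1. The /1 and /2 suffixes survive, so the two reads of a pair stay distinguishable. Writing the full-name-to-short-name table to a file alongside the assembly output would also cover the first suggestion, provided the short name is what ends up stored as the UID.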