#64 Propagation of internal error codes

1.0
open
None
2023-11-15
2023-11-15
No

Hi. As you may have inferred from ticket #63, I am trying to build a FASTQ QC pipeline that avoids spending additional compute on FASTQs that are bound to fail for structural reasons. One edge case I aim to cover is a gzipped FASTQ that was incompletely downloaded. Here is the snippet:

$ ./bbmap/reformat.sh in=unit_test_fq/reads-corrupted_R1.fastq.gz in2=unit_test_fq/reads-corrupted_R2.fastq.gz 
java -ea -Xmx300m -Xms300m -cp /home/max/fastq_qc_redux/bbmap/current/ jgi.ReformatReads in=unit_test_fq/reads-corrupted_R1.fastq.gz in2=unit_test_fq/reads-corrupted_R2.fastq.gz
Executing jgi.ReformatReads [in=unit_test_fq/reads-corrupted_R1.fastq.gz, in2=unit_test_fq/reads-corrupted_R2.fastq.gz]

No output stream specified.  To write to stdout, please specify 'out=stdout.fq' or similar.
Set INTERLEAVED to false
[E::bgzf_read] Read block operation failed with error 4 after 5 of 65280 bytes
Error 5 in block starting at offset 33(21)
Input is being processed as paired
[E::bgzf_read] Read block operation failed with error 4 after 5 of 65280 bytes
Error 5 in block starting at offset 33(21)
Input:                          0 reads                 0 bases
Output:                         0 reads (NaN%)  0 bases (NaN%)

Time:                           0.721 seconds.
Reads Processed:           0    0.00k reads/sec
Bases Processed:           0    0.00m bases/sec

As you can see, the [E::bgzf_read] error is printed. The problem is that the exit code still reports success:

$ echo $?
0

I can redirect this text to stdout and use grep, awk, etc. to handle such errors, but my question is: how nontrivial a lift would it be to propagate error codes from what feel like HTSlib-esque calls underneath?
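
For reference, the grep-based workaround could look roughly like the sketch below. It assumes the [E::...] messages arrive on stderr and that any line containing the "[E::" tag should count as a failure. The helper name run_checked is made up for illustration and is not part of BBTools.

```shell
# run_checked: hypothetical helper, not part of BBTools.
# Runs the given command, replays its stderr afterwards, and returns
# non-zero if the command failed OR exited 0 while printing an
# htslib-style "[E::" error tag on stderr (assumed failure marker).
run_checked() {
    local log status
    log=$(mktemp) || return 2

    # Capture stderr to a temporary file; stdout passes through untouched.
    "$@" 2>"$log"
    status=$?

    # Replay the captured stderr so nothing is hidden from the caller.
    cat "$log" >&2

    # Promote a "successful" run to a failure if an error tag was printed.
    if [ "$status" -eq 0 ] && grep -qF '[E::' "$log"; then
        status=1
    fi

    rm -f "$log"
    return "$status"
}
```

It would be called as run_checked ./bbmap/reformat.sh in=... in2=... followed by a check of $?. Because stderr is buffered in a file and replayed after the command finishes, messages appear later than they would in a direct run.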

Forgive me again. I tried to use ChatGPT to discern what was going on in the source, but I'm not fluent in Java.

As always, thank you for your time and help.

Discussion

  • Max Rozenblum

    Max Rozenblum - 2023-11-15

    Another alternative for me might be to run gzip -dct ${FASTQ_PATH} on each gzipped FASTQ I'd like to analyze. This will catch gzip corruption. However, it might be useful to propagate errors like these directly through the BBMap suite.
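
A minimal sketch of that pre-check, assuming plain gzip -t (the integrity-test flag) is sufficient for these files. The helper name check_gz is hypothetical.

```shell
# check_gz: hypothetical helper that tests each gzipped file before
# any further QC is run. Returns 0 only if every file passes gzip -t.
check_gz() {
    local failed fq
    failed=0
    for fq in "$@"; do
        # gzip -t decompresses in memory and checks integrity without
        # writing output; a truncated download makes it exit non-zero.
        if ! gzip -t "$fq" 2>/dev/null; then
            echo "corrupt or truncated gzip: $fq" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

For the paired files above, this would be check_gz unit_test_fq/reads-corrupted_R1.fastq.gz unit_test_fq/reads-corrupted_R2.fastq.gz, skipping the pair if the return code is non-zero.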

     
  • Brian Bushnell

    Brian Bushnell - 2023-11-15

    Good suggestion; I'm opening a new process for bgzip and piping the input. Shouldn't be too hard to catch the error code.

     
