#258 pacBioToCA overwrites uncorrected input reads with result if named pacbio.fastq

Milestone: correction
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-03-06
Created: 2013-10-10
Creator: eernst
Private: No

Doh! I guess it didn't occur to me that this would happen. It might be better to write the output as basename(input).cor.fastq, or to a subdir instead.

$ pacBioToCA -length 500 -partitions 200 -l pacbio -t 16 -s pacbio.spec -fastq pacbio.fastq longReads=1 genomeSize=13000000 > pacbiotoca.log 2>&1

...
----------------------------------------START Wed Oct 9 20:38:35 2013
cp asm.layout.err /asm/v1.0-pacbiotoca//pacbio.correction.err
Total split bases is 20197102 vs 650000230 so ratio is 0.0310724536205164
----------------------------------------END Wed Oct 9 20:38:35 2013 (0 seconds)
----------------------------------------START Wed Oct 9 20:38:35 2013
cp asm.layout.hist /asm/v1.0-pacbiotoca//pacbio.correction.hist
----------------------------------------END Wed Oct 9 20:38:35 2013 (0 seconds)
----------------------------------------START Wed Oct 9 20:38:35 2013
cat ls [0-9]*.fasta |grep trim |sort -T . -rnk1 > /asm/v1.0-pacbiotoca//pacbio.fasta
----------------------------------------END Wed Oct 9 20:38:38 2013 (3 seconds)
----------------------------------------START Wed Oct 9 20:38:38 2013
cat ls [0-9]*.qual |grep trim | sort -T . -rnk1 > /asm/v1.0-pacbiotoca//pacbio.qual
----------------------------------------END Wed Oct 9 20:38:44 2013 (6 seconds)
----------------------------------------START Wed Oct 9 20:38:44 2013
cat ls [0-9]*.fastq |grep trim | sort -T . -rnk1 > /asm/v1.0-pacbiotoca//pacbio.fastq
----------------------------------------END Wed Oct 9 20:38:48 2013 (4 seconds)
----------------------------------------START Wed Oct 9 20:38:48 2013
/usr/local/src/wgs-dev/Linux-amd64/bin/fastqToCA -libraryname pacbio -technology pacbio -type sanger -reads /asm/v1.0-pacbiotoca//pacbio.fastq > /asm/v1.0-pacbiotoca//pacbio.frg
----------------------------------------END Wed Oct 9 20:38:48 2013 (0 seconds)
...
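
The renaming idea from the report could look roughly like the following. It is a sketch only: `corrected_name` is a hypothetical helper and does not reflect how pacBioToCA actually names its output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pacBioToCA): derive an output name of the
# form basename(input).cor.fastq so it cannot collide with the input file.
corrected_name() {
    local input="$1"
    local base
    base="$(basename "$input")"   # strip any leading directories
    base="${base%.fastq}"         # drop a trailing .fastq, if present
    printf '%s.cor.fastq\n' "$base"
}
```

Called as `corrected_name pacbio.fastq`, it yields a name that differs from the input even when both files live in the same directory.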

Discussion

  • Sergey Koren
    2013-10-10

    The output name is controlled by the -l parameter, so passing -l pacbio.cor would avoid overwriting the input file. I have also updated the code to stop the pipeline if the output file already exists, so that you can change the -l parameter or rename the file before rerunning.

     
  • Sergey Koren
    2014-03-06

    Fixed in Celera Assembler (CA) 8.1.

     
  • Sergey Koren
    2014-03-06

    • status: open --> closed-fixed
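
For anyone on a build older than the fix, the guard described in the first reply can be approximated in the shell before launching the pipeline. This is only a sketch: `check_output_free` is a hypothetical helper, not part of pacBioToCA, and it assumes the corrected reads are written as `<libname>.fastq` in the current working directory, as the log in the report suggests.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not the actual pacBioToCA code):
# fail if <libname>.fastq already exists in the current directory,
# where <libname> is the value that will be passed to -l.
check_output_free() {
    local libname="$1"
    local out="${libname}.fastq"
    if [ -e "$out" ]; then
        echo "refusing to overwrite existing file: $out" >&2
        return 1
    fi
    return 0
}
```

Run it with the value intended for `-l` and start pacBioToCA only if it succeeds. With `pacbio.cor`, the output would be `pacbio.cor.fastq` under the assumption above, so an input named `pacbio.fastq` is left alone.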