Don Cameron - 2017-12-20

Hi,

I'm trying to run the scalpel-discovery --somatic analysis, but the process does not seem to complete successfully. I submit the following shell script to our server with "qsub -l h_vmem=40g shell.sh":

~/Programs/scalpel-0.5.3/scalpel-discovery --somatic --normal ~/Projects/Scalpel/scalpel_01vehicle_sort_markdup.bam --tumor ~/Projects/Scalpel/scalpel_11combo_sort_markdup.bam --bed ~/Projects/Scalpel/mm10_codingexons.bed --ref ~/Projects/Scalpel/mm10.fa --numprocs 10 --two-pass

The output is as follows:

Local date and time: Mon Dec 18 10:27:59 2017

Program: scalpel-discovery (micro-assembly variant detection)
Version: 0.5.3 (beta), January 25 2016
Contact: Giuseppe Narzisi gnarzisi@nygenome.org

MAIN ANALYSIS
-- Print parameters to /home/dcameron/outdir/main/parameters.txt
-- Detect mutations on normal
-- Print parameters to /home/dcameron/outdir/main/normal/parameters.txt
Indexing BAM file...
Loading targets from BED file...389758 targets (filtered 25027 overlapping).
Loading genome from FASTA file...66 sequences.
Assembly Exons
start assembly of 544399 regions.
stepSize: 54440

  1. [0..54439]
    ** 0 started, pid: 7832
    ** 54440 started, pid: 7833
  2. [54440..108879]
    ** 108880 started, pid: 7834
  3. [108880..163319]
    ** 163320 started, pid: 7837
  4. [163320..217759]
  5. [217760..272199]
    ** 217760 started, pid: 7841
    ** 272200 started, pid: 7843
  6. [272200..326639]
    ** 326640 started, pid: 7846
  7. [326640..381079]
    ** 381080 started, pid: 7848
  8. [381080..435519]
    ** 435520 started, pid: 7850
  9. [435520..489959]
    ** 489960 started, pid: 7852
  10. [489960..544398]
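
For reference, the block boundaries in the log above are consistent with the 544399 regions being split into 10 contiguous blocks (one per --numprocs process) of ceil(544399 / 10) = 54440 regions each. The following is only a minimal Python sketch of that arithmetic, assuming a simple ceiling division; it is not Scalpel's actual code, and the function name chunk_ranges is hypothetical.

```python
import math


def chunk_ranges(num_regions, num_procs):
    """Split indices 0..num_regions-1 into contiguous blocks.

    Hypothetical helper: assumes the block size is the ceiling of
    num_regions / num_procs, which matches the stepSize in the log.
    """
    step = math.ceil(num_regions / num_procs)
    ranges = []
    for start in range(0, num_regions, step):
        # Inclusive end index; the last block may be shorter than step.
        end = min(start + step, num_regions) - 1
        ranges.append((start, end))
    return step, ranges


# Values taken from the log: 544399 regions, --numprocs 10.
step, ranges = chunk_ranges(544399, 10)
print("stepSize:", step)
for i, (start, end) in enumerate(ranges, 1):
    print(f"  {i}. [{start}..{end}]")
```

With the values from my run, this gives the same stepSize and the same ten ranges as the log, so the region splitting itself looks as expected and all ten worker processes were at least started.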

Although the job exits, no database file is created. The outdir/main/normal directory contains only 10 very large FASTA files (one example being "regions-0-54440.fa") and assorted log files.

I'd appreciate any advice you could give. I also wanted to check whether the software works by running the examples from the Fang et al. (2016) paper. However, the ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194151 folder no longer exists.

Thanks,
Don