I'm running Scalpel but getting almost immediate failure/exit from the program and I'm not sure why. Red flags are "undefined sequence" and "command failure". I'm quite new to Linux commands, so I can't really tell if it's my input files or an installation problem. Here's the detailed output, can anyone assist? INPUT: $ perl /mnt/d/scalpel-0.5.4/scalpel-discovery --single --bam /mnt/d/scalpel-0.5.4/MC-files/MC-fish.marked_duplicates.bam --bed 15:1-48040578 --ref /mnt/d/scalpel-0.5.4/MC-files/danRer11.fa...
Hello, new user here. I downloaded the software packages and also the first two sample FASTQ files. But these files don't download. The error is "no such directory". Is there a solution? ERR194151_1.fastq.gz ERR194151_2.fastq.gz ERR324432_1.fastq.gz ERR324432_2.fastq.gz $ wget --no-check ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194151/ERR194151_1.fastq.gz --2021-01-08 10:17:53-- ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194151/ERR194151_1.fastq.gz => ‘ERR194151_1.fastq.gz’ Resolving ftp.sra.ebi.ac.uk...
Hi Clara, I had the same issue which is more related to the bioconda recipe than the Scalpel tool itself. You should try again now. See this issue for more: https://github.com/bioconda/bioconda-recipes/issues/19844
Hi, I am having the following error while trying to run scalpel somatic-discovery: Command failure: microassembly (/rds/general/user/cd3417/home/anaconda3/envs/analysis_Clara/share/scalpel-0.5.4-1/Microassembler/Microassembler -R -I -C -F 1000000 -k 25 -K 90 -M 3 -l 25 -c 5 -x 0.01 -d 2 -u 10000 -b 1 -p /outdir/main/tumor -m bamfile.bam -g readgroups.txt -r regions-0-401215.fa >/outdir/main/tumor/logs/regions-0-401215.fa.log 2>&1)... ** 0 just got out of the pool [PID: 2433299 | exit code: 0 | exit...
Oh okay. I think I got you. What you meant by re-genotyping is basically running Scalpel again with a new BED file: now that we have information on all variants, we can create a new BED file centered around those variants. I tried this option and it does recover variants for a few cases that were mis-called as Het, even though they were pure homozygous variants based on the window size. However, for all the other cases where True variants are supposed to be at 95% (we can determine this VAF...
What I meant was running "scalpel-discovery" to re-genotype the final list of variants, where the BED input file contains the list of the windows centered at each variant location.
Thanks for your reply. When you said to re-run Scalpel on the final list of variants, did you mean running scalpel-export? I tried running the export command but got the same results. "8 48805816 . A AG . PASS AVGCOV=1233.0;MINCOV=1233;ALTCOV=1269;ZYG=het;COVRATIO=0.49;CHI2=0.52;FISHERPHREDSCORE=0;INH=na;BESTSTATE=na;COVSTATE=na GT:AD:DP 0/1:1269,1233:2502" scalpel-export --single --db variants.db.dir --bed $bed --ref human_g1k_v37_.fasta Yes, you are right. Fixing window size might rescue this variant...
Hi Ashini, thank you for reporting this issue. Indeed the window size can impact the genotype. I would suggest performing one more round of genotyping at the end by rerunning scalpel on the final list of variants. I would fix the window to 400 bp for all variants and center it at the variant location to make sure that all the variants are processed consistently. As you already suggested, using a centered window helps reduce these problems, although it will likely not fix all of them. Hope...
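For anyone wanting to script the re-genotyping round described here, below is a minimal sketch of building fixed 400 bp windows centered on each variant. The function names `centered_windows` and `to_bed` are hypothetical helpers, not part of Scalpel, and the sketch assumes standard BED conventions (0-based, half-open intervals) with 1-based VCF positions as input.

```python
def centered_windows(variants, window=400):
    """Return (chrom, start, end) tuples centered on each variant.

    variants: iterable of (chrom, pos) pairs with 1-based VCF positions.
    Output follows the usual BED convention (0-based start, exclusive end);
    this convention is an assumption of the sketch.
    """
    half = window // 2
    rows = []
    for chrom, pos in variants:
        center = pos - 1                  # convert to 0-based coordinate
        start = max(0, center - half)     # clamp at the chromosome start
        end = center + half
        rows.append((chrom, start, end))
    return rows


def to_bed(rows):
    """Format the windows as tab-separated BED lines."""
    return "\n".join(f"{chrom}\t{start}\t{end}" for chrom, start, end in rows)


if __name__ == "__main__":
    # Position taken from the VCF record quoted in this thread.
    print(to_bed(centered_windows([("8", 48805816)])))
```

The resulting text can be written to a file and passed to `scalpel-discovery` through `--bed`, as in the commands quoted in this thread.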
Scalpel calling homozygous variants as Het
When I run Scalpel on sample data, I get the following error. I am using a bed file of the format chromosome_number:start-end Loading targets from BED file...Use of uninitialized value $chr in string eq at /u/home/v/vsarwal/codes/Scalpel/scalpel-0.5.3/SequenceIO.pm line 209, <EXONSLIST> line 2. Use of uninitialized value in hash element at /u/home/v/vsarwal/codes/Scalpel/scalpel-0.5.3/SequenceIO.pm line 224, <EXONSLIST> line 2. Use of uninitialized value in string eq at /u/home/v/vsarwal/codes/Scalpel/scalpel-0.5.3/SequenceIO.pm...
I've run scalpel-discovery in single mode on 22 samples. The samples were run in parallel with identical commands; this is the discovery command I used: cd ${sample} ~/local/bin/scalpel-0.5.3/scalpel-discovery --single \ --bam ./${sample}.bam \ --bed ${bed} \ --ref ${ref} \ --window 500 \ --logs --mapscore 20 One of those samples is failing repeatedly. The relevant part of the output looks like this: START ANALYSIS -- Print parameters to ./outdir/parameters.txt Loading targets from BED file...12769...
Hello, we are using Scalpel to call indels on DREAM dataset 4 as a test and parallelized it by chromosome. While jobs for the majority of the chromosomes run fine, a few give errors and do not produce an output VCF. I've tried rerunning the jobs multiple times and it's not the same set of chromosomes that produce errors. I was hoping to get some suggestions on figuring out what the issue is. Here's the command: scalpel-discovery --somatic --normal synthetic.challenge.set4.normal.bam --tumor synthetic.challenge.set4.tumour.bam...
updated manual
add parameter to set random seed; results using the same seed should be identical
update changelog; new version v0.5.4
Hi, I'm trying to run the scalpel-discovery --somatic analysis, however the process does not seem to complete. I submit the following shell to our server with %qsub -l h_vmem=40g shell.sh ~/Programs/scalpel-0.5.3/scalpel-discovery --somatic --normal ~/Projects/Scalpel/scalpel_01vehicle_sort_markdup.bam --tumor ~/Projects/Scalpel/scalpel_11combo_sort_markdup.bam --bed ~/Projects/Scalpel/mm10_codingexons.bed --ref ~/Projects/Scalpel/mm10.fa --numprocs 10 --two-pass The output is as follows: Local date...
I installed Scalpel on Linux and ran into the same problem as you. Have you fixed it?
Glad it worked!
Please try with an older version of the GCC compiler such as v4.8.x. Some people have reported similar problems with the more recent 5.3 compiler: https://github.com/Homebrew/homebrew-science/issues/3929
Hi, Nsrzisi, Thank you! Your advice did help me! I used gcc4.1 to compile scalpel and it works!
2 different REF bases reported by Scalpel
Okay.. Sounds good. Thank you so much for all your help. I really appreciate you helping me with this issue. Ashini
Unfortunately I don't have a solution for the 2 ref bases issue. Yes, it would be good to flag this scenario when encountered and manually inspect. Another option is to filter them out after they are called if you don't trust such variants.
Is CCT the inserted sequence shown in the IGV for all the alignments? Or just CT? It is possible that scalpel is joining together the two variants into one. Also this is within a long stretch of homopolymer C, and I would be surprised if there is no subset of reads that also shows a deletion of a C. The deletion combined with the T variant may cause the T to shift down by one base, and the alignment of the assembled sequence to the reference could cause the MT 309 . T TC When you run scalpel on a...
Please try with an older version of the GCC compiler such as v4.8.x. Some people have reported similar problems with the more recent 5.3 compiler: https://github.com/Homebrew/homebrew-science/issues/3929
So, in this particular scenario, do you have any recommendation for a general approach to solving the problem of seeing 2 ref bases, or does this have to be a special case that could be flagged and manually evaluated?
Thanks, Nsrzisi, I have compiled bamtools-2.3.0 with /cmake-3.5.0 and gcc5.10. It reported the following: [ 0%] Built target SharedHeaders [ 0%] Built target AlgorithmsHeaders [ 0%] Built target APIHeaders [ 1%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamAlignment.cpp.o [ 2%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamMultiReader.cpp.o [ 3%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamReader.cpp.o [ 4%] Building CXX object src/api/CMakeFiles/BamTools-static.dir/BamWriter.cpp.o...
Weiliu, can you check if Bamtools was compiled correctly within the scalpel distribution? Also, what compiler version are you using?
It is possible, in rare cases, to get a different set of variants when using different region sizes. This is because scalpel automatically adjusts the k-mer size (to build a cycle-free de Bruijn graph) as a function of the sequence composition of the region. In your case the larger region may contain a more complicated repeat that requires a larger k-mer. But a larger k-mer makes the tool less sensitive to very low coverage variants, which is likely what is happening here.
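To illustrate why a larger region can push the k-mer size up, here is a simplified sketch, not Scalpel's actual implementation. It treats a repeated k-mer within the region's sequence as a proxy for a cycle in the de Bruijn graph and searches for the smallest k without one. The helper names and the `k_min`/`k_max` parameters are hypothetical.

```python
def has_repeated_kmer(seq, k):
    """Return True if any k-mer occurs more than once in seq."""
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def smallest_repeat_free_k(seq, k_min=3, k_max=None):
    """Return the smallest k in [k_min, k_max] with no repeated k-mer.

    Returns None when every k in the range still has a repeat.
    This is only an illustration of the idea, under the assumption that
    a repeated k-mer stands in for a cycle in the graph.
    """
    if k_max is None:
        k_max = len(seq)
    for k in range(k_min, k_max + 1):
        if not has_repeated_kmer(seq, k):
            return k
    return None


if __name__ == "__main__":
    print(smallest_repeat_free_k("ACGTTGCA"))   # short region, no repeat
    print(smallest_repeat_free_k("ACGTACGA"))   # repeat forces a larger k
```

Extending a region so that it includes a longer repeat raises the smallest usable k in this toy model, which mirrors the behaviour described in the post.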
How did you solve this problem? I have encountered the same problem! Could you give me some advice? g++ -std=c++0x -Wno-deprecated -Wall -O3 -fexceptions -Wl,-rpath,/lustre/user/wwlab05/bin/scalpel-0.5.3/bamtools-2.3.0/lib/ -I/lustre/user/wwlab05/bin/scalpel-0.5.3/bamtools-2.3.0/include/ -L/lustre/user/wwlab05/bin/scalpel-0.5.3/bamtools-2.3.0/lib/ Microassembler.cc Edge.cc Node.cc Graph.cc Path.cc ContigLink.cc align.cc util.cc -o Microassembler -lbamtools -lz /tmp/ccaNVOeh.o: In function Microassembler::run(int,...
How did you solve this problem? I have encountered the same problem! Could you give me some advice?
Ah, you are right. There is an insertion of CT; only a few reads support CCT. Also, where do you see a deletion? I can only see insertions here. Moreover, since our read length is 150 bp, I played with different region sizes. When I used MT:100-500, I got all 5 variants, including both refs "C" and "T", like I would get with a BED file. However, when I use MT:1-650 (exact 3 times read length), I get only 3 variants (losing the MT:308 C>CCT and MT:309 T>TC variants). So, this seems to be very specific to the...
Is CCT the inserted sequence shown in the IGV for all the alignments? Or just CT? It is possible that scalpel is joining together the two variants into one. Also this is within a long stretch of homopolymer C, and I would be surprised if there is no subset of reads that also shows a deletion of a C. The deletion combined with the T variant may cause the T to shift down by one base, and the alignment of the assembled sequence to the reference could cause the MT 309 . T TC When you run scalpel on a small...
Thanks for your reply. We are using the same reference to align reads and to run Scalpel. So, the first scenario is ruled out. Second, looking at the BAM in IGV (snapshot attached), I can see an insertion (bases CCT) happening between positions 309 and 310 and a SNP at 310 (T>C), so the SNP is right after the insertion. Even though Scalpel extracts the previous base from the assembled sequence, shouldn't that affect "ALT", because the assembled sequence will reflect the alternate base, not the reference...
Hi Ashini, a similar problem was reported in the past by another user but at a different location: https://sourceforge.net/p/scalpel/bugs/7/ Even MuTect was reported to produce the same error at your same location, MT:309: https://github.com/chapmanb/bcbio-nextgen/issues/1362 I am listing below a few possible scenarios that can cause this problem: The reference used to align the reads may be different from the one used to run scalpel. Please double check that you are providing the exact same reference...
Exit code of Scalpel is 0 when child process fails or when external application (samtools, bcftools) fails.
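Until the exit code reflects such failures, one possible workaround is to scan the captured output for failure messages and derive a status from that. The sketch below assumes failures are announced with the text "Command failure", as in the error report quoted elsewhere in this feed. Whether every failure prints that marker is an assumption, and `find_failures` and `exit_status` are hypothetical helpers.

```python
def find_failures(log_text, markers=("Command failure",)):
    """Return the log lines that contain any of the failure markers.

    The default marker is taken from an error message quoted in this
    feed; it is an assumption that all failures are reported this way.
    """
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]


def exit_status(log_text):
    """Return 1 if the log contains a failure marker, otherwise 0."""
    return 1 if find_failures(log_text) else 0


if __name__ == "__main__":
    sample = "START ANALYSIS\nCommand failure: microassembly\n"
    print(find_failures(sample))
    print(exit_status(sample))
```

A wrapper script could capture the tool's stdout and stderr, pass the text to `exit_status`, and call `sys.exit` with the result so that pipelines stop on failure.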
2 different REF bases reported by Scalpel
I have this problem in my calling too. Have you solved it, please?
Hi, Giuseppe, Thank you so much for the reply and glad to hear that you will address...
Hi Loubin, thank you for reporting the bug! I recently became aware of this issue...
Exit code of Scalpel is 0 when child process fails or when external application (samtools, bcftools) fails.
I tried to install Scalpel on my iMac and got the following message at the end: In file included...
Hi Han, Thanks for sharing the article. It will be pretty useful. We look at...
Hi Ashini, As per our supplemental figure 4 from "Reducing INDEL calling errors in...
Great. That's perfect. Thank you for your reply. Ashini
Hi Ashini, the quality score for a variant in single mode is available as CHI2 (chi-squared...
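As a small illustration, the CHI2 value can be pulled out of the INFO column of a VCF record. This sketch uses the record quoted elsewhere in this feed. The helper names are hypothetical, and the code splits on any whitespace because the record appears space-separated here (real VCF files are tab-delimited).

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; bare flags map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def chi2_score(vcf_line):
    """Return the CHI2 value of a VCF record as a float, or None if absent."""
    columns = vcf_line.split()            # INFO is the 8th column
    info = parse_info(columns[7])
    value = info.get("CHI2")
    return float(value) if isinstance(value, str) else None


if __name__ == "__main__":
    record = (
        "8 48805816 . A AG . PASS "
        "AVGCOV=1233.0;MINCOV=1233;ALTCOV=1269;ZYG=het;COVRATIO=0.49;"
        "CHI2=0.52;FISHERPHREDSCORE=0;INH=na;BESTSTATE=na;COVSTATE=na "
        "GT:AD:DP 0/1:1269,1233:2502"
    )
    print(chi2_score(record))
```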
Quality scores missing when run as a single sample
Manual
[update] add descriptions of clinvar bed file i...
updated master script in protocol bundle; added...
Manual
Manual
Manual
Manual
[update] update the protocol bundle tar ball
Compilation error (Microassembler)
scalpel output vcf with inconsistent reference
Thank you for the update Endre. Glad the source of the problem was found. Maybe...
Please see the comments at github. The use of samtools in FindVariants.pl seems to...
Hmmm, so if I download scalpel 0.5.1, compile and run it with exactly the same parameters...
Manual
This is puzzling! I downloaded the reference (human_g1k_v37.fasta and human_g1k_v37.decoy.fasta,...
Hi! Brad Chapman can reproduce the error that I also see, both with 0.5.1 and 0.5.3....
remove older bundle tar ball
[update] update the resource bundle to version ...
[update] update the resource bundle to version ...
[update] update the resource bundle to version ...
Thanks for checking this. I used the bcbio pipeline for the bwa alignment and running...
The reason for the error is that the previous base reported for one of your variants...
Hi Yannick, for some reason the compiler is not able to find the BamTools library....
Thank you Endre for your feedback. I will look into it and, if needed, I will submit...
Compilation error (Microassembler)
scalpel output vcf with inconsistent reference
updated changelog
fixed error when providing empty BED file
faster graph traversal for finding repeats; 20%...
updated version number to 0.5.3
updated version number to 0.5.2.1
fixed missing sample name in VCF header when no...
fix
Manual
small fix
bug in two-pass mode: supporting coverage in no...
updated version number
Manual
updated manual and usage
Manual
Manual
Manual
Manual