[svtoolkit-help] Genome Strip 2 using wrong bin size
From: Wakeling, M. <M.W...@ex...> - 2018-01-30 19:52:26
Hi. I'm trying to run the LCNV detection pipeline from Genome Strip 2 on a set of 237 whole-genome samples. I have performed the PreProcess and GenerateDepthProfiles stages, and am now trying to run the LCNVDiscoveryPipeline stage. When I run Queue, it reports that it has created 19993 jobs, but all of the jobs then fail. For instance, the first job is:

INFO 08:31:37,845 FunctionEdge - Starting: 'Rscript' '/gpfs/ts0/home/mw501/Research_Project-MRC147594/genome_sequencing/genome_strip2/svtoolkit/R/lcnv/lcnv_scan.R' '--profileFile' 'profiles_10000/profile_seq_1_728.dat.gz' '--targetSample' 'WG0007' '--maxDepth' '60' '--ploidyMapFile' '/gpfs/ts0/home/mw501/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.ploidymap.txt' '--genderMapFile' 'output_metadata_directory/sample_gender.report.txt' '--outputFile' 'lcnv_output/seq_1/1_WG0007.dat'

This fails because there is no file named profiles_10000/profile_seq_1_728.dat.gz, although there is a file named profiles_10000/profile_seq_1_10000.dat.gz. I'm not sure where it is getting the 728 from.
The arguments for the three stages are:

export SV_DIR=~/Research_Project-MRC147594/genome_sequencing/genome_strip2/svtoolkit
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
java -Xmx4g -cp ${classpath} \
    org.broadinstitute.gatk.queue.QCommandLine \
    -S ${SV_DIR}/qscript/SVPreprocess.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -cp ${classpath} \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    -configFile ${SV_DIR}/conf/genstrip_parameters.txt \
    -R ~/Research_Project-MRC147594/resources/grch37/human_g1k_v37.fasta \
    -I bams.list \
    -md output_metadata_directory \
    -bamFilesAreDisjoint true \
    -copyNumberMaskFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.gcmask.fasta \
    -genderMaskBedFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.gendermask.bed \
    -genomeMaskFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.svmask.fasta \
    -ploidyMapFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.ploidymap.txt \
    -readDepthMaskFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.rdmask.bed \
    -jobLogDir logDir \
    -jobRunner ParallelShell \
    -maxConcurrentRun 16 \
    -run

export SV_DIR=~/Research_Project-MRC147594/genome_sequencing/genome_strip2/svtoolkit
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
java -Xmx4g -cp ${classpath} \
    org.broadinstitute.gatk.queue.QCommandLine \
    -S ${SV_DIR}/qscript/profiles/GenerateDepthProfiles.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -cp ${classpath} \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    -R ~/Research_Project-MRC147594/resources/grch37/human_g1k_v37.fasta \
    -md output_metadata_directory \
    -profileBinSize 10000 \
    -maximumReferenceGapLength 1000 \
    -runDirectory profiles_10000 \
    -genomeMaskFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.svmask.fasta \
    -ploidyMapFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.ploidymap.txt \
    -jobLogDir profiles_10000/logDir \
    -jobRunner ParallelShell \
    -maxConcurrentRun 16 \
    -run

export SV_DIR=~/Research_Project-MRC147594/genome_sequencing/genome_strip2/svtoolkit
classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar:${SV_DIR}/lib/gatk/Queue.jar"
java -Xmx4g -cp ${classpath} \
    org.broadinstitute.gatk.queue.QCommandLine \
    -S ${SV_DIR}/qscript/discovery/lcnv/LCNVDiscoveryPipeline.q \
    -S ${SV_DIR}/qscript/SVQScript.q \
    -cp ${classpath} \
    -gatk ${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar \
    -R ~/Research_Project-MRC147594/resources/grch37/human_g1k_v37.fasta \
    -md output_metadata_directory \
    -profilesDir profiles_10000 \
    -runDirectory lcnv_output \
    -maxDepth 60 \
    -genomeMaskFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.svmask.fasta \
    -ploidyMapFile ~/Research_Project-MRC147594/resources/hg19/Homo_sapiens_assembly19/Homo_sapiens_assembly19.ploidymap.txt \
    -genderMapFile output_metadata_directory/sample_gender.report.txt \
    -jobLogDir lcnv_output/logDir \
    -jobRunner ParallelShell \
    -maxConcurrentRun 16 \
    -run

Any assistance would be greatly appreciated.

Matthew
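As a diagnostic aid for the mismatch described above, the following is a minimal sketch that lists which bin-size suffixes the files in a profiles directory actually carry. It assumes only the naming pattern visible in the message (profile_seq_<sequence>_<binsize>.dat.gz). The function name profile_bin_sizes is hypothetical and is not part of Genome STRiP.

```shell
# Hypothetical helper (not part of Genome STRiP): print the distinct
# bin-size suffixes found in a profiles directory.
# Assumes file names of the form profile_seq_<sequence>_<binsize>.dat.gz,
# as seen in the message above.
profile_bin_sizes() {
    dir="$1"
    for f in "$dir"/profile_seq_*.dat.gz; do
        [ -e "$f" ] || continue        # skip the literal pattern if nothing matches
        name="${f##*/}"                # strip the directory part
        name="${name%.dat.gz}"         # strip the extension
        printf '%s\n' "${name##*_}"    # keep the last underscore-separated field
    done | sort -u
}

# Example call: profile_bin_sizes profiles_10000
```

If the directory was generated with a single -profileBinSize, this would be expected to print exactly one value. Seeing only 10000 would suggest that the 728 in the failing job comes from how the discovery stage derives the file name, not from the files on disk.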