Hello,
I am trying to correct PacBio reads with Illumina reads using
the command:
PBcR -length 500 -partitions 200 -l onc -s pacbio.SGE.spec -fastq \
  filtered_subreads.fastq genomeSize=1200000000 illumina180.frg illumina500.frg \
  illumina3000.frg illumina10000.frg
but get an
"meryl failed" message:
...
Segment 5132 finished.
Thread exits.
Threads all done, cleaning up.
Merge results.
bitPackedFile::bitPackedFile()-- failed to open '/opt/gc/project_data/nonLIMS_projects/BioInfo/hybrid_assembly_test/onc_test/temponc/0-mercounts/asm-C-ms14-cm0.batch510.mcdat': Too many open files
Can fit 22937600 mers into table with prefix of 20 bits, using 25.000MB ( 0.000MB for positions)
Failure message:
meryl failed
The pacbio.SGE.spec looks like this:
merSize=14
useGrid = 1
scriptOnGrid = 1
frgCorrOnGrid = 1
ovlCorrOnGrid = 1
sge = -A assembly
sgeScript = -pe smp 16
sgeConsensus = -pe smp 1
sgeOverlap = -pe smp 16
sgeFragmentCorrection = -pe smp 2
sgeOverlapCorrection = -pe smp 1
ovlThreads = 16
Any ideas what I could change?
Astacus
Set merylMemory (in MB) to something much bigger. It looks like it auto-detected 32 cores and split the default 800 MB among them, so it ran chunks of size 800/32 = 25 MB.
Based on the report of 5132 segments, and that you failed at segment 510, you need roughly 10 times fewer segments, so the minimum size that will work will be slightly more than 800*10 = 8000 MB (8 GB). At least 12000 is suggested.
Since you're asking for '-pe smp 16' you probably also want to set merylThreads=16.
http://wgs-assembler.sourceforge.net/wiki/index.php/RunCA#Meryl
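For reference, the arithmetic behind that estimate can be sketched in a few lines of shell. This is only a rough illustration: it assumes the number of meryl segments scales inversely with merylMemory, which the thread implies but does not confirm, and the variable names are made up for the example.

```shell
#!/bin/bash
# Rough estimate only. Assumption: the number of meryl segments scales
# inversely with merylMemory (not confirmed in the thread).
segments=5132     # segments reported in the log
failed_at=510     # batch at which opening files failed
default_mb=800    # default merylMemory cited in the reply
cores=32          # auto-detected core count cited in the reply

# Memory available to each thread with the defaults: 800 / 32
chunk_mb=$(( default_mb / cores ))

# How many times fewer segments are needed, rounded up
factor=$(( (segments + failed_at - 1) / failed_at ))

# Rough lower bound for merylMemory in MB
min_mb=$(( default_mb * factor ))

echo "chunk per thread: ${chunk_mb} MB"
echo "rough minimum merylMemory: ${min_mb} MB"
```

Rounding the factor up gives a figure a little above 8 GB, which is why a value with some headroom such as 12000 is the safer choice.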
Thanks for the help. Now the job seems to start but stops at the
next step:
qsub -A assembly -pe smp 16 -cwd -N "pBcR_asm" -hold_jid "pBcR_ovlprep_asm" -j y -o /opt/gc/project_data/nonLIMS_projects/BioInfo/hybrid_assembly_test/onc_test//temponc2/runPBcR.sge.out.00 /opt/gc/project_data/nonLIMS_projects/BioInfo/hybrid_assembly_test/onc_test//temponc2/runPBcR.sge.out.00.sh
Your job 104714 ("pBcR_asm") has been submitted
In the output file "runPBcR.sge.out.00" I see:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
if: Expression Syntax.
then: Command not found.
sigh
astacus
This looks like the job is not running under the bash shell and so the script is not being interpreted properly. Your runPBcR.sge.out.00.sh should start with
#!/bin/bash
Does it? Does /bin/bash exist on your system? Can you run the script by hand without error (i.e. if you just do sh runPBcR.sge.out.00.sh)?
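A quick way to run those checks, as a minimal sketch. The helper name has_bash_shebang is hypothetical, and the script is assumed to be in the current directory; adjust the path to wherever your temponc2 directory lives.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the first line of the file is exactly
# the bash shebang.
has_bash_shebang() {
    local first_line
    first_line=$(head -n 1 "$1")
    [ "$first_line" = '#!/bin/bash' ]
}

# Assumption: the generated script sits in the current directory.
script="runPBcR.sge.out.00.sh"

if [ -x /bin/bash ]; then
    echo "/bin/bash exists and is executable"
else
    echo "/bin/bash is missing or not executable"
fi

if [ -f "$script" ]; then
    if has_bash_shebang "$script"; then
        echo "$script starts with #!/bin/bash"
    else
        echo "$script does not start with #!/bin/bash"
    fi
fi
```

If both checks pass and the script still fails under SGE, the shell used by the scheduler is the likely culprit rather than the script itself.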
Last edit: Brian Walenz 2015-02-11
Yup, exactly that. SGE is, by default, using the csh shell to run scripts.
Add qsub option "-b y" to the runCA 'sge' line. This will tell sge to treat the command as a binary, instead of a script, and in this case, it will then respect the '#!' interpreter.
"-S /bin/bash" should also work, but I haven't tried it.
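Applied to the spec file posted above, the change would look roughly like this (a sketch based on the original 'sge = -A assembly' line; only the first variant is the one I actually suggested, so treat the alternative as untested):

```
sge = -A assembly -b y
```

The untested alternative would be to put -S /bin/bash in place of -b y on the same line.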