When running the sample data on an HPC cluster, the assembly fails. The /MergePipeline folder is created with directories /A.CEstat and /B.CEstat, each of which contains a /BWTaln directory with files that look to be of appropriate size, but the /A.ce and /B.ce directories are empty. /BWTaln/A.mtp.2k.err (and B..err) show a 92-93% alignment rate. The /MergePipeline/M1 directory is empty. I've written the stdout and stderr to files, which I'll paste below. The main error seems to be that metassembler can't locate bin/mateAn, bin/Ncoords, bin/asseMerge or bin/meta2fasta. This is odd, because my $PATH variable contains the full path to metassembler/bin. I'm unsure what else this could be. Any suggestions? Thank you.
StdErr:
Building a SMALL index
/var/spool/slurmd/job1039615/slurm_script: line 38: ../../bin/mateAn: No such file or directory
/var/spool/slurmd/job1039615/slurm_script: line 40: ../../bin/Ncoords: No such file or directory
Building a SMALL index
/var/spool/slurmd/job1039615/slurm_script: line 50: ../../bin/mateAn: No such file or directory
/var/spool/slurmd/job1039615/slurm_script: line 52: ../../bin/Ncoords: No such file or directory
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
# reading input file "MergePipeline/B.A.ntref" of length 245001
# construct suffix tree for sequence of length 245001
# (maximum reference length is 536870908)
# (maximum query length is 4294967295)
# process 2450 characters per dot
#....................................................................................................
# CONSTRUCTIONTIME /apps2/mummer/3.23/bin/mummer MergePipeline/B.A.ntref 0.02
# reading input file "/home/cmb02010/Dys/metassembler/test/B.fa" of length 245000
# matching query-file "/home/cmb02010/Dys/metassembler/test/B.fa"
# against subject-file "MergePipeline/B.A.ntref"
# COMPLETETIME /apps2/mummer/3.23/bin/mummer MergePipeline/B.A.ntref 0.08
# SPACE /apps2/mummer/3.23/bin/mummer MergePipeline/B.A.ntref 0.47
4: FINISHING DATA
/var/spool/slurmd/job1039615/slurm_script: line 62: ../../bin/asseMerge: No such file or directory
/var/spool/slurmd/job1039615/slurm_script: line 66: ../../bin/meta2fasta: No such file or directory
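A note on the "No such file or directory" lines above: this is what a shell reports when a command is written as a path containing a slash and that path does not exist relative to the current working directory. In that case $PATH is not searched at all. A minimal sketch of the behaviour, assuming a throwaway directory tree and a hypothetical demo_tool (neither is part of Metassembler):

```shell
#!/usr/bin/env bash
# Hypothetical layout: $root/bin/demo_tool stands in for metassembler/bin/mateAn.
root=$(mktemp -d)
mkdir -p "$root/bin" "$root/sample/meta1" "$root/work/run/job"
printf '#!/bin/sh\necho "demo_tool ran"\n' > "$root/bin/demo_tool"
chmod +x "$root/bin/demo_tool"
PATH="$root/bin:$PATH"

# 1. From sample/meta1, ../../bin/ points at $root/bin, so the call works.
from_sample=$(cd "$root/sample/meta1" && ../../bin/demo_tool)

# 2. From work/run/job, ../../bin/ points at $root/work/bin, which does not
#    exist. PATH is ignored because the command name contains a slash.
status_elsewhere=0
(cd "$root/work/run/job" && ../../bin/demo_tool) 2>/dev/null || status_elsewhere=$?

# 3. The bare name is looked up on PATH and works from any directory.
from_path=$(cd "$root/work/run/job" && demo_tool)

echo "relative path from sample/meta1: $from_sample"
echo "relative path from elsewhere:    exit status $status_elsewhere"
echo "bare name from elsewhere:        $from_path"
```

Only the second call fails, even though the directory holding demo_tool is on PATH the whole time.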
Yes, the problem is that the binary files can't be found. The scripts call
them through the relative path ../../bin/, and $PATH is not searched for a
command that is written with a path. This should be fixed if you run the
script from within the sample/meta1 directory, or if you modify the
Step_by_Step_script.sh and/or the Metassemble_script.sh scripts by simply
changing any line of the form:
../../bin/BinFile
to:
BinFile
The latter can be done with the following bash commands (the -i.bak option
edits the file in place and keeps a backup copy):
sed -i.bak 's#^\.\./\.\./bin/##' Step_by_Step_script.sh
and:
sed -i.bak 's#^\.\./\.\./bin/##' Metassemble_script.sh
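One caveat when applying a substitution like this: redirecting sed's output onto the very file it is reading empties that file, because the shell truncates the output file before sed starts. A minimal sketch on a throwaway copy, assuming a hypothetical demo_script.sh whose ARGS are placeholders rather than real Metassembler arguments:

```shell
#!/usr/bin/env bash
# demo_script.sh is a hypothetical stand-in, not a real Metassembler script;
# "ARGS" replaces the real command-line arguments.
workdir=$(mktemp -d)
cat > "$workdir/demo_script.sh" <<'EOF'
../../bin/mateAn ARGS
echo "a line that must stay unchanged"
../../bin/Ncoords ARGS
EOF

# Safe: edit in place, keeping the original as demo_script.sh.bak.
# The dots are escaped so that they match literal dots only.
sed -i.bak 's#^\.\./\.\./bin/##' "$workdir/demo_script.sh"
cat "$workdir/demo_script.sh"

# Unsafe: input and output are the same file, so it is emptied before sed reads it.
cp "$workdir/demo_script.sh.bak" "$workdir/unsafe.sh"
sed 's#^\.\./\.\./bin/##' "$workdir/unsafe.sh" > "$workdir/unsafe.sh"
if [ -s "$workdir/unsafe.sh" ]; then
    echo "unsafe.sh still has content"
else
    echo "unsafe.sh is now empty"
fi
```

The pattern is anchored with ^, so it only touches lines that begin with ../../bin/; indented calls would need the anchor removed.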
Wences
On Mon, Jun 19, 2017 at 11:50 AM, Charles Bridges cmb12@users.sf.net
wrote:
StdOut:
Settings:
Output files: "MergePipeline/A.CEstat/BWTaln/A.*.bt2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
A.fa
Reading reference sizes
Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:00
bmax according to bmaxDivN setting: 61250
Using parameters --bmax 45938 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 45938 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:00
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:00
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:00
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 245000 (target: 45937)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 245000 for bucket 1
(Using difference cover)
Sorting block time: 00:00:00
Returning block of 245001 for bucket 1
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 84901
fchr[G]: 121213
fchr[T]: 165866
fchr[$]: 245000
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 4276195 bytes to primary EBWT file: MergePipeline/A.CEstat/BWTaln/A.1.bt2
Wrote 61256 bytes to secondary EBWT file: MergePipeline/A.CEstat/BWTaln/A.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 245000
bwtLen: 245001
sz: 61250
bwtSz: 61251
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 15313
offsSz: 61252
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 1277
numLines: 1277
ebwtTotLen: 81728
ebwtTotSz: 81728
color: 0
reverse: 0
Total time for call to driver() for forward index: 00:00:00
Reading reference sizes
Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:00
Time to reverse reference sequence: 00:00:00
bmax according to bmaxDivN setting: 61250
Using parameters --bmax 45938 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 45938 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:00
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:00
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:00
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 245000 (target: 45937)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 245000 for bucket 1
(Using difference cover)
Sorting block time: 00:00:00
Returning block of 245001 for bucket 1
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 84901
fchr[G]: 121213
fchr[T]: 165866
fchr[$]: 245000
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 4276195 bytes to primary EBWT file: MergePipeline/A.CEstat/BWTaln/A.rev.1.bt2
Wrote 61256 bytes to secondary EBWT file: MergePipeline/A.CEstat/BWTaln/A.rev.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 245000
bwtLen: 245001
sz: 61250
bwtSz: 61251
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 15313
offsSz: 61252
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 1277
numLines: 1277
ebwtTotLen: 81728
ebwtTotSz: 81728
color: 0
reverse: 1
Total time for backward call to driver() for mirror index: 00:00:00
Settings:
Output files: "MergePipeline/B.CEstat/BWTaln/B.*.bt2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
B.fa
Reading reference sizes
Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:00
bmax according to bmaxDivN setting: 61250
Using parameters --bmax 45938 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 45938 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:00
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:00
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:00
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 245000 (target: 45937)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 245000 for bucket 1
(Using difference cover)
Sorting block time: 00:00:00
Returning block of 245001 for bucket 1
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 84658
fchr[G]: 120878
fchr[T]: 165510
fchr[$]: 245000
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 4276195 bytes to primary EBWT file: MergePipeline/B.CEstat/BWTaln/B.1.bt2
Wrote 61256 bytes to secondary EBWT file: MergePipeline/B.CEstat/BWTaln/B.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 245000
bwtLen: 245001
sz: 61250
bwtSz: 61251
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 15313
offsSz: 61252
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 1277
numLines: 1277
ebwtTotLen: 81728
ebwtTotSz: 81728
color: 0
reverse: 0
Total time for call to driver() for forward index: 00:00:00
Reading reference sizes
Time reading reference sizes: 00:00:00
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
Time to join reference sequences: 00:00:00
Time to reverse reference sequence: 00:00:00
bmax according to bmaxDivN setting: 61250
Using parameters --bmax 45938 --dcv 1024
Doing ahead-of-time memory usage test
Passed! Constructing with these parameters: --bmax 45938 --dcv 1024
Constructing suffix-array element generator
Building DifferenceCoverSample
Building sPrime
Building sPrimeOrder
V-Sorting samples
V-Sorting samples time: 00:00:00
Allocating rank array
Ranking v-sort output
Ranking v-sort output time: 00:00:00
Invoking Larsson-Sadakane on ranks
Invoking Larsson-Sadakane on ranks time: 00:00:00
Sanity-checking and returning
Building samples
Reserving space for 12 sample suffixes
Generating random suffixes
QSorting 12 sample offsets, eliminating duplicates
QSorting sample offsets, eliminating duplicates time: 00:00:00
Multikey QSorting 12 samples
(Using difference cover)
Multikey QSorting samples time: 00:00:00
Calculating bucket sizes
Splitting and merging
Splitting and merging time: 00:00:00
Avg bucket size: 245000 (target: 45937)
Converting suffix-array elements to index image
Allocating ftab, absorbFtab
Entering Ebwt loop
Getting block 1 of 1
No samples; assembling all-inclusive block
Sorting block of length 245000 for bucket 1
(Using difference cover)
Sorting block time: 00:00:00
Returning block of 245001 for bucket 1
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 84658
fchr[G]: 120878
fchr[T]: 165510
fchr[$]: 245000
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 4276195 bytes to primary EBWT file: MergePipeline/B.CEstat/BWTaln/B.rev.1.bt2
Wrote 61256 bytes to secondary EBWT file: MergePipeline/B.CEstat/BWTaln/B.rev.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 245000
bwtLen: 245001
sz: 61250
bwtSz: 61251
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 15313
offsSz: 61252
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 1277
numLines: 1277
ebwtTotLen: 81728
ebwtTotSz: 81728
color: 0
reverse: 1
Total time for backward call to driver() for mirror index: 00:00:00
Thank you Wences. I've replaced the directory paths appropriately, and most of the program is running as expected. Now I'm not sure whether the program has run to completion. Based on the Manual's description of the directories that should be created, I believe the program is hung up after the asseMerge or meta2fasta step. I have the directory /M1/, which contains 19 or so files including B.A.fasta, but around 8 of those files are 0 bytes. I'm also lacking the M1/Metassembly/ directory, and my stdout file is missing information at the end. Is this the final output that I should expect, or is the program hung up? Below are stderr and stdout:
Stderr:
Building a SMALL index
Building a SMALL index
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
# reading input file "MergePipeline/B.A.ntref" of length 245001
# construct suffix tree for sequence of length 245001
# (maximum reference length is 536870908)
# (maximum query length is 4294967295)
# process 2450 characters per dot
#....................................................................................................
# CONSTRUCTIONTIME /apps2/mummer/3.23/bin/mummer MergePipeline/B.A.ntref 0.02
# reading input file "/home/cmb02010/Dys/metassembler/test/B.fa" of length 245000
# matching query-file "/home/cmb02010/Dys/metassembler/test/B.fa"
# against subject-file "MergePipeline/B.A.ntref"
# COMPLETETIME /apps2/mummer/3.23/bin/mummer MergePipeline/B.A.ntref 0.08
# SPACE /apps2/mummer/3.23/bin/mummer MergePipeline/B.A.ntref 0.47
4: FINISHING DATA
Yes, that output is expected, since you are running the
Step_by_Step_script.sh script. This is a bash script that performs a
metassembly by calling each of the programs required at each step of the
process (for example bowtie, nucmer, etc.). The manual, on the other hand,
refers to the wrapper "metassemble", a python script that does the same
thing (performs a whole metassembly) but for any set of input parameters,
which are specified through a config file. This wrapper creates the full
directory structure described in the manual. To test it, please run the
Metassemble_script.sh in the sample directory. This script creates a
configuration file "B.A.metassemble.config" and then calls metassemble to
perform the metassembly. In general I encourage you to use the metassemble
wrapper.
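A minimal sketch of running a script from its own directory inside a batch job, so that any relative paths it uses resolve as the script expects. The helper name run_in_dir and the METASSEMBLER_HOME variable are assumptions for illustration only, and the actual location of the sample directory depends on where the package was unpacked:

```shell
#!/usr/bin/env bash
# run_in_dir DIR SCRIPT [ARGS...]
# Hypothetical helper: runs SCRIPT with bash from inside DIR, in a subshell,
# so the caller's working directory is left untouched.
run_in_dir() {
    local dir=$1 script=$2
    shift 2
    ( cd "$dir" && bash "./$script" "$@" )
}

# Assumed usage inside a job script (the path is a placeholder):
#   METASSEMBLER_HOME=/path/to/metassembler
#   run_in_dir "$METASSEMBLER_HOME/sample" Metassemble_script.sh
```

Using a subshell keeps the directory change local to the one command, which matters when the job script goes on to write its own output elsewhere.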
Wences