------------------------------------------------------------------------
extremely long runtime? configuration wrong?
<https://sourceforge.net/p/metassembler/discussion/general/thread/20e2dac4/?limit=25#3afa>
------------------------------------------------------------------------
Hi Michael,
I have a metassembly run that has already been going for 4 weeks, merging a SPAdes and a Celera fish genome assembly (genome size at most 1 Gb). Judging from your paper, this should have finished a long time ago.
Here is a copy of my spec file:
[global]
Mate-pair mapping parameters:
bowtie2_threads=8
bowtie2_read1=all_1P.fastq
bowtie2_read2=all_2P.fastq
bowtie2_maxins=1000
bowtie2_minins=10
genomeLength=950000000
meta2fasta_keepUnaligned=3
meta2fasta_sizeUnaligned=350 350
nucmer_l=50
nucmer_c=300
CE-stat computation parameters:
mateAn_s=500
mateAn_m=350
[1]
fasta=/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/CeleraFemales/assembly/9-terminator/all_females.scf.fasta
ID=CeleraFemales
mateAn_file=
[2]
fasta=/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/SpadesFemale/spadesnewmemory/scaffolds.fasta
ID=SpadesFemales
mateAn_file=
I am running on 8 cores with 40 GB RAM. Any help would be great.
All the best
Astrid
Can you tell what phase of the program is currently running? We
successfully merged the fish genome from the Assemblathon 2 data set in ~1
day. Here are the notes on it from the supplemental material:
For all Fish assemblies and metassemblies we used the available 2Kb
mate-pair libraries:
801KYABXX.2 and 801KYABXX.3
Mapping: bowtie2 --maxins 3000 --minins 1000 --threads 16
CE-statistic: mateAn -A 1500 -B 2600
WGA: nucmer --maxmatch -l 50 -c 300
Merges: asseMerge with default options
Runtime Requirements:
Bowtie alignment: ~6.2 h
CEstat computation: ~2.6 h
Nucmer WGA: ~57 h
asseMerge: ~45 min
meta2fasta: ~70 s
Peak RAM requirement: 36GB
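
As a rough illustration of where the time goes, the step runtimes listed above can be summed in a few lines of Python. This is only a sketch: the step names are labels chosen for this example, and the figures are the approximate values from the notes, so the result is approximate too. It comes to roughly 67 hours in total, with the nucmer whole-genome alignment accounting for about 86% of that.

```python
# Step runtimes in hours, copied from the supplemental notes quoted above.
# The keys are just labels for this example.
runtimes_h = {
    "bowtie2 alignment": 6.2,
    "CE-stat computation": 2.6,
    "nucmer WGA": 57.0,
    "asseMerge": 45 / 60,     # ~45 min
    "meta2fasta": 70 / 3600,  # ~70 s
}

total_h = sum(runtimes_h.values())

# Print the steps from slowest to fastest with their share of the total.
for step, hours in sorted(runtimes_h.items(), key=lambda kv: -kv[1]):
    share = 100 * hours / total_h
    print(f"{step:20s} {hours:7.2f} h  {share:5.1f} %")
print(f"{'total':20s} {total_h:7.2f} h")
```

Because nucmer dominates the runtime even in the reference run, knowing which phase is currently running is the first thing to check.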
Depending on which step is running, I can make some suggestions on what could
be tuned.
Hope this helps
Mike
On Wed, May 17, 2017 at 7:06 AM, Astrid astridboehne@users.sf.net wrote:
Hi Michael,
Yes, that is what I saw in your paper, and it is a species closely related
to the one from the Assemblathon. I am guessing that the issue is
nucmer. I realized that I am using a rather old version of MUMmer; maybe
that is why it is so slow?
Judging from the logs, it is stuck at this step (though it is still doing
something, since the file QSpadesFemales.CeleraFemales.mgaps keeps changing):
---- Merging SpadesFemales and CeleraFemales ==>
QSpadesFemales.CeleraFemales
---------- Run bash command ----------
Create
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/Metassembly/QSpadesFemales.CeleraFemales:
mkdir
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/Metassembly/QSpadesFemales.CeleraFemales
...
---------- Run bash command ----------
nucmer
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/CeleraFemales/CeleraFemales.fa
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/SpadesFemales/SpadesFemales.fa:
nucmer --maxmatch -l 50 -c 300 -p
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/Metassembly/QSpadesFemales.CeleraFemales/QSpadesFemales.CeleraFemales
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/CeleraFemales/CeleraFemales.fa
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/SpadesFemales/SpadesFemales.fa
...
This is what goes to stderr
Processed 56659 scaffolds and 117924 contigs, printed 113096 at least
200 bp long
Processed 908390 scaffolds and 913280 contigs, printed 404117 at least
200 bp long
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
reading input file
"/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/Metassembly/QSpadesFemales.CeleraFemales/QSpadesFemales.CeleraFemales.ntref"
of length 729766782
construct suffix tree for sequence of length 729766782
(maximum reference length is 2305843009213693948)
(maximum query length is 18446744073709551615)
process 7297667 characters per dot
....................................................................................................
CONSTRUCTIONTIME
/scicore/home/salzburg/boehne/applications/MUMmer3.23/mummer
/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/Metassembly/QSpadesFemales.CeleraFemales/QSpadesFemales.CeleraFemales.ntref
335.34
reading input file
"/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/SpadesFemales/SpadesFemales.fa"
of length 903891688
matching query-file
"/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/SpadesFemales/SpadesFemales.fa"
against subject-file
"/scicore/home/salzburg/boehne/Ambizione/Pseudocrenilabrus/Metassemble_Female/Metassembly/QSpadesFemales.CeleraFemales/QSpadesFemales.CeleraFemales.ntref"
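
One way to put a number on "the .mgaps file keeps changing" is to compare the file's size at two points in time. The sketch below is an assumption-laden illustration, not part of Metassembler or MUMmer: the helper names are made up for this example, and it runs on a temporary stand-in file rather than the real output. A growing file only shows that the matching step is still writing; it says nothing about how much work remains.

```python
import os
import tempfile


def file_size(path):
    """Return the current size of a file in bytes."""
    return os.stat(path).st_size


def bytes_added(size_before, size_after):
    """Return how many bytes the file grew between two size readings."""
    return size_after - size_before


# Demonstration on a temporary stand-in file. In practice `path` would point
# at the real .mgaps file and the two readings would be taken minutes apart.
with tempfile.NamedTemporaryFile("w", suffix=".mgaps", delete=False) as fh:
    fh.write("> demo\n")
    path = fh.name

before = file_size(path)
with open(path, "a") as fh:
    fh.write("1 1 50\n")  # placeholder line standing in for new output
after = file_size(path)

growth = bytes_added(before, after)
print(f"file grew by {growth} bytes")
os.remove(path)
```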
Thank you for your quick reply
Astrid
On 17.05.17 16:37, Michael Schatz wrote:
--
Astrid Böhne
Universität Basel
Zoologisches Institut
Evolutionsbiologie
Vesalgasse 1
CH-4051 Basel
Switzerland
Phone +41 (0)61 207 03 05
Fax +41 (0) 61 207 03 01
Yeah, there must be tons of repeats if it is still stuck in nucmer. As
painful as it is, I'd kill the job and start again with different nucmer
settings. I would recommend: -l 100 -c 500
This will (modestly) reduce sensitivity, but could finish in less than a
day. If it takes more than a day, raise -l from 100 to 250 and try again.
Good luck!
Mike
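
In the spec file posted earlier, nucmer_l=50 and nucmer_c=300 match the -l 50 -c 300 shown in the logged nucmer command, so the recommended settings would presumably go into those two keys. The sketch below shows one way to make that change with Python's configparser. It assumes the spec file is INI-like enough for configparser to read, uses only an excerpt of the [global] section, and the function name is made up for this example.

```python
import configparser
import io

# Excerpt of the [global] section from the spec file posted in this thread.
SPEC_TEXT = """\
[global]
bowtie2_threads=8
genomeLength=950000000
nucmer_l=50
nucmer_c=300
"""


def with_nucmer_settings(spec_text, min_match, min_cluster):
    """Return the spec text with nucmer_l and nucmer_c replaced in [global]."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case (e.g. genomeLength) unchanged
    parser.read_string(spec_text)
    parser["global"]["nucmer_l"] = str(min_match)
    parser["global"]["nucmer_c"] = str(min_cluster)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Apply the recommended -l 100 -c 500.
updated = with_nucmer_settings(SPEC_TEXT, 100, 500)
print(updated)
```

If the run still takes more than a day, the same call with 250 as the second argument would apply the follow-up suggestion for -l.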
On Wed, May 17, 2017 at 11:02 AM, Astrid astridboehne@users.sf.net wrote: