I am having problems with the new bin optimization module of the new version. It finishes 1 iteration and then it gets stuck. I tried the same dataset in V3.3 and the optimization module did not get stuck.
The data was assembled with MegaHit and the coverage is given by the map reads module.
Thanks
Danny
Hi Danny,
If you can make your project data available to me, I will check what
goes wrong and provide a fix. Of course data will remain totally
confidential...
Am currently away from campus so will not be able to proceed until early
Aug.
Best!
Marc
On 07/10/15 06:18, Danny Ionescu wrote:
--
Marc Strous
CAIP Research Chair in Microbiology
Energy Bioengineering Group
Department of Geoscience
University of Calgary
www.ucalgary.ca/ebg
tel: (403) 220 6604
EEEL-509
Hi Marc,
To begin with, many thanks for the amazing pipeline (this is the second time I am using it... hopefully the first run will bear some fruit soon). Since my issue is related to the bin optimization process, I thought I should post it here instead of opening a new post.
In the current run I am processing a relatively large dataset (assembly: 5,157,130 contigs, max length 198,509, N50 of 779, total assembly length about 360 Mbp... sequences were derived from 30 samples of the same environment, allocating about 4 M 250 bp paired-end Illumina HiSeq 2500 rapid-run reads per sample... the sequences were quality trimmed prior to assembly and were also used as input for the Metawatt run). By the way, my previous run did not have the replication or the magnitude of the current run, so I did not experience any issues then.
- My computer specs are: 64 GB RAM (plus 64 GB swap), 24 Xeon cores, Ubuntu 14.04 LTS
- I am running Metawatt 3.5.2 with 34 GB RAM at the moment (I reduced it from 54 GB due to mapper-related errors).
The optimization step seems to proceed at the processing rates described in the module link you provide until 9% of the run is completed, and then it appears to freeze for days (6 days in the last attempt, until I stopped it). To be precise, though, htop shows Java processes running during the wait, and no errors appear or register in the log (attached). The one exception was the last attempt, where I got an 'Exception in thread "AWT-EventQueue-0" java.util.ConcurrentModificationException...', but I was browsing the bins while the software was running and I am not certain whether this led to the error. I have also checked for disk space issues that could be related to the swap, and there is about 100 GB of free space.
Given the above dataset size and specifications, is this delay expected or should I try something different?
Thanks in advance for any associated information.
Best wishes,
Sotirios
Last edit: Sotirios Vasileiadis 2016-06-09
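For readers unfamiliar with the N50 figure quoted in the post above, here is a minimal Python sketch of how that statistic is commonly computed from a list of contig lengths. The function name `n50` and the toy lengths are assumptions made for illustration; this is not Metawatt code, and the numbers are not from the dataset described above.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    # Only reached when the input is empty.
    raise ValueError("no contigs given")


# Toy lengths for illustration only (hypothetical, not the real assembly).
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))
```

A low N50 relative to the maximum contig length, as in the post, simply means that most of the assembled bases sit in short contigs.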
Hi Sotirios,
Thank you for trying Metawatt and reporting this problem. I am aware
that this module often malfunctions for larger datasets. Furthermore, I
am currently unsatisfied with the performance of this module, which
implements a series of heuristic tests and fixes to bins. I intend to
fix the problem in the next release of Metawatt, which is planned for
later this year, time permitting.
So, to make a long story short, at this moment I would strongly suggest
simply disabling the optimization module... and running your pipeline
without it.
Best wishes,
Marc
On 06/08/2016 08:18 PM, Sotirios Vasileiadis wrote:
--
Marc Strous
CAIP Research Chair in Microbiology
Energy Bioengineering Group
Department of Geoscience
University of Calgary
www.ucalgary.ca/ebg
tel: (403) 220 6604
2500 University Drive NW
Calgary, AB, Canada T2N 1N4
Energy Environment and Experiential Learning (EEEL) Building
Room 509
Hi Marc,
I will follow your advice and get back to you in case of any issues... looking forward to the next version.
Best wishes,
Sotirios