
Problem with bin optimization in V3.5

2015-07-10
2016-06-10
  • Danny Ionescu

    Danny Ionescu - 2015-07-10

    Hi Marc

    I am having problems with the new bin optimization module of the new version. It finishes 1 iteration and then it gets stuck. I tried the same dataset in V3.3 and the optimization module did not get stuck.

    The data was assembled with MegaHit and the coverage is given by the map reads module.

    Thanks

    Danny

     
    • Marc Strous

      Marc Strous - 2015-07-20

      Hi Danny,
      If you can make your project data available to me, I will check what
      goes wrong and provide a fix. Of course data will remain totally
      confidential...
      Am currently away from campus so will not be able to proceed until early
      Aug.
      Best!
      Marc

      --
      Marc Strous
      CAIP Research Chair in Microbiology
      Energy Bioengineering Group
      Department of Geoscience
      University of Calgary
      www.ucalgary.ca/ebg
      tel: (403) 220 6604
      EEEL-509

       
      • Sotirios Vasileiadis

        Hi Marc,

        To begin with, many thanks for the amazing pipeline (this is the second time I am using it... hopefully the first run will bear some fruit soon). Since my issue is related to the bin optimization process, I thought I should post it here instead of opening a new thread.

        In the current run I am processing a relatively large dataset (assembly: 5,157,130 contigs, max length 198,509, N50 of 779, total assembly length about 360 Mbp). The sequences were derived from 30 samples of the same environment, with about 4 M 250 bp paired-end Illumina HiSeq 2500 rapid-run reads per sample. The sequences were quality trimmed prior to assembly and were also used as input for the Metawatt run. By the way, my previous run had neither the replication nor the magnitude of the current run, so I did not experience any issues then.
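For readers unfamiliar with the N50 statistic quoted above: it is the contig length L such that contigs of length L or longer together cover at least half of the total assembly length. The sketch below shows one common way to compute it. The function name `n50` and the toy lengths are illustrative assumptions, not part of Metawatt or of the dataset in this thread.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Illustrative helper (hypothetical name, not a Metawatt function).
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest, accumulating length
    # until at least half of the total assembly length is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths, not taken from the dataset discussed in this thread.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # prints 400
```

An N50 of 779 alongside a maximum contig length of 198,509 means that most of the roughly 5 million contigs are short, which is worth keeping in mind when judging how much work a per-contig optimization step has to do.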

        -My computer specs are: 64 GB RAM (plus 64 GB swap), 24 Xeon cores, Ubuntu 14.04 LTS
        -I am running Metawatt 3.5.2 with 34 GB RAM at the moment (I reduced it from 54 GB due to mapper-related errors).

        The optimization step seems to proceed at the processing rates described in the module link you provide until 9% of the run is completed, and then it appears to freeze for days (6 days in the last attempt, until I stopped it). To be precise, htop shows Java processes still running during the wait, and no errors appear on screen or register in the log (attached). The one exception was the last attempt, where I got an 'Exception in thread "AWT-EventQueue-0" java.util.ConcurrentModificationException...', but I was browsing the bins while the software was running and I am not certain whether this led to the error. I have also checked for disk space issues that could be related to the swap, and there is about 100 GB of free space.

        Given the above dataset size and specifications, is this delay expected or should I try something different?

        Thanks in advance for any associated information.

        Best wishes,
        Sotirios

         

        Last edit: Sotirios Vasileiadis 2016-06-09
        • Marc Strous

          Marc Strous - 2016-06-09

          Hi Sotirios,

          Thank you for trying metawatt and reporting this problem. I am aware
          that this module often malfunctions for larger datasets. Furthermore, I
          am currently unsatisfied with the performance of this module, which
          implements a series of heuristic tests and fixes to bins. I intend to
          fix the problem in the next release of metawatt, which is planned for later
          this year, time permitting.

          So, to make a long story short, at this moment I would strongly suggest
          simply disabling the optimization module and running your pipeline
          without it.

          Best wishes,

          Marc

          --
          Marc Strous
          CAIP Research Chair in Microbiology
          Energy Bioengineering Group
          Department of Geoscience
          University of Calgary
          www.ucalgary.ca/ebg
          tel: (403) 220 6604
          2500 University Drive NW
          Calgary, AB, Canada T2N 1N4
          Energy Environment and Experiential Learning (EEEL) Building
          Room 509

           
          • Sotirios Vasileiadis

            Hi Marc,

            I will follow your advice and get back to you in case of any issues... looking forward to the next version.

            Best wishes,
            Sotirios

             
