#4 Unable to Manage a 1.8 Gb Assembly - Huge Intermediary Files

Milestone: 1.0
Status: open
Owner: nobody
Labels: None
Updated: 2017-06-28
Created: 2017-06-27
Private: No

Hi There!

This isn't so much a bug report as feedback: I hit a wall binning a 1.8 Gb assembly with myCC. Not only did it run for over a week, it also created MASSIVE intermediate files that caused a kerfuffle with colleagues. It may be worth noting this limitation on your front page.

4.3G ./3_GetMatrix
1.4G ./2_GetFeature_4mer/Scf_layout
1.4G ./2_GetFeature_4mer/n_Scf_layout
2.7G ./2_GetFeature_4mer
4.3G ./3_GetMatrix_4mer
5.1M ./4_CLR_transformation_SNE/APout
1.1T ./4_CLR_transformation_SNE
1.4G ./2_GetFeature/Scf_layout
1.4G ./2_GetFeature/n_Scf_layout
2.7G ./2_GetFeature
883M ./7_AllClusters
19G ./5_ClusterCorrection
885M ./6_ClusterCtg
11M ./1_Data/output/hmmResults
11M ./1_Data/output/UCLUST
328K ./1_Data/output/temp
27M ./1_Data/output
6.6G ./1_Data
TOTAL 1.2T

Good luck,
Roli

Discussion

  • OliverLin

    OliverLin - 2017-06-28

    Thank you for this information. Yes, MyCC does require a lot of disk space and memory for affinity propagation. MyCC usually fails on an assembly this big. Do you have a lot of memory (over 1 TB of RAM? Cool!)? Anyway, thank you for sharing.

     
    • Roland Wilhelm

      Roland Wilhelm - 2017-06-28

      Hi Oliver,

      To clarify, it takes up more than 1 TB of hard drive space, not RAM.

      What is the largest assembly that myCC has handled (just for reference)?

      Thanks,

      • OliverLin

        OliverLin - 2017-06-28

        Hi Roli,
        It should take more than 1 TB of RAM. Please check the file size of "Similarities.txt" in 4_CLR_transformation_SNE. AP (affinity propagation) loads this file into memory, so if you did not have enough RAM, AP would not have run successfully.
        I tested MyCC with a 1 Gb assembly and it worked. However, it depends on how many contigs you have, not on how large the assembly is. For your binning, did you try MetaBAT? It works for large datasets.
        Thank you
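
The point about contig count rather than assembly size can be made concrete with a rough estimate. The sketch below assumes the similarity file holds one text line per ordered pair of distinct contigs, at about 30 bytes per line. Both figures are assumptions for illustration and are not confirmed anywhere in this thread; the function names are hypothetical as well.

```python
import os


def estimate_pairwise_bytes(n_contigs, bytes_per_pair=30):
    """Rough size of an all-pairs similarity file.

    Assumes one line per ordered pair of distinct contigs and an
    average of `bytes_per_pair` bytes per line (both are assumptions).
    """
    return n_contigs * (n_contigs - 1) * bytes_per_pair


def human(n_bytes):
    """Format a byte count with binary units, similar to `du -h`."""
    size = float(n_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


def file_size(path):
    """Actual size in bytes of a file such as Similarities.txt."""
    return os.path.getsize(path)


if __name__ == "__main__":
    # Illustrative contig counts only, not taken from the ticket.
    for n in (10_000, 50_000, 200_000):
        print(n, human(estimate_pairwise_bytes(n)))
```

Under these assumptions the estimate grows with the square of the contig count, so halving the number of contigs cuts the file to roughly a quarter. Comparing the estimate with `file_size(...)` on the real file would show how far the assumed bytes per line is off.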

         
        • Roland Wilhelm

          Roland Wilhelm - 2017-06-28

          Thanks,

          I'll try with a higher cut-off for contig length.

          R

  • OliverLin

    OliverLin - 2017-06-28

    Yes, please set a higher cut-off for contig length.
    Additionally, you can lower -lt or -st.
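
For readers who want to apply a length cut-off before binning, here is a minimal standalone sketch that drops short contigs from a FASTA file. It is not part of MyCC, and the thread does not say how MyCC applies its own cut-off; the function names, the script name and the threshold in the usage comment are all hypothetical.

```python
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len):
    """Keep only records whose sequence has at least `min_len` bases."""
    for header, seq in records:
        if len(seq) >= min_len:
            yield header, seq


def write_fasta(records, handle, width=60):
    """Write records as FASTA, wrapping sequence lines at `width`."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


# Hypothetical usage: python filter_contigs.py contigs.fa filtered.fa 2500
if __name__ == "__main__" and len(sys.argv) == 4:
    src, dst, min_len = sys.argv[1], sys.argv[2], int(sys.argv[3])
    with open(src) as fin, open(dst, "w") as fout:
        write_fasta(filter_by_length(read_fasta(fin), min_len), fout)
```

The records are streamed one at a time, so the script itself needs little memory even on a large assembly. Counting the records that survive at a few candidate thresholds is a cheap way to pick a cut-off before starting a long run.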

     
