
Input file limit

Help
2008-06-09
2013-05-20
  • Nobody/Anonymous

    Hi!
    Your program is fantastic.
    Is it possible to input a FASTA file with more sequence than the limit reported by MUMmer? I will reach that limit soon.

    # reading input file "input.fsa" of length 137431579
    # construct suffix tree for sequence of length 137431579
    # (maximum input length is 536870908)

    Thank you!!

    Alberto de Luis.

     
    • Nobody/Anonymous

      Sorry, that is the current maximum. To save space, MUMmer handles memory carefully, and the code limits it to suffix trees of 536870908 bp. A simple solution is to run comparisons a single chromosome at a time, or to break very large sequences into multiple subsequences and then take the union of the resulting matches.
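
The splitting idea above could be sketched in Python roughly as follows. This is only an illustration, not part of MUMmer: the names `split_sequence` and `to_original`, the chunk length, and the overlap are all assumptions. The overlap is there because a match that crosses a chunk boundary would otherwise be cut; with overlapping chunks, duplicate matches have to be removed when taking the union.

```python
from typing import Iterator, Tuple

# Limit reported in the MUMmer log earlier in this thread.
MAX_TEXTLEN = 536870908


def split_sequence(seq: str, chunk_len: int, overlap: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (offset, chunk) pairs that together cover seq.

    offset is the 0-based start of the chunk in the original sequence.
    Hypothetical helper for illustration; chunk_len should stay below
    the limit MUMmer reports (MAX_TEXTLEN here).
    """
    if chunk_len <= 0 or not 0 <= overlap < chunk_len:
        raise ValueError("need chunk_len > 0 and 0 <= overlap < chunk_len")
    step = chunk_len - overlap
    start = 0
    while start < len(seq):
        yield start, seq[start:start + chunk_len]
        if start + chunk_len >= len(seq):
            break  # this chunk already reaches the end of the sequence
        start += step


def to_original(offset: int, local_pos: int) -> int:
    """Map a 1-based position inside a chunk back to a 1-based
    position in the original, unsplit sequence."""
    return offset + local_pos


if __name__ == "__main__":
    # Toy example; real chunks would be hundreds of Mbp long.
    for offset, chunk in split_sequence("ACGTACGTAC", chunk_len=4, overlap=1):
        print(offset, chunk)
```

Each chunk would be written to its own FASTA record, run through mummer separately, and the reported positions shifted back with `to_original` before merging the match lists.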

       
  • Nobody/Anonymous

    Hi,

    I have basically the same problem, using MUMmer 3.20 as part of the mugsy package. One of my organisms
    has 1.3 Gb assembled in several contigs. This results in the following error:

    mugsy_x86-64-v1r2.1/MUMmer3.20/mummer: suffix tree construction failed: textlen=1310126550 larger than maximal textlen=536870908

    I have a 130GB workstation, however. Shouldn't this be sufficient for the suffix tree?

     
  • Nobody/Anonymous

    Me again - I mean, I do understand that the limit is imposed by the code. But shouldn't it be possible to change some hard-coded values to raise the limit…?

     
  • Adam Phillippy

    Adam Phillippy - 2011-05-30

    If you recompile the package like so:

    > make clean
    > make CPPFLAGS="-O3 -DSIXTYFOURBITS"

    you should have no more problems with space limitations.

    Best,
    -Adam

     
  • Nobody/Anonymous

    thanks a lot adam!

    looks good so far.

    best
    alex

     
  • Nobody/Anonymous

    Hey,

    After doing this, this is what I get, and the program just freezes.

    weill % ./mummer -mum -s -n -l 20 -L ../../public_html/seqs/E.coliK12.fasta ../../public_html/seqs/E.coliUMN026.fasta>mums
    # reading input file "../../public_html/seqs/E.coliK12.fasta" of length 4686137
    # construct suffix tree for sequence of length 4686137
    # (maximum reference length is 1073741820)
    # (maximum query length is 4294967295)
    # process 46861 characters per dot

     
