#260 int variable overflow in gatekeeper Hash

gatekeeper
closed-wont-fix
None
5
2015-02-03
2013-10-25
Bin
No

Hello,

I got an error in gatekeeper when using the CA (Celera Assembler) integrated in the MaSuRCA-2.0.3.1 assembler.

gatekeeper: AS_UTL_Hash.C:558: int LookupInHashTable_AS(HashTable_AS, uint64, uint32, uint64, uint32*): Assertion `bucket <= (int)table->numBuckets' failed.

I located the source code in "src/AS_UTL/AS_UTL_Hash.C" and added a print statement just before the failing assertion.

assert(bucket >= 0);
if ( bucket > (int)table->numBuckets ) fprintf(stderr, "ERROR_LARGE ::: bucket %d : numBuckets %d\n", bucket, (int)table->numBuckets);
assert(bucket <= (int)table->numBuckets);

Then I got an error message like this:

ERROR_LARGE ::: bucket 242171185 : numBuckets -2147483648

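I wonder whether the int variable numBuckets could be overflowing, because the hash uses incremental memory allocation. The negative number -2147483648 is exactly the minimum value of a 32-bit signed int; its 64-bit two's complement representation is shown below, next to the largest signed 64-bit value.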

64-bits
1111111111111111111111111111111110000000000000000000000000000000 -> -2147483648
0111111111111111111111111111111111111111111111111111111111111111 -> 9223372036854775807

9223372036854775807 is the maximum value of a signed 64-bit integer (a long on a 64-bit Linux system); an int there is still 32 bits.
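The suspected overflow can be reproduced in isolation. The sketch below is not the wgs-assembler code: DemoTable and the three functions are hypothetical names, and the uint32 type of numBuckets is an assumption based on the (int) cast in the assertion. It also assumes the usual gcc/clang behaviour on Linux-amd64, where converting an out-of-range unsigned value to int wraps around (the C standard leaves this implementation-defined).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for HashTable_AS; only the field used in the
   failing assertion is modeled, and its uint32 type is an assumption. */
typedef struct {
  uint32_t numBuckets;
} DemoTable;

/* The comparison as written in the ticket. Once numBuckets exceeds
   INT_MAX, the cast yields a negative int on common platforms. */
int check_with_int_cast(const DemoTable *t, int bucket) {
  return bucket <= (int)t->numBuckets;
}

/* The same comparison in signed 64-bit arithmetic, which can hold
   every uint32 value without changing sign. */
int check_with_wide_compare(const DemoTable *t, int bucket) {
  return (int64_t)bucket <= (int64_t)t->numBuckets;
}

/* Replays the numbers from the report, assuming a table of 2^31 buckets. */
void demo_report_values(void) {
  DemoTable t = { 2147483648u };                   /* 2^31, one past INT_MAX */
  assert(!check_with_int_cast(&t, 242171185));     /* the ticket's check fails */
  assert(check_with_wide_compare(&t, 242171185));  /* the wide check passes */
}
```

Calling demo_report_values() from a main() exercises both comparisons with the bucket value from the error message. Whether a wider comparison alone would be enough in the real code is not something this sketch can show, since other parts of the hash table may also store bucket counts in 32-bit variables.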

The data I used is about 2 billion reads for a maize assembly, including HiSeq (paired-end and mate-pair), MiSeq and 454, on a 64-bit CentOS system (CA on Linux-amd64).

Any suggestions on how to solve this?
Should the int variable be changed to long in a future release?

Bin

Discussion

  • Sebastian Gornik

    Hi,

    I have the exact same problem:

    gatekeeper: AS_UTL_Hash.C:559: int LookupInHashTable_AS(HashTable_AS, uint64, uint32, uint64, uint32): Assertion `bucket <= (int)table->numBuckets' failed.

    *Failure message: gatekeeper failed

    Due to my lack of scripting knowledge I have absolutely no idea what is wrong! I am running MaSuRCA on a supercomputer cluster (VLSCI). With the provided test data MaSuRCA works; with more than 600 million reads, error-corrected within MaSuRCA, runCA fails with the above message.

    Help would be very much appreciated!

    Seb

     
  • Brian Walenz

    Brian Walenz - 2015-02-03
    • status: open --> closed-wont-fix
     
  • Brian Walenz

    Brian Walenz - 2015-02-03

    [Cleaning up old tickets]

    Unfortunately, the version of wgs-assembler included in masurca (even the latest, 2.3.2b) is ancient, version 6.1 from April 2010. I don't think that wgs-assembler version can handle more than 1 billion reads (if even that many). The limit was increased to 2 billion in wgs-assembler 7.0 (January 2012).

     
