#9 Segfault in hash_vector()

Status: open
Owner: nobody
Labels: None
Priority: 7
Updated: 2011-07-25
Created: 2011-06-04
Creator: Zamin Iqbal
Private: No

I've just been trying a little test program modelled on the first example in your README (but ported
to a different context). I'm using cmph-1.1.

Here is the backtrace from gdb:

(gdb) bt
#0 0x00007ffff7939d14 in hash_vector () from /home/zam/bin/lib/libcmph.so.0
#1 0x00007ffff7949305 in brz_bmz8_search () from /home/zam/bin/lib/libcmph.so.0
#2 0x00007ffff794966d in brz_search () from /home/zam/bin/lib/libcmph.so.0
#3 0x00007ffff793d75b in cmph_search () from /home/zam/bin/lib/libcmph.so.0
#4 0x0000000000402007 in get_object_name_and_add_to_big_array (node=0x7ffff43b4dd0) at src/cortex_var/many_colours/hash_collection.c:161
#5 0x000000000040ac8a in iterate_my_objects (f=0x7fffffff0c64, hash_table=0x63f0b0) at src/hash_table/open_hash/hash_table.c:211
#6 0x0000000000401e9f in my_function (colour_to_populate=0, colour_in_dbg=0, psh=0x11e903200, fancy_sup_hash=0x11e802f20, path_nodes=0x11e802f60, path_orientations=0x11e8167f0,
path_labels=0x11e820440, supernode_str=0x11e82a090 "ACGGAGCAGGTCAAAACTCCCGTGCTGATCAGTAGTGGGATCGCGCCTGTGAATAGCCACTGCACTCCAGCCTGAGCAACATAGCGAGACCCCGTCTCTT", max_length=10000)
at src/cortex_var/many_colours/hash_collection.c:167
#7 0x0000000000403d86 in main (argc=14, argv=0x7fffffffe528) at src/cortex_var/many_colours/cortex_var.c:796

The program looks like this:

<<<<<<<<<<<< point A (referred to later)

FILE* mphf_fd = fopen("temp.mph", "w");
// Source of keys
cmph_io_adapter_t *source = cmph_io_vector_adapter((char **)vector, nkeys);

//Create minimal perfect hash function using the brz algorithm.
cmph_config_t *config = cmph_config_new(source);
cmph_config_set_algo(config, CMPH_BRZ);
cmph_config_set_mphf_fd(config, mphf_fd);
cmph_t *hash = cmph_new(config);
cmph_config_destroy(config);
cmph_dump(hash, mphf_fd);
cmph_destroy(hash);
fclose(mphf_fd);//hash saved for future use

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< point B (referred to later)

mphf_fd = fopen("temp.mph", "r");
fancy_sup_hash = cmph_load(mphf_fd);

So far, just a copy of your example. The segfault happens soon after:

I call a function of my own

my_function(big_array, fancy_sup_hash, other data)

The idea is that I now have a minimal perfect hash that maps each key to a unique index.
I can then fill a big array of structs with the info I am interested in, where the
info relevant to a key goes in big_array[index], index being the hash value of that key.

If we look in my_function we see, partly in pseudocode:

1. iterate through all the objects I want to index and store
2. for each one:

name = get_key_of_object(object);
int index = cmph_search(fancy_sup_hash, name, (cmph_uint32)strlen(name));

big_array[index].member_data = blah
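
The lookup-and-store step above can be guarded so that a bad handle, a NULL key, or an out-of-range index is refused instead of dereferenced. The following is only a sketch under stated assumptions: record_t, lookup_fn, store_checked and demo_lookup are hypothetical names, and lookup_fn stands in for cmph_search so the fragment compiles without the library. It does not claim to explain the crash inside hash_vector().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element type of big_array. */
typedef struct { int member_data; } record_t;

/* Hypothetical stand-in with the same shape as cmph_search(hash, key, keylen). */
typedef unsigned int (*lookup_fn)(const void *hash, const char *key,
                                  unsigned int keylen);

/* Toy lookup used only for demonstration: the "index" is the key length. */
static unsigned int demo_lookup(const void *hash, const char *key,
                                unsigned int keylen)
{
    (void)hash;
    (void)key;
    return keylen;
}

/* Store value at the slot the lookup returns for name.
 * Returns 0 on success, -1 if any input is NULL or the index is
 * outside [0, nkeys). */
static int store_checked(record_t *big_array, size_t nkeys,
                         const void *hash, lookup_fn lookup,
                         const char *name, int value)
{
    if (big_array == NULL || hash == NULL || lookup == NULL || name == NULL)
        return -1;

    unsigned int index = lookup(hash, name, (unsigned int)strlen(name));
    if (index >= nkeys)
        return -1;

    big_array[index].member_data = value;
    return 0;
}
```

With the real library, the value returned by cmph_load would be passed as the handle and cmph_search as the lookup; checking the handle before the first search at least separates a failed load from a fault inside the search itself.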

Do you have any idea what can be going on?

One other thing: when I run it through valgrind, it throws lots of warnings in the section of code that is a direct copy of
your example (between points A and B, marked with <<<<<<<<< above):

==41969== Invalid read of size 8
==41969== at 0x5358E66: ??? (in /lib/libc-2.12.1.so)
==41969== by 0x50B9795: key_vector_read (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50C4C1C: brz_gen_mphf (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50C48BC: brz_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50BA3E4: cmph_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x403AF2: main (cortex_var.c:745)
==41969== Address 0x8c59f28 is 24 bytes inside a block of size 31 alloc'd
==41969== at 0x4C28177: malloc (vg_replace_malloc.c:195)
==41969== by 0x403887: main (cortex_var.c:700)
==41969==
==41969== Invalid read of size 8
==41969== at 0x5400430: ??? (in /lib/libc-2.12.1.so)
==41969== by 0x50B97E0: key_vector_read (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50C4C1C: brz_gen_mphf (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50C48BC: brz_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50BA3E4: cmph_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x403AF2: main (cortex_var.c:745)
==41969== Address 0x8c59f28 is 24 bytes inside a block of size 31 alloc'd
==41969== at 0x4C28177: malloc (vg_replace_malloc.c:195)
==41969== by 0x403887: main (cortex_var.c:700)
==41969==
==41969== Invalid read of size 8
==41969== at 0x5358E52: ??? (in /lib/libc-2.12.1.so)
==41969== by 0x50C51E2: brz_gen_mphf (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50C48BC: brz_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50BA3E4: cmph_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x403AF2: main (cortex_var.c:745)
==41969== Address 0x275b47888 is 8 bytes inside a block of size 10 alloc'd
==41969== at 0x4C2749B: calloc (vg_replace_malloc.c:418)
==41969== by 0x50C4466: brz_config_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50B9F6A: cmph_config_set_algo (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x403ACA: main (cortex_var.c:743)
==41969==
==41969== Invalid read of size 8
==41969== at 0x5358E52: ??? (in /lib/libc-2.12.1.so)
==41969== by 0x50C5477: brz_gen_mphf (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50C48BC: brz_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50BA3E4: cmph_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x403AF2: main (cortex_var.c:745)
==41969== Address 0x275b47888 is 8 bytes inside a block of size 10 alloc'd
==41969== at 0x4C2749B: calloc (vg_replace_malloc.c:418)
==41969== by 0x50C4466: brz_config_new (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x50B9F6A: cmph_config_set_algo (in /Net/fs1/home/zam/bin/lib/libcmph.so.0.0.0)
==41969== by 0x403ACA: main (cortex_var.c:743)
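
The numbers in the reports above can be checked by hand. An 8-byte read starting 24 bytes into a 31-byte block ends at byte 32, one byte past the block. An 8-byte read starting 8 bytes into a 10-byte block ends at byte 16, six bytes past it. A minimal sketch of that arithmetic (the helper name overrun_bytes is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes by which a read of read_size bytes, starting offset
 * bytes into a block of block_size bytes, runs past the end of the
 * block. Returns 0 when the read fits inside the block. */
static size_t overrun_bytes(size_t offset, size_t read_size, size_t block_size)
{
    size_t end = offset + read_size;
    return end > block_size ? end - block_size : 0;
}
```

Reads of this shape are often produced by string routines that process a whole word at a time, but the log alone does not establish whether that is what happens here or whether the key buffers are genuinely too short.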

Discussion

  • Zamin Iqbal - 2011-07-25

    • priority: 5 --> 7
     
  • Zamin Iqbal - 2011-07-25

    Any chance of a comment, please? Even if you think it's a bad bug report, or not a bug, it's blocking me and feedback would be helpful.
    Regards,
    Zam

     
  • Zamin Iqbal - 2011-11-08

    OK, I have a workaround for this bug. Replace this

    cmph_config_set_algo(config, CMPH_BRZ)

    with this

    cmph_config_set_algo(config, CMPH_CHD)

     
  • Fabiano C. Botelho

    Hi,
    Sorry for the very long delay :). I had not been paying attention to the bug tracker, but I will from now on. Yes, changing the algorithm works, but I could not reproduce this bug. Here is the hash_vector code:

    void hash_vector(hash_state_t *state, const char *key, cmph_uint32 keylen, cmph_uint32 * hashes)
    {
        switch (state->hashfunc)
        {
            case CMPH_HASH_JENKINS:
                jenkins_hash_vector_((jenkins_state_t *)state, key, keylen, hashes);
                break;
            default:
                assert(0);
        }
    }

    Would you be able to tell me where (which line) of that function the segfault happened?
    F.

     
  • Fabiano C. Botelho

    We had a few problems in the CMPH_BRZ implementation when the input set was too small (fewer than 10 keys). How big was your input set? I have fixed this problem and the fix will be in the next release. In the meantime, you can get it from the git repo on SF.
    F.
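
Until a release containing the fix is installed, one way to stay clear of the small-input problem described above is to pick the algorithm from the key count. This is a minimal sketch, not part of cmph: algo_choice_t, ALGO_BRZ, ALGO_CHD, BRZ_MIN_KEYS and choose_algo are hypothetical names, the enum stands in for the CMPH_BRZ and CMPH_CHD values from cmph.h, and the threshold of 10 is taken from the comment above rather than verified against the source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CMPH_BRZ and CMPH_CHD values in cmph.h. */
typedef enum { ALGO_BRZ, ALGO_CHD } algo_choice_t;

/* Assumed threshold, taken from the comment above ("less than 10 keys"). */
#define BRZ_MIN_KEYS 10u

/* Fall back to CHD for key sets below the assumed BRZ minimum. */
static algo_choice_t choose_algo(size_t nkeys)
{
    return nkeys < BRZ_MIN_KEYS ? ALGO_CHD : ALGO_BRZ;
}
```

In real code the result would be mapped to the matching cmph constant and passed to cmph_config_set_algo, which combines the CMPH_CHD workaround from earlier in this thread with the small-input explanation.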

     