|
From: Bob H. <rsh...@gm...> - 2011-04-02 00:12:26
|
Wow, got back to this thread this afternoon and I see mucho discussion.

Early this morning I tried a few things with my test program and confirmed (to the extent such is possible) that the problem results from the apparent extra blocks allocated in valgrind's support of my program's repeated reallocs.

Bart suggested that I try --freelist-vol=0, but upon doing so I did not see any difference (will try again to make sure).

Philippe suggested I try a patch ( https://bugs.kde.org/show_bug.cgi?id=250101 ). I will look into how much effort is involved. I haven't previously built valgrind from source. Sounds like it is in SVN (which I have used for my own stuff). Probably easy to build, but I have a couple of other ideas to try before I go that route.

I'm now going to focus on my original program, and I'm not planning to bother folks here with that until I have something useful to post. But I will give you a brief overview of why the program has that allocation pattern, point you at the source code (only if you are curious), and then mention one idea I have to address that usage pattern.

The program is a (fairly popular) DNA sequence aligner called lastz, written by me, which you can find here: http://www.bx.psu.edu/~rsharris/lastz/newer

The latest version is 1.02.37, and I assume it still exhibits the problem, but not in the normal build case. I discovered the problem last week (which technically predates the release) during routine testing on a large-memory build that I have recently added to the source but have not yet made available in the distributed Makefile. If anyone is curious and wants to build it for the configuration that fails, let me know. (I certainly don't expect this, and this is why I didn't post any specifics about my real program in the first post.)

Anyway, the reason for its allocation pattern is that I was loading in the entire human genome, chromosome by chromosome.
m14 in the demo program corresponds to one large string of ACGT (mostly), which will hold the concatenation of all the chromosomes. chr1 is about 250M, chr2 is a little smaller, and so on. Since the program doesn't know the overall size a priori, it reallocs the block as each chromosome "arrives". Because of the way the genome file is stored, the realloc pattern gets the biggest stuff first, and then at the end adds lots of little pieces (the genome file contains more than the usual 23 chromosomes, with a lot of little "chrumbs").

The later-allocated 12G block will hold an index of the positions of all the 12-letter words in the genome. Its size is also related to the overall size of the genome, but at the point that I allocate it, I know the size of the genome.

The usual build of the program, the build that people have been using for years, has a much smaller limit on m14. Typically the largest would be a single chromosome, so 250M for that, and the index is also smaller. The program was originally designed to fit in a 2G footprint.

Now, what I intend to try, to work around the problem, is to predict the overall size and do a single allocation (not exactly rocket science, I know). Depending on the input file format, I can derive this from the file before I start reading chromosomes. Or I can allow the user to estimate it. For my immediate purpose, the user (me) will provide it.

This won't change the fact that I still need to decide whether I really have a problem. I think I probably do. To refresh your memory, the real program didn't fail without valgrind, but segfaulted with it (see the first post if you are curious). I don't think it's likely that the real program's failure is due to me failing to check for a NULL, as it was in the test case (another refresher: my program is supposed to check every allocation and commit suicide on any failure).
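For readers following along, here is a minimal C sketch of the single-allocation idea described above, with the NULL check kept on every (re)allocation. It is not lastz code: the `seqbuf` type and the `seqbuf_reserve` and `seqbuf_append` functions are hypothetical names made up for illustration, and the size estimate is assumed to come from the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for the concatenated sequence.
   All names here are illustrative; none are taken from lastz. */
typedef struct {
    char  *data;  /* concatenated sequence bytes */
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
} seqbuf;

/* Make sure at least `cap` bytes are allocated.
   Returns 0 on success, -1 on failure. On failure the old block
   is left intact, so the caller can report the error and exit. */
static int seqbuf_reserve(seqbuf *b, size_t cap)
{
    assert(b != NULL);
    if (cap <= b->cap)
        return 0;                  /* already big enough, no realloc */
    char *p = realloc(b->data, cap);
    if (p == NULL)
        return -1;                 /* do not overwrite b->data with NULL */
    b->data = p;
    b->cap  = cap;
    return 0;
}

/* Append n bytes. If the caller reserved the predicted total up
   front, this never reallocates; otherwise it grows as needed. */
static int seqbuf_append(seqbuf *b, const char *src, size_t n)
{
    assert(b != NULL);
    if (n == 0)
        return 0;
    if (n > SIZE_MAX - b->len)
        return -1;                 /* size arithmetic would overflow */
    if (seqbuf_reserve(b, b->len + n) != 0)
        return -1;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

The intended use would be to call `seqbuf_reserve` once with the predicted genome size before reading the first chromosome, then call `seqbuf_append` per chromosome. If the estimate is too small, the append path still falls back to realloc, so a wrong estimate costs efficiency rather than correctness.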
The more I think about it, the more I suspect that the reallocation problem is a red herring and that there is some other problem with my code in the larger-memory build. My hope is that by allocating m14 in one call I will expose that hypothesized error.

Thanks again for the attention this has gotten!

Bob H |
|
From: WAROQUIERS P. <phi...@eu...> - 2011-04-04 07:49:07
|
> Bart suggest that I try --freelist-vol=0, but upon doing so I did not
> see any difference (will try again to make sure).

This is normal: it makes (almost) no difference, as the freelist-vol default value is small (20 MB), and the problem is rather related to the way Valgrind handles realloc.

> Phillippe suggested I try a patch (
> https://bugs.kde.org/show_bug.cgi?id=250101 ). I will look into how
> much effort is involved. I haven't previously built valgrind from
> source. Sounds like it is in SVN (which I have used for my own
> stuff). Probably easy to build but I have a couple other ideas to try
> before I will go that route.

The patch solves the problem on your small program: it runs to the end without (re-)alloc problems and without memory leaks. But for sure, a single allocation is much more efficient and less problematic.

Philippe

____

This message and any files transmitted with it are legally privileged and intended for the sole use of the individual(s) or entity to whom they are addressed. If you are not the intended recipient, please notify the sender by reply and delete the message and any attachments from your system. Any unauthorised use or disclosure of the content of this message is strictly prohibited and may be unlawful. Nothing in this e-mail message amounts to a contractual or legal commitment on the part of EUROCONTROL, unless it is confirmed by appropriately signed hard copy. Any views expressed in this message are those of the sender. |