|
From: Umut T. <umu...@gm...> - 2012-08-17 07:39:55
|
As a follow-up to my previous post: I made a mistake in my list settings, so I cannot reply to your messages directly. My answers to those who kindly replied are below ;-)

@ John,

Here is the information you requested:

$ sed /SwapFree/q </proc/meminfo
MemTotal: 2048124 kB
MemFree: 172108 kB
Buffers: 254732 kB
Cached: 875340 kB
SwapCached: 5688 kB
Active: 884744 kB
Inactive: 713332 kB
Active(anon): 339448 kB
Inactive(anon): 273668 kB
Active(file): 545296 kB
Inactive(file): 439664 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 1992700 kB
SwapFree: 1946540 kB

A large file, one for which valgrind hangs, is 14.7MB in size.

@ Milian,

I am aware that programs can run far slower under valgrind, but I guess that is not the issue at the moment, at least here. Here is sample output from the time command for a normal run (without valgrind) on an input that makes valgrind hang:

47740

real 0m1.341s
user 0m1.160s
sys 0m0.148s

As a side note, the first number is the number of rows/cols of the matrix. So, given these numbers, I would expect to get a result under valgrind in 2-3 minutes.

Best regards,
Umut
|
From: John R. <jr...@bi...> - 2012-08-17 14:06:04
|
> $ sed /SwapFree/q </proc/meminfo
>
> MemTotal: 2048124 kB
> SwapTotal: 1992700 kB
> SwapFree: 1946540 kB

So the box has 2GB of RAM and almost 2GB of Swap. But we still don't know if this is a 32-bit i686 or a 64-bit x86_64 (or something else!) Which is it?

> A large file, which for instance valgrind hangs, is a file of size 14.7MB
>
> 47740
>
> real 0m1.341s
> user 0m1.160s
> sys 0m0.148s
>
> As a side note, the first number is the number or rows/cols of the matrix.

So (rows * cols) is 47740**2 which is more than 2G. Already it won't fit on a 32-bit machine with even 'short' entries (each 16-bit: 2 byte). How many bytes per entry? And then memcheck adds its overhead, which in difficult cases is another 1.25 times (total factor of 2.25).

When valgrind hangs, what does "ps -alx | grep my_app" (in another window) say in the VSZ (virtual size) and RSS (resident set size) columns?

There *is* a problem about size. Please tell us everything you know about the address space requirements of this app. It's very difficult for us to help you when you don't tell us the basic info.

--
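The dense-storage arithmetic in the message above can be checked with a short sketch. It assumes a square matrix stored densely with 2-byte entries, as the message does; the variable names are illustrative, and the 2.25 factor is simply the worst-case memcheck figure quoted above.

```python
# Dense-storage estimate for an n x n matrix (assumption: every entry is stored).
n = 47740                         # rows == cols, from the posted output
entries = n * n                   # total number of entries
bytes_per_entry = 2               # 'short' entries: 16 bits each
dense_bytes = entries * bytes_per_entry

address_space_32bit = 2**32       # upper bound on a 32-bit address space
memcheck_factor = 2.25            # worst-case total factor quoted in the message

print(f"entries:            {entries:,}")
print(f"more than 2**31:    {entries > 2**31}")
print(f"dense size (bytes): {dense_bytes:,}")
print(f"fits in 32 bits:    {dense_bytes <= address_space_32bit}")
print(f"under memcheck:     {dense_bytes * memcheck_factor / 2**30:.1f} GiB")
```

Even at two bytes per entry, a dense layout would exceed a 32-bit address space before any memcheck overhead is added, which is why the entry size and the storage format matter here.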
|
From: Umut T. <umu...@gm...> - 2012-08-17 14:47:15
Attachments:
leak_log_
|
On 08/17/2012 04:06 PM, John Reiser wrote:
>> $ sed /SwapFree/q</proc/meminfo
>>
>> MemTotal: 2048124 kB
>> SwapTotal: 1992700 kB
>> SwapFree: 1946540 kB
> So the box has 2GB of RAM and almost 2GB of Swap.
> But we still don't know if this is a 32-bit i686 or a 64-bit x86_64
> (or something else!) Which is it?

Hi John,

It is an x86_64, and here is the 'uname -a' output:

Linux dutw689 3.0.0-21-generic #35-Ubuntu SMP Fri May 25 17:57:41 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

> So (rows * cols) is 47740**2 which is more than 2G. Already it won't fit
> on a 32-bit machine with even 'short' entries (each 16-bit: 2 byte).

Good catch. It is a sparse matrix, in which only the non-zero (nnz) entries are kept. The matrix is also symmetric, so only the upper triangular part is stored. With these considerations in mind, the nnz element count is 631996, and the entries are 'double's.

The matrix is stored in CSR format, one of the common sparse storage formats alongside CSC. You can see the details of this format at http://en.wikipedia.org/wiki/Sparse_matrix. Basically, one has to keep three arrays in memory: row pointers, column indices and values.

Row pointers: an integer array of size 47741, int[47741]
Col indices: an integer array of size 631996 (nnz of upper triangular), int[631996]
Value array: an array of doubles, double[631996]

These are the basic memory requirements. However, there might also be some other requirements due to the use of matrix libraries, which are heavily dependent on the STL, I guess. The above is the user input necessary to run the code. I hope that, up to this point, it is clear and sufficient.

> How many bytes per entry? And then memcheck adds its overhead,
> which in difficult cases is another 1.25 times (total factor of 2.25).

OK, that is good to know.

> When valgrind hangs, what does "ps -alx | grep my_app" (in another window)
> say in the VSZ (virtual size) and RSS (resident set size) columns?

There is already a process running on our computational cluster, which normally should have finished by now; the log file written up to this point is attached. What I get with

ps aux | grep wsmp_class_test

on the cluster is

utabak 8484 99.9 0.2 525724 432772 ? R 14:41 119:33 valgrind -v --leak-check=full --track-origins=yes --log-file=leak_log_ /home/utabak/C++/numericTests/linear_systems/wsmp_class_test /home/utabak/C++/numericTests/linear_systems/Kaa_365695.bin
utabak 9388 1.0 0.0 103232 920 pts/0 S+ 16:40 0:00 grep wsmp_class_test

However, I am still reading about VSZ and RSS; excuse me if that info is not in the lines above. Let me know if further info is required.

Best,
Umut
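As a rough cross-check of the CSR figures in the message above, the three arrays can be sized as follows. This is only a sketch: the 4-byte `int` and 8-byte `double` sizes are assumptions (typical for x86_64 Linux, but not stated in the thread), and any extra memory used by the matrix libraries is not counted.

```python
# Memory footprint of the three CSR arrays described in the message.
n_rows = 47740          # matrix dimension, from the posted output
nnz = 631996            # stored non-zeros (upper triangular part only)

SIZEOF_INT = 4          # assumption: 32-bit int
SIZEOF_DOUBLE = 8       # assumption: IEEE 754 double

row_ptr_bytes = (n_rows + 1) * SIZEOF_INT   # int[47741]
col_idx_bytes = nnz * SIZEOF_INT            # int[631996]
values_bytes = nnz * SIZEOF_DOUBLE          # double[631996]
total_bytes = row_ptr_bytes + col_idx_bytes + values_bytes

print(f"row pointers:   {row_ptr_bytes:>10,} bytes")
print(f"column indices: {col_idx_bytes:>10,} bytes")
print(f"values:         {values_bytes:>10,} bytes")
print(f"total:          {total_bytes:>10,} bytes ({total_bytes / 2**20:.1f} MiB)")
```

Under these assumptions the input matrix itself needs only a few megabytes, so it is far too small to explain memory pressure on its own; whatever the solver allocates on top of it is a separate question.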
|
From: John R. <jr...@bi...> - 2012-08-17 15:49:37
|
>> So the box has 2GB of RAM and almost 2GB of Swap.

> It is an x86_64 and here is the 'uname -a' output
>
> Linux dutw689 3.0.0-21-generic #35-Ubuntu SMP Fri May 25 17:57:41 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

>> So (rows * cols) is 47740**2 which is more than 2G.

> Careful catch, it is a sparse matrix where only non-zero(nnz) entries are kept. And the matrix is symmetric ...
> so basically, one has to keep three arrays in memory, row pointers, column indices and values.
>
> Row pointers: is an integer array of size 47741, int[47741]
> Col indices: is an integer array of size 631996(nnz of upper triangular), int[631996]
> Value array: is an array of doubles, double[631996]

Thank you! That info is important for estimating RAM requirements.

> utabak 8484 99.9 0.2 525724 432772 ? R 14:41 119:33 valgrind -v --leak-check=full --track-origins=yes --log-file=leak_log_ /home/utabak/C++/numericTests/linear_systems/wsmp_class_test /home/utabak/C++/numericTests/linear_systems/Kaa_365695.bin

That says VSZ = 525MB of virtual space and RSS = 432MB of actual RAM touched recently. These are 1/4 to 1/5 of physical RAM, which is perhaps OK. The ratio 432/525 is somewhat close to 1.0. This might be an indication of memory access patterns that could cause many cache misses, but perhaps not.

OK, with that data we now have some basis for believing that memcheck isn't exhausting the gross physical resources of the machine. So, we start looking for something else.

Try to rule out "too many system calls" by running with "valgrind --trace-syscalls=yes ..." (pay attention only to the most recent 20 or so system calls), and/or in a separate text window run "vmstat 5" and watch the cpu.sy percentage. Also watch for any dramatic change in other columns between valgrind running and valgrind not running.

--
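The VSZ and RSS reading above can be reproduced with a small sketch that splits the quoted `ps aux` line into its columns. It assumes the usual `ps aux` column order (USER, PID, %CPU, %MEM, VSZ, RSS, ...) with sizes reported in kB, and it compares against the MemTotal value posted earlier in the thread; whether that value also describes the cluster node where `ps` was run is not stated, so treat the last two ratios as illustrative.

```python
# The `ps aux` line quoted in the thread, split across two string literals.
ps_line = (
    "utabak 8484 99.9 0.2 525724 432772 ? R 14:41 119:33 "
    "valgrind -v --leak-check=full --track-origins=yes --log-file=leak_log_ "
    "/home/utabak/C++/numericTests/linear_systems/wsmp_class_test "
    "/home/utabak/C++/numericTests/linear_systems/Kaa_365695.bin"
)

# Assumed column order: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
fields = ps_line.split(None, 10)
vsz_kb = int(fields[4])           # virtual size, in kB
rss_kb = int(fields[5])           # resident set size, in kB

mem_total_kb = 2048124            # MemTotal from the /proc/meminfo post (assumed same box)

print(f"VSZ: {vsz_kb / 1000:.0f} MB, RSS: {rss_kb / 1000:.0f} MB")
print(f"RSS / VSZ:      {rss_kb / vsz_kb:.2f}")
print(f"VSZ / MemTotal: {vsz_kb / mem_total_kb:.2f}")
print(f"RSS / MemTotal: {rss_kb / mem_total_kb:.2f}")
```

The two MemTotal ratios land between roughly 1/5 and 1/4, which matches the estimate given in the message.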
|
From: Umut T. <umu...@gm...> - 2012-08-17 16:30:33
|
On 08/17/2012 05:50 PM, John Reiser wrote:
> That says VSZ= 525MB of virtual space and RSS= 432MB of actual RAM touched recently.
> These are 1/4 to 1/5 of physical RAM, which is perhaps OK. The ratio 432/525
> is somewhat close to 1.0. This might be an indication of memory access patterns
> that could cause many cache misses, but perhaps not.
>
> OK, with that data we now have some basis for believing that memcheck
> isn't exhausting the gross physical resources of the machine.
> So, we start looking for something else.
>
> Try to rule out "too many system calls" by running with "valgrind --trace-syscalls=yes ..."
> (pay attention only to the most recent 20 or so system calls),

Dear John,

After some initialization, what I get in the log file with

valgrind -v --leak-check=full --track-origins=yes --trace-syscalls=yes --log-file=leak_log_ ./wsmp_class_test Kaa_365695.bin

is at http://pastebin.com/n60pN1LL

I had to stop the process with Ctrl+C after it had run for a while, to limit the file size.

> and/or in a separate text window run "vmstat 5" and watch the cpu.sy percentage.
> Also watch for any dramatic change in other columns between valgrind running and
> valgrind not running.

I will try this at home, since I have to leave now. I will let you know the outcome if the above result is not enough.

Best,
Umut
|
From: John R. <jr...@bi...> - 2012-08-17 19:29:00
|
> http://pastebin.com/n60pN1LL

The line

SYSCALL[11763,1]( 0) sys_read ( 4, 0x90b2e34, 38346752 )

appears many many times. A read() of 38MB is done over and over and over again. That's expensive. A read()-from-a-device is a write-to-memory, so memcheck must mark those 38MB as valid, over and over and over again.

--
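One way to spot such repetition in a long --trace-syscalls=yes log is to count identical syscall lines. The sketch below is a hypothetical helper, not part of valgrind; its regular expression is written only against the line format quoted above, and the sample input is that single line repeated three times purely for illustration.

```python
import re
from collections import Counter

# Matches the start of a syscall line as quoted in the thread, e.g.
#   SYSCALL[11763,1]( 0) sys_read ( 4, 0x90b2e34, 38346752 )
SYSCALL_RE = re.compile(r"SYSCALL\[\d+,\d+\]\(\s*\d+\)\s+(sys_\w+)\s*\(([^)]*)\)")

def count_syscalls(lines):
    """Count how often each (syscall name, argument tuple) pair occurs."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.search(line)
        if match:
            name = match.group(1)
            args = tuple(a.strip() for a in match.group(2).split(",") if a.strip())
            counts[(name, args)] += 1
    return counts

# Illustrative input: the quoted line, repeated three times.
sample = ["SYSCALL[11763,1]( 0) sys_read ( 4, 0x90b2e34, 38346752 )"] * 3

counts = count_syscalls(sample)
for (name, args), n in counts.most_common(5):
    print(f"{n:6d} x {name}({', '.join(args)})")
```

For a real log, passing an open file object instead of `sample` would work the same way, since the function only iterates over lines.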
|
From: Philippe W. <phi...@sk...> - 2012-08-18 12:15:50
|
On Fri, 2012-08-17 at 12:29 -0700, John Reiser wrote:
> > http://pastebin.com/n60pN1LL
>
> The line
>
> SYSCALL[11763,1]( 0) sys_read ( 4, 0x90b2e34, 38346752 )
>
> appears many many times. A read() of 38MB is done over and over and over again.
> That's expensive. A read()-from-a-device is a write-to-memory,
> so memcheck must mark those 38MB as valid, over and over and over again.

I am not sure the read syscall is successful. Here is a successful read:

SYSCALL[11763,1]( 0) sys_read ( 4, 0x7fee9a740, 1458176 ) --> [async] ...
SYSCALL[11763,1]( 0) ... [async] --> Success(0x0:0x164000)

but then there is a bunch of lines like:

SYSCALL[11763,1]( 0) sys_read ( 4, 0x90b2e34, 38346752 ) --> [async] ...
SYSCALL[11763,1]( 15) sys_rt_sigreturn ( ) --> [pre-success] NoWriteResult

without any indication of a successful result. It is not very clear what is happening.

If you run the same thing with 3.8.0, then when it hangs you might take a look with vgdb at what your application is doing, either from the command line:

vgdb v.info scheduler

or in gdb:

gdb
target remote | vgdb
... and then backtrace and similar

Alternatively, using

-v -v -v -d -d -d --trace-syscalls=yes --trace-signals=yes

might shed some more light.

Philippe
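The distinction drawn above, between reads that report a result and reads that do not, can be checked mechanically. The sketch below pairs each sys_read line with the next Success line, if any. It is a simplification built on two assumptions: that the second hex field in `Success(0x0:0x164000)` is the syscall's return value (consistent with 0x164000 equalling the 1458176 bytes requested), and that no other syscall's result appears between a read and its completion. The function name is hypothetical.

```python
import re

# Patterns written only against the trace lines quoted in the thread.
READ_RE = re.compile(r"sys_read\s*\(\s*\d+\s*,\s*0x[0-9a-fA-F]+\s*,\s*(\d+)\s*\)")
SUCCESS_RE = re.compile(r"Success\(0x[0-9a-fA-F]+:0x([0-9a-fA-F]+)\)")

def pair_reads(lines):
    """Return (requested_bytes, returned_bytes or None) for each sys_read."""
    results = []
    pending = None                      # byte count of a read awaiting its result
    for line in lines:
        read = READ_RE.search(line)
        if read:
            if pending is not None:     # previous read never reported a result
                results.append((pending, None))
            pending = int(read.group(1))
            continue
        ok = SUCCESS_RE.search(line)
        if ok and pending is not None:
            results.append((pending, int(ok.group(1), 16)))
            pending = None
    if pending is not None:
        results.append((pending, None))
    return results

# The four trace lines quoted in the message above.
sample = [
    "SYSCALL[11763,1]( 0) sys_read ( 4, 0x7fee9a740, 1458176 ) --> [async] ...",
    "SYSCALL[11763,1]( 0) ... [async] --> Success(0x0:0x164000)",
    "SYSCALL[11763,1]( 0) sys_read ( 4, 0x90b2e34, 38346752 ) --> [async] ...",
    "SYSCALL[11763,1]( 15) sys_rt_sigreturn ( ) --> [pre-success] NoWriteResult",
]

for requested, returned in pair_reads(sample):
    status = "no result logged" if returned is None else f"returned {returned} bytes"
    print(f"read of {requested} bytes: {status}")
```

On the quoted lines, the first read is paired with a result equal to the requested size, while the 38MB read is followed only by sys_rt_sigreturn and has no logged result, which is the pattern the message points out.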