From: Guanglin X. <gua...@an...> - 2012-08-09 21:38:45
Hi Gene,

Sorry for my late reply; I was on vacation as well.

I would like to post a fix now. It works well on my computer.

Comments:
1. I reuse some code from dmtcp/src/util_exec.cpp. In the end I use
   personality(), the cleanest way to force legacy_va_layout, because the
   bug described in void dmtcp::Util::adjustRlimitStack() doesn't occur on
   my computer.
2. I use some glibc functions, because I think these routines have nothing
   to do with checkpointing itself. Hence, there should be no compatibility
   problems.

It's my honor to contribute to the DMTCP project. Please give more comments.

Thanks.
Guanglin

garin@ubuntu:~/workspace$ diff dmtcp-1.2.5/mtcp/mtcp.c mtcp/mtcp.c
78a79
> #include <sys/personality.h>
647a649,708
>
> /*****************************************************************************
>  *
>  * This routine restarts the process, with the same argv, environ...
>  *
>  *****************************************************************************/
> static void restart_self() {
> #define MAX_ARGS 500
>
>   char *argv[MAX_ARGS+1];
>   FILE* fp;
>   if (NULL == (fp= fopen("/proc/self/cmdline", "r"))) {
>     MTCP_PRINTF("error openning /proc/self/cmdline\n");
>   }
>
>   char strings[10001] = "\0";
>   int num_read = fread(strings, 1, 1023, fp);
>   fclose(fp);
>
>   char *str = strings;
>   int i;
>   for (i = 0; str - strings < num_read && i < MAX_ARGS; i++) {
>     argv[i] = str;
>     while (*str++ != '\0')
>       ;
>   }
>   argv[i] = NULL;
>
>   execv(argv[0], argv); /* should never return */
> }
>
> /*****************************************************************************
>  *
>  * This routine guarantees the process using ADDR_COMPAT_LAYOUT by restarting.
>  *
>  * We need it because MTCP can't handle address confliction between mapped
>  * libraries in checkpointed process and [vdso] in mtcp_restart. So we decide
>  * to exploit the 2 different memory mapping layout.
>  * Hence, mtcp_restart maps its [vdso] to lower area while checkpointed process
>  * maps its libraries to higher area.
>  *
>  *****************************************************************************/
> static void set_compat_memory_layout() {
>   int pers = personality(0xffffffffUL); /* get current personality */
>   if (!(pers & ADDR_COMPAT_LAYOUT)) { /* if no compat layout ... */
>     DPRINTF("no compat layout\n");
>
>     if (pers == personality(pers | ADDR_COMPAT_LAYOUT)) {
>       /* if successfully set personality with compat layout */
>       DPRINTF("restart with compat layout\n");
>       restart_self(); /* should never return */
>       MTCP_PRINTF("error restart process with compat layout\n");
>       exit(0);
>     } else {
>       MTCP_PRINTF("error set personality\n");
>       exit(0);
>     }
>   }
> }
>
671a733,734
>   set_compat_memory_layout();
>

> Thanks. That's a good catch. I had known that 'GNU make' forces a
> certain memory layout. I didn't realize that MTCP 'make check'
> was depending on that for its success. As you say, that is a bug in MTCP.
>
> I agree that when MTCP is running standalone, it should do something
> like enhancing mtcp_init in a manner similar to DMTCP.
>
> I'm going to be travelling soon, and won't have time to seriously fix
> this during the next two weeks. If you're interested in doing a first
> version of the fix, I promise to give you comments, and to add the
> fix to DMTCP (and to give you credit in a comment, if you would like
> that).
>
> Best wishes,
> - Gene
>
> On Wed, Aug 01, 2012 at 08:12:59PM -0400, Guanglin Xu wrote:
>> Hi Gene,
>>
>> I tried
>> > cd mtcp; make check
>>
>> and it works quite well. However, that is not the Makefile's doing.
>>
>> The secret is that GNU make-3.81 (maybe other versions do the same,
>> too) forces a legacy virtual memory layout on its children, where mmap
>> regions are higher.
>>
>> So when I tried GNU make-3.75, it failed occasionally. Moreover, our
>> Makefile script - testcase - did not detect this failure at all.
>>
>> Please give more comments:)
>>
>> Thanks.
>> Guanglin
>>
>>
>> > Hi Guanglin,
>> >
>> > I think only the first solution may work:
>> >> Alternative 1: Employ the same solution as DMTCP, by enhancing
>> >> mtcp_init().
>> >
>> > You can't get rid of the new vdso because the kernel will have created
>> > the restarted process in such a way that the process will look
>> > for the new vdso.
>> >
>> > Also, have you tried:
>> >   cd mtcp; make check
>> > It will interactively ask you to type something (with carriage return),
>> > and then restart.
>> > If that works, look at the Makefile to see why it worked.
>> >
>> > In the end, if you decide to enhance mtcp_init(), please give us a copy.
>> > If this is a bug on 32-bit MTCP (not DMTCP), we'd like to incorporate a
>> > fix.
>> > (As I say, 'make check' in MTCP works fine here on 32-bit Ubuntu 9.10.)
>> >
>> > Thanks,
>> > - Gene
>> >
>> > On Wed, Aug 01, 2012 at 12:18:52PM -0400, Guanglin Xu wrote:
>> >> Hi Gene,
>> >>
>> >> Thank you for your reply.
>> >>
>> >> In fact, DMTCP works quite well on my computer while MTCP doesn't.
>> >>
>> >> However, I think I have realized the cause of the problem after reading
>> >> http://lwn.net/Articles/91829/
>> >>
>> >> Here is my understanding:
>> >> (1) The policy for where [vdso] can be loaded is the same as for normal
>> >> shared libraries.
>> >> (2) A solution to avoid [vdso] conflicts is making sure that the memory
>> >> regions where shared libraries can be loaded are different between the
>> >> raw process and the restarting process, by exploiting 2 different
>> >> virtual address space layouts.
>> >> (3) Hence, DMTCP forces the raw process to load its libraries at
>> >> higher (>=0x40000000) addresses, while the restarting process is forced
>> >> to load its libraries at lower (>=0x110000) addresses (moreover, loads
>> >> [vdso] at 0x110000 ??) so as to avoid conflicts.
>> >> (4) The same solution (precisely, just the first half) hasn't been
>> >> employed by MTCP yet, so [vdso] conflicts with earlier libraries
>> >> occasionally on my computer.
>> >>
>> >> And my plan to fix MTCP:
>> >> Alternative 1: Employ the same solution as DMTCP, by enhancing
>> >> mtcp_init().
>> >> Alternative 2: Is it possible to get rid of the new [vdso] while
>> >> restarting? But I don't know why mtcp_restart turns on the new [vdso]
>> >> after a checkpoint.
>> >>
>> >> Do you think I am right? Please give comments:)
>> >>
>> >> Thanks,
>> >> Guanglin
>> >>
>> >>
>> >> > Hi Guanglin,
>> >> >     In your Ubuntu 10.04 (32-bit), does DMTCP always fail (even on
>> >> > small programs)? Are you using DMTCP or just MTCP or just libmtcp.so
>> >> > in isolation?
>> >> >     Locally, we have available a 32-bit Ubuntu 9.10 system
>> >> > (one release earlier than 10.04), and DMTCP does work on this. DMTCP
>> >> > was also tested on 32-bit Ubuntu 11.10 and works.
>> >> >     By the way, vdso is usually an issue only on 32-bit machines,
>> >> > due to the smaller address space. When DMTCP restarts, it must make
>> >> > sure that the new vdso (generated by the kernel) does not conflict
>> >> > with regions mapped prior to checkpoint.
>> >> >     One quick workaround that you could try would be to turn off ASLR.
>> >> > To do this, as root do:
>> >> >   echo 0 > /proc/sys/kernel/randomize_va_space
>> >> > A more drastic solution is to turn off vdso:
>> >> >   echo 0 > /proc/sys/vm/vdso_enabled
>> >> >
>> >> >     If you can provide an unprivileged guest account (sent to me ---
>> >> > not to the full list) with your 32-bit Ubuntu 10.04 setup, I'd be
>> >> > happy to look at your machine and give you a more detailed diagnosis
>> >> > on how to make MTCP work without changing your proc system parameters.
>> >> >
>> >> > Best wishes,
>> >> > - Gene
>> >> >
>> >> > On Mon, Jul 30, 2012 at 07:31:52PM -0400, Guanglin Xu wrote:
>> >> >> Hi,
>> >> >> I am really interested in libmtcp.so, but now I have trouble with it.
>> >> >>
>> >> >> According to the comments from mtcp.c,
>> >> >> --"We must also keep the new vdso segment, provided by mtcp_restart.",
>> >> >> but what if a restored library wants to occupy this memory region?
>> >> >>
>> >> >> Unfortunately, on my Ubuntu 10.04 32bit, mtcp_restart always maps
>> >> >> [vdso] at 0x110000 (btw, how do you disable ASLR here?), where
>> >> >> conflicts occur very frequently and cause a segmentation fault.
>> >> >>
>> >> >> I think it is a bug in mtcp. Can you fix it? Or give comments so I
>> >> >> can do it?
>> >> >>
>> >> >> thank you.
>> >> >>
>> >> >> Guanglin
>> >> >>
>> >> >>
>> >> >> I'd love to give 2 memory maps here:
>> >> >> garin@ubuntu:~/workspace/testmtcpsimple_pthread/Release$ cat /proc/`pidof testmtcp3`/maps
>> >> >> 00110000-00263000 r-xp 00000000 07:00 1569891 /lib/tls/i686/cmov/libc-2.11.1.so
>> >> >> 00263000-00265000 r--p 00153000 07:00 1569891 /lib/tls/i686/cmov/libc-2.11.1.so
>> >> >> 00265000-00266000 rw-p 00155000 07:00 1569891 /lib/tls/i686/cmov/libc-2.11.1.so
>> >> >> 00266000-00269000 rw-p 00000000 00:00 0
>> >> >> 00269000-0028d000 r-xp 00000000 07:00 1569902 /lib/tls/i686/cmov/libm-2.11.1.so
>> >> >> 0028d000-0028e000 r--p 00023000 07:00 1569902 /lib/tls/i686/cmov/libm-2.11.1.so
>> >> >> 0028e000-0028f000 rw-p 00024000 07:00 1569902 /lib/tls/i686/cmov/libm-2.11.1.so
>> >> >> 00297000-00298000 r-xp 00000000 00:00 0 [vdso]
>> >> >> 005f9000-0060e000 r-xp 00000000 07:00 1569842 /lib/tls/i686/cmov/libpthread-2.11.1.so
>> >> >> 0060e000-0060f000 r--p 00014000 07:00 1569842 /lib/tls/i686/cmov/libpthread-2.11.1.so
>> >> >> 0060f000-00610000 rw-p 00015000 07:00 1569842 /lib/tls/i686/cmov/libpthread-2.11.1.so
>> >> >> 00610000-00612000 rw-p 00000000 00:00 0
>> >> >> 00a39000-00a54000 r-xp 00000000 07:00 1580341 /lib/ld-2.11.1.so
>> >> >> 00a54000-00a55000 r--p 0001a000 07:00 1580341 /lib/ld-2.11.1.so
>> >> >> 00a55000-00a56000 rw-p 0001b000 07:00 1580341 /lib/ld-2.11.1.so
>> >> >> 00b6d000-00b86000 r-xp 00000000 07:00 1448440 /home/garin/testground/dmtcp-1.2.5/mtcp/libmtcp.so.1.0.0
>> >> >> 00b86000-00b87000 r--p 00018000 07:00 1448440 /home/garin/testground/dmtcp-1.2.5/mtcp/libmtcp.so.1.0.0
>> >> >> 00b87000-00b88000 rw-p 00019000 07:00 1448440 /home/garin/testground/dmtcp-1.2.5/mtcp/libmtcp.so.1.0.0
>> >> >> 00b88000-00b95000 rw-p 00000000 00:00 0
>> >> >> 00dac000-00dae000 r-xp 00000000 07:00 1569825 /lib/tls/i686/cmov/libdl-2.11.1.so
>> >> >> 00dae000-00daf000 r--p 00001000 07:00 1569825 /lib/tls/i686/cmov/libdl-2.11.1.so
>> >> >> 00daf000-00db0000 rw-p 00002000 07:00 1569825 /lib/tls/i686/cmov/libdl-2.11.1.so
>> >> >> 08048000-08049000 r-xp 00000000 07:00 1446125 /home/garin/testground/dmtcp-1.2.5/mtcp/testmtcp3
>> >> >> 08049000-0804a000 r--p 00000000 07:00 1446125 /home/garin/testground/dmtcp-1.2.5/mtcp/testmtcp3
>> >> >> 0804a000-0804b000 rw-p 00001000 07:00 1446125 /home/garin/testground/dmtcp-1.2.5/mtcp/testmtcp3
>> >> >> 0804b000-0804c000 rw-p 00000000 00:00 0
>> >> >> 08936000-08957000 rw-p 00000000 00:00 0 [heap]
>> >> >> b576c000-b576d000 ---p 00000000 00:00 0
>> >> >> b576d000-b5f6d000 rw-p 00000000 00:00 0
>> >> >> b5f6d000-b5f6e000 ---p 00000000 00:00 0
>> >> >> b5f6e000-b676e000 rw-p 00000000 00:00 0
>> >> >> b676e000-b676f000 ---p 00000000 00:00 0
>> >> >> b676f000-b6f6f000 rw-p 00000000 00:00 0
>> >> >> b6f6f000-b6f70000 ---p 00000000 00:00 0
>> >> >> b6f70000-b7772000 rw-p 00000000 00:00 0
>> >> >> b7781000-b7783000 rw-p 00000000 00:00 0
>> >> >> bfed2000-bfee7000 rw-p 00000000 00:00 0 [stack]
>> >> >>
>> >> >> garin@ubuntu:~/workspace/testmtcpsimple_pthread/Release$ cat /proc/`pidof mtcp_restart`/maps
>> >> >> 00110000-00263000 r-xp 00000000 07:00 1569891 /lib/tls/i686/cmov/libc-2.11.1.so  ##before restores, here lies [vdso] section. so conflicts now!
>> >> >> 00263000-00265000 r--p 00153000 07:00 1569891 /lib/tls/i686/cmov/libc-2.11.1.so
>> >> >> 00265000-00266000 rw-p 00155000 07:00 1569891 /lib/tls/i686/cmov/libc-2.11.1.so
>> >> >> 00266000-00269000 rw-p 00000000 00:00 0
>> >> >> 00269000-0028d000 r-xp 00000000 07:00 1569902 /lib/tls/i686/cmov/libm-2.11.1.so
>> >> >> 0028d000-0028e000 r--p 00023000 07:00 1569902 /lib/tls/i686/cmov/libm-2.11.1.so
>> >> >> 0028e000-0028f000 rw-p 00024000 07:00 1569902 /lib/tls/i686/cmov/libm-2.11.1.so
>> >> >> 00297000-00298000 r-xp 00000000 00:00 0
>> >> >> 005f9000-0060e000 r-xp 00000000 07:00 1569842 /lib/tls/i686/cmov/libpthread-2.11.1.so
>> >> >> 0060e000-0060f000 r--p 00014000 07:00 1569842 /lib/tls/i686/cmov/libpthread-2.11.1.so
>> >> >> 0060f000-00610000 rw-p 00015000 07:00 1569842 /lib/tls/i686/cmov/libpthread-2.11.1.so
>> >> >> 00610000-00612000 rw-p 00000000 00:00 0
>> >> >> 00a39000-00a54000 r-xp 00000000 07:00 1580341 /lib/ld-2.11.1.so
>> >> >> 00a54000-00a55000 r--p 0001a000 07:00 1580341 /lib/ld-2.11.1.so
>> >> >> 00a55000-00a56000 rw-p 0001b000 07:00 1580341 /lib/ld-2.11.1.so
>> >> >> 00b6d000-00b95000 rwxp 00000000 00:00 0
>> >> >> 00dac000-00dae000 r-xp 00000000 07:00 1569825 /lib/tls/i686/cmov/libdl-2.11.1.so
>> >> >> 00dae000-00daf000 r--p 00001000 07:00 1569825 /lib/tls/i686/cmov/libdl-2.11.1.so
>> >> >> 00daf000-00db0000 rw-p 00002000 07:00 1569825 /lib/tls/i686/cmov/libdl-2.11.1.so
>> >> >> 08048000-08049000 r-xp 00000000 07:00 1446125 /home/garin/testground/dmtcp-1.2.5/mtcp/testmtcp3
>> >> >> 08049000-0804a000 r--p 00000000 07:00 1446125 /home/garin/testground/dmtcp-1.2.5/mtcp/testmtcp3
>> >> >> 0804a000-0804b000 rw-p 00001000 07:00 1446125 /home/garin/testground/dmtcp-1.2.5/mtcp/testmtcp3
>> >> >> 0804b000-0804c000 rw-p 00000000 00:00 0
>> >> >> 08936000-08957000 rw-p 00000000 00:00 0 [heap]
>> >> >> b576c000-b576d000 ---p 00000000 00:00 0
>> >> >> b576d000-b5f6d000 rw-p 00000000 00:00 0
>> >> >> b5f6d000-b5f6e000 ---p 00000000 00:00 0
>> >> >> b5f6e000-b676e000 rw-p 00000000 00:00 0
>> >> >> b676e000-b676f000 ---p 00000000 00:00 0
>> >> >> b676f000-b6f6f000 rw-p 00000000 00:00 0
>> >> >> b6f6f000-b6f70000 ---p 00000000 00:00 0
>> >> >> b6f70000-b7772000 rw-p 00000000 00:00 0
>> >> >> b7781000-b7783000 rw-p 00000000 00:00 0
>> >> >> bfce4000-bfee7000 rw-p 00000000 00:00 0
>> >> >>
>> >> >> ...
>> >> >> [7591] mtcp.c:2172 checkpointhread: everything resumed
>> >> >> [7591] mtcp.c:3511 stopthisthread: tid 7615 returning to 0x110400
>> >> >> [7591] mtcp.c:3488 stopthisthread: thread 7616 resuming
>> >> >> [Segmentation fault
>> >> >>
>> >> >> _______________________________________________
>> >> >> Dmtcp-forum mailing list
>> >> >> Dmt...@li...
>> >> >> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
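
For readers who want to try the idea from the patch at the top of this thread outside MTCP, here is a minimal standalone sketch in C. It is not the MTCP code: the names split_cmdline(), reexec_self(), ensure_compat_layout() and the CMDLINE_MAX bound are hypothetical, chosen only for this illustration. It also differs from the posted patch in a few ways, all of which are assumptions about what a more defensive version might do: it stops if /proc/self/cmdline cannot be opened, it reads up to the full buffer size, and it re-executes /proc/self/exe rather than argv[0] (execv() does not search PATH).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/personality.h>

#define MAX_ARGS 500
#define CMDLINE_MAX 65536 /* assumed upper bound, chosen for this sketch */

/* Split a block of NUL-terminated strings (the /proc/self/cmdline format)
 * into an argv-style array. buf must hold at least len + 1 bytes with
 * buf[len] == '\0'. Stores at most max_args pointers plus a terminating
 * NULL and returns the number of arguments stored. */
int split_cmdline(char *buf, size_t len, char **argv, int max_args)
{
    int argc = 0;
    size_t pos = 0;

    assert(buf != NULL && argv != NULL && max_args >= 0);
    while (pos < len && argc < max_args) {
        argv[argc++] = buf + pos;
        pos += strlen(buf + pos) + 1; /* skip the string and its NUL */
    }
    argv[argc] = NULL;
    return argc;
}

/* Re-execute the current program with its original arguments.
 * Only returns (with -1) if something went wrong. */
int reexec_self(void)
{
    static char buf[CMDLINE_MAX + 1];
    char *argv[MAX_ARGS + 1];
    size_t n;
    FILE *fp = fopen("/proc/self/cmdline", "r");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, CMDLINE_MAX, fp);
    fclose(fp);
    buf[n] = '\0'; /* guarantee termination for split_cmdline() */

    if (split_cmdline(buf, n, argv, MAX_ARGS) == 0)
        return -1;
    execv("/proc/self/exe", argv); /* the personality survives the exec */
    return -1;
}

/* Returns 0 if ADDR_COMPAT_LAYOUT is already active, -1 on error.
 * Otherwise sets the flag and re-executes, so it does not return. */
int ensure_compat_layout(void)
{
    int pers = personality(0xffffffffUL); /* query only, changes nothing */

    if (pers == -1)
        return -1;
    if (pers & ADDR_COMPAT_LAYOUT)
        return 0;
    if (personality((unsigned long)pers | ADDR_COMPAT_LAYOUT) == -1)
        return -1;
    return reexec_self();
}
```

A program would call ensure_compat_layout() as the first statement of main() and treat -1 as "could not switch layouts". After the re-exec the flag is already set, so the second pass returns 0 and there is no exec loop. Some sandboxed environments may refuse the personality() change, in which case the function simply reports -1.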