#86 X11 - 100% cpu in child process (hang) after Xinitscr()

Kevin Lamonte

I never encountered this until running on a multi-core system.

My application calls Xinitscr() and then, in its main thread, calls init_pair(), which never returns. The child process spawned by XCursesInitscr() sits at 100% CPU. Attaching to the child process shows it stuck in _XtWaitForSomething() inside XtAppMainLoop(). The parent process is blocked at the read() in XC_read_socket().

Inserting a sleep(2) at the beginning of _setup_curses() in pdcx11.c works around the issue on this machine. Manually stepping through _setup_curses() in the parent process under gdb also makes the hang non-reproducible.
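To illustrate the parent/child layout described above, here is a minimal POSIX-only sketch of a parent that sends a request over a socket pair and then blocks in read() until the child answers. It is not the PDCurses code: the function names, the use of socketpair(), and the "request + 1" reply are assumptions made for illustration. If the child never reaches its write(), as in the hang reported here, the parent stays blocked in read() indefinitely.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the X11 child process: it answers exactly one
 * request and exits. In the reported hang, the real child never replies. */
static void child_serve(int fd)
{
    int request, reply;

    if (read(fd, &request, sizeof request) != (ssize_t)sizeof request)
        _exit(1);
    reply = request + 1;
    if (write(fd, &reply, sizeof reply) != (ssize_t)sizeof reply)
        _exit(1);
    _exit(0);
}

/* Parent side: send a request, then block in read() until the child replies.
 * Returns the child's reply, or -1 on any error. */
int request_reply_demo(int request)
{
    int sv[2];
    int reply = -1;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        child_serve(sv[1]);     /* does not return */
    }

    close(sv[1]);
    if (write(sv[0], &request, sizeof request) != (ssize_t)sizeof request ||
        read(sv[0], &reply, sizeof reply) != (ssize_t)sizeof reply)
        reply = -1;             /* the read() is where a stuck child shows up */

    close(sv[0]);
    waitpid(pid, NULL, 0);
    return reply;
}
```

In this sketch the parent has no timeout, which mirrors the symptom in the report: a child that spins without replying leaves the parent waiting in read().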

This looks very similar to https://bugs.freedesktop.org/show_bug.cgi?id=809 .


  • Happy to see this report; it has a lot of interesting info.

    I'd say this is not related to single vs. multi-core: I can reproduce the very same problem on a single-core system. That system is a Debian GNU/Linux testing box (running xorg 7.7).

    Interestingly, I could not reproduce the problem on a Debian GNU/Linux stable (squeeze) box (running xorg 7.5), so this may be related to recent changes in xorg.

    By the way, I tried sleep(1) and it also seems to work on my box. Since this is a workaround rather than a full fix, using sleep(2) should not be a burden and will probably hide the problem on more systems.

  • I can finally reproduce this myself, since I upgraded from OS X 10.6 to 10.8.

  • Replace XtAppMainLoop() call.

  • Here's something interesting... I figured the ultimate way to kill this was to eliminate any possible race conditions by changing to a single-process model (something I wanted to do anyway). As the very first baby step in that direction, I replaced the call to XtAppMainLoop() with the canonical implementation of that function (as seen at http://www.cs.cf.ac.uk/Dave/X_lecture/node16.html for example) -- and the problem went away, for me. I'd be interested to hear if this works for others. Patch attached as event-patch.diff.

    • assigned_to: nobody --> wmcbrine
    • status: open --> open-accepted
  • Another point -- if I change to the code shown here: http://www.spinics.net/lists/xorg/msg52062.html -- the version that uses XtAppProcessEvent() -- the hang returns. Follow-ups suggest this may actually be the way XtAppMainLoop() is implemented in X.org now (I don't have source handy to check).

  • I've gone ahead and added this patch to CVS for now.