#36 Crash when opening Bookmarks window

next
closed
None
5
2023-03-04
2021-02-08
jhf2442
No

Since worker/4.6.0 (?), worker either crashes when the bookmarks window is opened through a shortkey, or hangs and has to be killed with xkill.
Error message on the console:
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
Abort

Using strace:
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=35922, si_uid=20325, si_status=SIGABRT, si_utime=12, si_stime=13} ---
rt_sigreturn({mask=[INT CHLD]}) = -1 EINTR (Interrupted system call)
wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGABRT}], WNOHANG|WSTOPPED, {ru_utime={tv_sec=0, tv_usec=132005}, ru_stime={tv_sec=0, tv_usec=147153}, ...}) = 35922
wait4(-1, 0x7ffd4d510750, WNOHANG|WSTOPPED, 0x7ffd4d510760) = -1 ECHILD (No child processes)
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(18, "Abort\n", 6Abort
) = 6

Is there any way to get more debugging output?

Discussion

  • Ralf Hoffmann

    Ralf Hoffmann - 2021-02-08

    Hi,

    Worker uses asynchronous background tasks to check whether bookmark entries exist. I can understand a hang if the path to check is on a file system that is temporarily unavailable (like an NFS mount), but it should not crash. Do you have an entry in your bookmarks that is not a regular local file (either on a mounted file system, or a special file such as a pipe or block device)?
    You can run Worker with gdb to get more info:
    gdb worker
    (gdb) run
    ....
    (gdb) bt

     
  • jhf2442

    jhf2442 - 2021-03-04

    Hi Ralf,

    Sorry, it took quite a long time to get worker to crash. I thought it would never fail when running inside gdb :-)

    Here's the stack trace. Do you want me to select a particular frame to get more information? I kept gdb open, so I can retrieve any information you need.

    Thread 1 "worker" received signal SIGABRT, Aborted.
    0x00007ffff5258387 in raise () from /lib64/libc.so.6
    (gdb) where
    #0  0x00007ffff5258387 in raise () from /lib64/libc.so.6
    #1  0x00007ffff5259a78 in abort () from /lib64/libc.so.6
    #2  0x00007ffff5b991d5 in __gnu_cxx::__verbose_terminate_handler () at ../../.././libstdc++-v3/libsupc++/vterminate.cc:95
    #3  0x00007ffff5b96fa6 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:47
    #4  0x00007ffff5b96ff1 in std::terminate () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:57
    #5  0x00007ffff5b97289 in __cxxabiv1::__cxa_rethrow () at ../../.././libstdc++-v3/libsupc++/eh_throw.cc:131
    #6  0x00000000004bd483 in std::future<std::result_of<std::decay<DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&)::{lambda()#1}>::type ()>::type> std::async<DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&)::{lambda()#1}>(std::launch, std::decay&&, (std::decay<DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&)::{lambda()#1}>::type&&)...) ()
    #7  0x00000000004bca91 in DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&) ()
    #8  0x00000000004b8c79 in DirBookmarkUI::DirBookmarkUI(Worker&, AGUIX&, BookmarkDBProxy&)::{lambda(FieldListView*, int, int, FieldListView::on_demand_data&)#1}::operator()(FieldListView*, int, int, FieldListView::on_demand_data&) const ()
    #9  0x00000000004bd5a6 in std::_Function_handler<void (FieldListView*, int, int, FieldListView::on_demand_data&), DirBookmarkUI::DirBookmarkUI(Worker&, AGUIX&, BookmarkDBProxy&)::{lambda(FieldListView*, int, int, FieldListView::on_demand_data&)#1}>::_M_invoke(std::_Any_data const&, FieldListView*&&, int&&, FieldListView*&&, FieldListView::on_demand_data&) ()
    #10 0x000000000077bfd8 in std::function<void (FieldListView*, int, int, FieldListView::on_demand_data&)>::operator()(FieldListView*, int, int, FieldListView::on_demand_data&) const ()
    #11 0x000000000076e1d0 in FieldListView::updateOnDemandData() ()
    #12 0x000000000076e34e in FieldListView::redrawContent() ()
    #13 0x0000000000777028 in FieldListView::resetWin(bool) ()
    #14 0x00000000007702d3 in FieldListView::resize(int, int) ()
    #15 0x000000000077580b in FieldListView::maximizeY() ()
    #16 0x00000000004bc557 in DirBookmarkUI::maximizeWin() ()
    #17 0x00000000004b9ea7 in DirBookmarkUI::mainLoop() ()
    #18 0x00000000004b6772 in DirBookmarkOp::run(WPUContext*, ActionMessage*) ()
    #19 0x000000000072cbd2 in WPUContext::next(ActionMessage*) ()
    #20 0x00000000006f3499 in Worker::interpret(std::vector<std::shared_ptr<FunctionProto>, std::allocator<std::shared_ptr<FunctionProto> > > const&, ActionMessage*) ()
    #21 0x00000000006f3f67 in Worker::activateShortkey(agmessage*) ()
    #22 0x00000000006ed4a0 in Worker::run() ()
    #23 0x00000000006ee80f in main ()
    
     
  • jhf2442

    jhf2442 - 2021-03-04

    Regarding your question: yes, I have many NFS-mounted paths in my bookmarks, and several of them are not always mounted. Until now worker displayed them with a strikethrough, which was fine. There was also no delay, as the directory simply did not exist (the bookmark points to a subdirectory of the main mount point).

     
  • jhf2442

    jhf2442 - 2021-03-05

    Unfortunately:
    1) I overlooked the IT maintenance window this evening -> all hosts will be rebooted and the gdb session will be lost
    2) the worker executable has no debug information (stripped executable?)
    I could only extract the following info prior to the reboot; I assume it is of little use.

    (gdb) frame 6
    #6  0x00000000004bd483 in std::future<std::result_of<std::decay<DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&)::{lambda()#1}>::type ()>::type> std::async<DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&)::{lambda()#1}>(std::launch, std::decay&&, (std::decay<DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&)::{lambda()#1}>::type&&)...) ()
    (gdb) frame 7
    #7  0x00000000004bca91 in DirBookmarkUI::updateLVData(int, int, FieldListView::on_demand_data&) ()
    (gdb) frame 8
    #8  0x00000000004b8c79 in DirBookmarkUI::DirBookmarkUI(Worker&, AGUIX&, BookmarkDBProxy&)::{lambda(FieldListView*, int, int, FieldListView::on_demand_data&)#1}::operator()(FieldListView*, int, int, FieldListView::on_demand_data&) const ()
    (gdb) frame 9
    #9  0x00000000004bd5a6 in std::_Function_handler<void (FieldListView*, int, int, FieldListView::on_demand_data&), DirBookmarkUI::DirBookmarkUI(Worker&, AGUIX&, BookmarkDBProxy&)::{lambda(FieldListView*, int, int, FieldListView::on_demand_data&)#1}>::_M_invoke(std::_Any_data const&, FieldListView*&&, int&&, FieldListView*&&, FieldListView::on_demand_data&) ()
    (gdb) frame 10
    #10 0x000000000077bfd8 in std::function<void (FieldListView*, int, int, FieldListView::on_demand_data&)>::operator()(FieldListView*, int, int, FieldListView::on_demand_data&) const ()
    

    I will relaunch worker on Monday using a non-stripped executable, and hope that the crash reproduces in a shorter time frame.

     
  • Ralf Hoffmann

    Ralf Hoffmann - 2021-03-07

    Hi,

    Thanks for the debug output. I'm now able to reproduce the issue. It looks like there are too many async jobs open, so eventually the exception "Resource temporarily unavailable" is thrown. I think there must be at least one entry for which the existence test blocks more or less forever.
    I will look into ways to implement a workaround.
    Maybe, for verification, you could run worker within gdb, open the bookmark dialog a couple of times, and scroll through all the items to trigger the existence test. Then interrupt the execution within gdb and check how many threads exist (info threads). Normally, there should be only around 10. If you see a significant number of additional threads, you can switch to any of them (thread xyz) and do a backtrace there (bt). My guess is that it hangs in some system call (probably stat).
    Of course, having debugging symbols helps a lot.
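
For readers following the diagnosis: the C++ standard specifies that std::async called with std::launch::async throws std::system_error (error condition resource_unavailable_try_again, i.e. "Resource temporarily unavailable") when no new thread can be started. If nothing catches it, std::terminate is called, which matches the abort seen in the backtrace. The sketch below shows one way a caller could catch that case instead of aborting. It is not Worker's code; the names path_exists and start_exists_check are made up for illustration.

```cpp
#include <future>
#include <string>
#include <system_error>
#include <sys/stat.h>

// Hypothetical helper (not from Worker): true if stat() succeeds on the path.
// On an unreachable network mount this call may block for a long time.
static bool path_exists(const std::string &path)
{
    struct stat st;
    return ::stat(path.c_str(), &st) == 0;
}

// Hypothetical wrapper: start the existence check in a background thread.
// If the system cannot start another thread, return a default-constructed
// future (valid() == false) instead of letting the exception escape.
static std::future<bool> start_exists_check(const std::string &path)
{
    try {
        return std::async(std::launch::async, path_exists, path);
    } catch (const std::system_error &e) {
        if (e.code() == std::errc::resource_unavailable_try_again) {
            return std::future<bool>();  // caller treats this as "unknown"
        }
        throw;  // any other error is still propagated
    }
}
```

A caller would test valid() on the returned future before calling get(), and could display the entry as "not checked" when no thread was available.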

     
  • Ralf Hoffmann

    Ralf Hoffmann - 2021-03-07
    • status: open --> accepted
    • assigned_to: Ralf Hoffmann
     
  • jhf2442

    jhf2442 - 2021-03-08

    OK, running again, but no errors reported so far (v4.7.0).
    Did I understand correctly that the issue should vanish if I reduce the number of bookmarks?

     
  • Ralf Hoffmann

    Ralf Hoffmann - 2021-03-09

    No, it's not the number of bookmarks. There must be some entries for which stat() never returns, so more and more background jobs stay pending until the thread limit is reached. Since the blocking stat is uninterruptible, there is not much I can do. I will add a job limit, so eventually Worker will block noticeably until the stat returns. This usually happens with mounted network devices that are unavailable. I have reproduced the issue with an sshfs-mounted file system: after disconnecting from the network, stat() blocks uninterruptibly until the network comes up again. I will also add an option to disable the existence test so Worker can skip those directories.
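
The job-limit idea described above can be sketched roughly as follows. This is only an illustration under assumptions, not the fix that went into Worker: the class name BoundedJobs and its interface are invented here. Finished jobs are dropped from the pending list, and once the limit is reached the caller blocks on the oldest job instead of starting yet another thread.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>
#include <utility>

// Hypothetical sketch of a job limit; not Worker's actual implementation.
class BoundedJobs {
public:
    // A limit of 0 is treated as 1 so that submit() can always make progress.
    explicit BoundedJobs(std::size_t limit) : m_limit(limit == 0 ? 1 : limit) {}

    // Run the job in a background thread. If m_limit jobs are still running,
    // block until the oldest one has finished before starting the new one.
    void submit(std::function<void()> job)
    {
        prune();
        while (m_pending.size() >= m_limit) {
            m_pending.front().wait();  // blocks, e.g. while stat() hangs
            m_pending.pop_front();
        }
        m_pending.push_back(std::async(std::launch::async, std::move(job)));
    }

    // Number of jobs that have not finished yet.
    std::size_t pending()
    {
        prune();
        return m_pending.size();
    }

private:
    // Drop all jobs that have already finished, without blocking.
    void prune()
    {
        for (auto it = m_pending.begin(); it != m_pending.end();) {
            if (it->wait_for(std::chrono::seconds(0)) == std::future_status::ready)
                it = m_pending.erase(it);
            else
                ++it;
        }
    }

    std::size_t m_limit;
    std::deque<std::future<void>> m_pending;
};
```

Note that futures returned by std::async block in their destructor until the job has finished, so destroying a BoundedJobs object waits for all pending jobs. With a stat() that never returns, this design trades the crash for a visible hang, which is the behavior described in the comment above.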

     
  • jhf2442

    jhf2442 - 2021-04-17

    Hi Ralf,

    I think we can close this topic. I couldn't reproduce the crash using 4.7.0, and you have provided a fix in 4.8.0 anyway (no crashes seen there so far either).

     
  • Ralf Hoffmann

    Ralf Hoffmann - 2021-07-03
    • status: accepted --> closed
     
