OS3 Dopus5 crash

xenic
2014-04-14
2014-05-22
  • xenic

    xenic - 2014-04-14

    I just discovered a crash in the OS3 Dopus5 program. When you start some external programs with OS3 Dopus5 there is a crash in DO_LAUNCHER. Here is how I produced the crash:

    Select the Buttons/Edit menu item.
    After the "Button Bank Editor" window opens click on an empty button.
    Select Edit in the Button Bank editor.
    In the Button Editor enter a Name & Label.
    Select the "Edit Function Editor" gadget.
    In the Function Editor select the Add gadget.
    Change the function type to AmigaDOS.
    Enter "dopus5:C/viewfont {Qs}" in the string gadget.
    Select "Save" or "Use" to exit the editors.
    Select the button you just defined.
    Viewfont will open but a Grim Reaper appears with DO_LAUNCHER as the crashed process.

    Unfortunately, the OS4 grim reaper doesn't show a decent stack trace for OS3 programs.
    Maybe someone can perform a similar test and get a better idea of what is going wrong.

    I'm not sure when this bug started but I don't think it's caused by the 64 bit update.

     
  • kas1e

    kas1e - 2014-04-24

    @Xenic
    I found the problem: it's the debug printfs.

    You may remember that we had the same kind of crash in the OS3 version when wb.c did printfs with "lib_OpenCnt bumped to blablabla". Those are now commented out for OS3, with this warning:

    wb.c:711: warning: #warning D(bug) printf there cause crash on os3 build (and on real, and on emulation on os3/mos). TODO: invistigate and make a proper one
    

    So, we have the same problem in library/launcher.c too. Once I comment out the 2 "lib_OpenCnt bumped to" debug printfs in launcher.c and recompile the OS3 version of the library, the crash is gone. But then it crashes when we close the running program, this time on another printf in launcher.c:

    [launcher.c:312] lib_OpenCnt decreased to 0 by 'Dump of context at 0xEFD7A000
    

    Once I comment out that one too for OS3 (i.e. the debug printfs about "lib_OpenCnt decreased to"), the crashes are gone completely.

    What does it mean: our debug printfs there do something that causes crashes in the OS3 version, because they can't get the name of the program.

    But then, it may not be OS3-related only and may affect the debug printfs as a whole, because I tried the AOS4 version again to see what debug output it produces, and when I just click on the button I get:

    [launcher.c:698 launcher_proc] lib_OpenCnt bumped to 3 by 'execute T:dopus-B198D73F-tmp
    '
    

    So no crash, and 'execute T:blablabla' is picked up fine (while the OS3 version fails there and crashes).

    But then I close the calculator, and in the debug output I have:

    [launcher.c:312 launch_exit_code] lib_OpenCnt decreased to 2 by 'чґ╬Очґ╬Очґ╬Очґ╬Очґ╬Очґ╬Очґ╬Очґ╬Очґ╬Очґ╬Очґ╬О'
    

    To me that looks like a garbled buffer again, even on OS4. No crashes of course, but there is still something wrong with the debug printfs in the OS4 version too.
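The garbled name in the last log line is what a `%s` conversion typically prints when the pointer it receives does not refer to a valid, NUL-terminated string. A minimal sketch of a more defensive debug line follows. `format_open_count` and its parameters are hypothetical names, not functions from launcher.c, and the sketch only guards against a NULL or unterminated name; it cannot help if the name's memory has already been freed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a "lib_OpenCnt" debug line without trusting
 * the name pointer. A NULL name becomes "(unknown)", and at most max_name
 * characters are read, so a missing terminator cannot run past the limit. */
static int format_open_count(char *out, size_t out_size,
                             const char *verb, int count,
                             const char *name, int max_name)
{
    assert(out != NULL && out_size > 0 && verb != NULL && max_name >= 0);
    if (name == NULL)
        name = "(unknown)";
    /* "%.*s" stops at max_name characters or at the first NUL byte. */
    return snprintf(out, out_size, "lib_OpenCnt %s to %d by '%.*s'",
                    verb, count, max_name, name);
}
```

Copying the program name into a private buffer while it is still known to be valid, and printing from that copy, would be the natural companion to this helper.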

    Probably the easy way is to just remove all the debug output from the launcher for OS3 (as is done for wb.c now), and everything will work fine. But it's quite possible that our debug calls are wrong and need fixing. And basically, having debug output in those parts helps a lot: when I do tests, it's good to see when and what was called, what happens with the counters and so on.

    In addition to that problem with the debug printfs on OS3, there is another one. When we exit from the OS4 version of the program and the patches are removed, we get this output:

    [launcher.c:355 check_app_list] lib_OpenCnt decreased to 0
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 88)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 96)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 100)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 108)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 112)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 120)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 1 at LVO 108)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 1 at LVO 196)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 136)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 132)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 152)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 148)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 2 at LVO 112)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 2 at LVO 148)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 3 at LVO 300)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 1 at LVO 104)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 4 at LVO 588)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 0 at LVO 128)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 3 at LVO 240)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 3 at LVO 264)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 3 at LVO 260)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    [wb.c:2470 L_WB_Remove_Patch] waiting for the usecount (0) to be zero (patch type 1 at LVO 472)
    [wb.c:2473 L_WB_Remove_Patch] ok, trying to remove the patch
    

    So far everything looks correct. The OS4 version shows us the patch type, LVO offsets, etc., all OK and as it should be. But if we run the OS3 version and exit from it, the output is:

    [launcher.c:355] lib_OpenCnt decreased to 0
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    [wb.c:2470] waiting for the usecount (0) to be zero (patch type 0 at LVO 0)
    [wb.c:2473] ok, trying to remove the patch
    

    As you can see, both the patch type and the LVO offset are 0. That is wrong as well, and shows that our OS3 debug printfs definitely have problems.
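One possible explanation for the all-zero values, offered as an assumption that nobody in this thread has confirmed: on OS3, exec.library's RawDoFmt() treats a plain `%d` as a 16-bit word and needs `%ld` for a 32-bit value. If the OS3 debug output goes through a formatter with that behaviour and receives 32-bit arguments for `%d`, it would read the zero high halves of small numbers. The sketch below only simulates that effect on any platform; the helper names are hypothetical and nothing here is taken from the Dopus5 sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store 32-bit arguments as a big-endian byte stream, the way they would
 * sit in memory on a 68k machine. */
static size_t pack_args32(uint8_t *stream, const uint32_t *args, size_t n)
{
    assert(stream != NULL && args != NULL);
    for (size_t i = 0; i < n; i++) {
        stream[4 * i + 0] = (uint8_t)(args[i] >> 24);
        stream[4 * i + 1] = (uint8_t)(args[i] >> 16);
        stream[4 * i + 2] = (uint8_t)(args[i] >> 8);
        stream[4 * i + 3] = (uint8_t)(args[i]);
    }
    return 4 * n;
}

/* What a formatter sees when "%d" means "take the next 16-bit word". */
static uint16_t read_word16(const uint8_t *stream, size_t *pos)
{
    uint16_t v = (uint16_t)((stream[*pos] << 8) | stream[*pos + 1]);
    *pos += 2;
    return v;
}

/* What it sees when "%ld" means "take the next 32-bit long". */
static uint32_t read_long32(const uint8_t *stream, size_t *pos)
{
    uint32_t v = ((uint32_t)stream[*pos] << 24)
               | ((uint32_t)stream[*pos + 1] << 16)
               | ((uint32_t)stream[*pos + 2] << 8)
               | (uint32_t)stream[*pos + 3];
    *pos += 4;
    return v;
}
```

Packing the three values from one OS4 log line (usecount 0, patch type 1, LVO 196) and reading them back with three 16-bit reads gives 0, 0 and 0, while three 32-bit reads recover the original numbers. Whether this is what happens in the OS3 build would have to be checked against the actual debug macro.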

     
  • xenic

    xenic - 2014-04-24

    @kas1e
    We have crash problems with debug output in OS3 and bad numbers in some debug output in OS4. Either there is an OS4 problem with debug output or the multiple processes in Dopus5 are causing a problem. Let's try compiling the OS3 binary with "debug=no" and see if that fixes the crash. If that works, then you can add "debug=no" for the OS3 overnight compile at dopus5.org and we can focus on other problems for now. I'll do a test compile of OS3 with "debug=no" this morning to see if it works. You can do the same. If it works, then we can add it to the OS3 overnight compile and remember to use "debug=no" when we are compiling at home for testing.

     
  • kas1e

    kas1e - 2014-04-24

    @Xenic
    IMHO the faster and easier way is to just comment out a few printfs in launcher.c. No other changes are needed currently: no debug=no, no changes on the server, no need to remember how to compile, etc. Just disable them for OS3, with the same warning as I did in wb.c. When (and if) we are interested in dealing with it, we can uncomment them and play with it. We need debug output anyway, so it's better to have a warning about it (i.e. tomorrow I can just comment them out and all will be fine, no other changes needed).

    For the final 5.9 release we will build everything without debug anyway, so it's not a big deal.

    Probably later I can just fire up my WinUAE and test the OS3 version there to see what the debug output looks like, so we will know whether it is OS4's debug output that is broken or a general problem with our debug printfs.
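A minimal sketch of this approach: debug output stays in the source, but selected lines compile to nothing on one platform and leave a build warning behind. The `D()` wrapper name is borrowed from the `D(bug)` wording in the wb.c warning; `OS3_BUILD`, `log_line` and `launcher_bump` are hypothetical names, and the real macros in the Dopus5 sources may differ. `#warning` is a GCC extension here.

```c
#include <assert.h>
#include <stdio.h>

static int debug_lines_emitted = 0;   /* counts debug lines, for the demo */

/* Hypothetical debug sink; the real build would write to the debug console. */
static void log_line(const char *msg)
{
    assert(msg != NULL);
    debug_lines_emitted++;
    fprintf(stderr, "%s\n", msg);
}

/* D(x) expands to x in ordinary debug builds and to nothing when the
 * (hypothetical) OS3_BUILD symbol is defined. */
#if defined(OS3_BUILD)
#warning launcher debug output disabled for the OS3 build; TODO: investigate
#define D(x)
#else
#define D(x) x
#endif

static void launcher_bump(int *open_count)
{
    (*open_count)++;
    D(log_line("lib_OpenCnt bumped"));   /* vanishes when OS3_BUILD is set */
}
```

The counter update itself is outside `D()`, so disabling the debug line does not change the program's behaviour, only its output.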

     
    Last edit: kas1e 2014-04-24
  • xenic

    xenic - 2014-04-24

    @kas1e
    O.K. if you would rather comment out the debug lines that cause crashes, go ahead and commit that. Either way Os3 version should be O.K. I'm still working on those damn link problems.

     
  • kas1e

    kas1e - 2014-04-26

    @Xenic
    Commented those out with warnings in the repo, so the OS3 version will be OK now. Though it's pretty interesting why it fails. It could be something about the clib/libnix/newlib differences between the OS3 and OS4 builds. Or maybe OS3 needs some library opened explicitly somewhere which is opened automatically on OS4. For sure it's something about process communication.

    But as the OS4 printfs also sometimes return garbage, it could also be a general problem with our debug printfs. The fancy numbers in the 64-bit debug output when it is sent to serial also make it feel like something is wrong somewhere with the debug calls.

     
    Last edit: kas1e 2014-04-26
  • tomsmart1

    tomsmart1 - 2014-04-29

    I found the time to test here on a real classic with OS3.9. I had no hits and no crashes when starting external programs.

    But I get some crashes and hits from the config.module if I test it with the full debug tools MuForce and MuGuardianAngel. Sometimes it crashes after opening its window, sometimes when I press the USE button to test my changes.
    Are you interested in the MuForce logs?

     
  • kas1e

    kas1e - 2014-04-29

    @Tom

    I found the time to test here on a real classic with OS3.9. I had no hits and no crashes when starting external programs.

    Because I fixed them in rev996 (check the post before yours).

    But I get some crashes and hits from the config.module if I test it with the full debug tools MuForce and MuGuardianAngel. Sometimes it crashes after opening its window, sometimes when I press the USE button to test my changes.

    Of course, I just also need full step-by-step instructions on how to reproduce them. And be sure you are on the latest version from dopus5.org.

     
    Last edit: kas1e 2014-04-29
  • tomsmart1

    tomsmart1 - 2014-04-30

    @kas1e

    Because I fixed them in rev996 (check the post before yours).

    No, this was not a fix for it. It worked before too; I tested the nightly builds from 20.04 and 25.04.

    Of course, I just also need full step-by-step instructions on how to reproduce them. And be sure you are on the latest version from dopus5.org.

    This is a problem, because it is a 50/50 chance whether it works or crashes. I tested it nearly 50 times and can't find the trigger at the moment. I will have a closer look tomorrow.

     
  • kas1e

    kas1e - 2014-04-30

    @Tom
    I did some more tests on my WinUAE (a completely plain WB3.1), and noticed some bugs:

    1. When I run it and press "amiga+e", the requester often has some garbage characters in it.
    2. When I quit from it, my WinUAE somehow "blinks", and then, after Dopus5 has exited and I try to run it again, WinUAE crashes.

    The original sasc version didn't have those problems.

    On OS4, though, everything works fine (I mean the OS3 version). Maybe it is something with my WinUAE setup...

     
    Last edit: kas1e 2014-04-30
  • xenic

    xenic - 2014-04-30

    @kas1e
    The OS3 crashes when starting external commands/programs may have only been a bug when OS3 version is run under emulation on OS4 with certain debug output. The debug output in OS3 binary may be O.K. when run on OS3 hardware or maybe even with UAE. Remember, I even had a problem with OS4 debug output that you fixed by eliminating one debug line.

     
  • kas1e

    kas1e - 2014-05-02

    @Tom
    What hardware do you test the OS3 version of Dopus5 on? Are you able to run/exit/run/exit it many times without problems? Do you have any problems when using "amiga+e" to run commands a few times (garbled characters in the typed string)?

    I just tested on WinUAE with pure 3.1 KS/WB and have those problems. Maybe I forgot something or set something up wrong, but the sasc version doesn't have those problems.

     
  • kas1e

    kas1e - 2014-05-02

    Tried recompiling the OS3 version without debug info at all: same problems on WinUAE:

    1. When I press "amiga+e" I get 8 characters of garbage.
    2. When I exit from Dopus5 it trashes something, and when I try to run it again, WinUAE crashes.

    Then tried recompiling the OS3 version without debug info and without -DUSE64_BIT in all modules/program/etc.: the same bugs are still present.

    Btw, often the crash happens just when I quit from Dopus5. On OS4, though, the OS3 version always works fine, i.e. I can run/exit as many times as I wish, and when I press amiga+e there is never any garbage.

     
    Last edit: kas1e 2014-05-02
  • xenic

    xenic - 2014-05-02

    @kas1e
    It's next to impossible to fix a bug that I can't reproduce on my system. However, I did trace the functions being called and have a good guess what's wrong:

    When you enter amiga+e (or the menu item Opus->Execute command), the misc_proc() function in misc_proc.c is called. misc_proc() calls the super_request_args() function, which is just a varargs definition in simplerequest.h that looks like this:

    #define super_request_args(parent,message,...) \
    ({ \
        IPTR __args[] = { __VA_ARGS__ }; \
        (short) super_request(parent, message, NULL, (ULONG *)&__args); \
    })

    The super_request_args() function assumes that the arguments are laid out in memory in a specific order and reads them directly from memory according to the argument flags. I think what is happening is that by coincidence the arguments are laid out as expected on OS4 but not on OS3 hardware or UAE. It could be that the varargs definition for super_request_args() doesn't work the same with the old OS3 GCC compiler as it does with newer GCC compilers.

    The super_request() function is horrible in my view because it accesses memory directly to get arguments instead of passing arguments in a conventional way. I think it just works by coincidence in most parts of Dopus5. The bottom line is that I think I see the problem but can't really debug the problem if I can't reproduce it.
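To make the mechanism concrete, here is a minimal sketch of the pattern described above: a variadic macro packs its arguments into an array, and the callee walks that array according to tag values. All names (`demo_request`, `demo_request_args`, the `TAG_*` constants) are hypothetical and the layout is invented for illustration; it is not the real super_request() argument format. It uses a C99 compound literal instead of the GCC statement expression in the macro quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TAG_END = 0, TAG_TITLE = 1, TAG_BUFFER_LEN = 2 };

struct demo_result {
    const char *title;
    int buffer_len;
};

/* The callee never sees named parameters: it reads tag/value pairs from a
 * flat array and stops at TAG_END. This works only while caller and callee
 * agree on the element width and on the order of the entries. */
static struct demo_result demo_request(const char *message,
                                       const uintptr_t *args)
{
    struct demo_result r = { NULL, 0 };
    assert(message != NULL && args != NULL);
    for (size_t i = 0; args[i] != TAG_END; i += 2) {
        if (args[i] == TAG_TITLE)
            r.title = (const char *)args[i + 1];
        else if (args[i] == TAG_BUFFER_LEN)
            r.buffer_len = (int)args[i + 1];
    }
    return r;
}

/* Pack the variadic arguments into a temporary array and pass its address. */
#define demo_request_args(message, ...) \
    demo_request((message), (const uintptr_t[]){ __VA_ARGS__ })
```

A call such as `demo_request_args("Enter command", TAG_TITLE, (uintptr_t)title, TAG_BUFFER_LEN, 256, TAG_END)` shows the fragile part: nothing checks that each tag is followed by a value of the right kind, so a mismatch between what the macro stores and what the callee expects goes unnoticed until garbage is read.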

     
  • kas1e

    kas1e - 2014-05-02

    @Xenic

    Interesting! Maybe you have an idea how to quickly change it to something else, so I can check on WinUAE whether that is indeed the cause? Btw, we need to check how many times that super_request function is called across Dopus5. Hopefully not very often...

    Btw, do you have any UAE setup on your OS4? If not, I can make a ready-to-use archive with a pure 3.1 OS install that works out of the box for you (i.e. just unpack and run). If you are interested, of course :)

     
  • xenic

    xenic - 2014-05-02

    @kas1e
    I've never been interested in running old Amiga games, so I don't even know if the UAE stuff that's included with OS4 works on my system. I need a break from Dopus5 and am not planning on doing any programming for a while. There's no point in sending me UAE stuff. You'd be better off finding an OS3 programmer who can continue work on OS3 Dopus5.

     
  • tomsmart1

    tomsmart1 - 2014-05-03

    @kas1e

    I have and use an Amiga 2000 with 1MB Chip RAM, a Blizzard 2060 with 80MB RAM and a GVP Spectrum 2MB, normally with OS3.9 BB3, but I also have 3.1 with all the needed fixes (con, ram ...). I use MuLibs and MuTools, and for proper testing MuForce and MuGuardianAngel are active. The only new thing is that I use icon.library V46.4.340 on OS3.1 and 3.9 to show OS4 and PNG icons.

    I tested the nightly builds 20140501 to 20140503 and can't replicate your problem. It works for me with 3.9 and 3.1, with CGX and without, and the icon.library makes no difference either.

    My reported problem with the Buttonconfig window is gone too; it now works 50 out of 50 times, I don't know why.

     
  • kas1e

    kas1e - 2014-05-04

    @Tom
    And can you easily exit/run Dopus5 from the shell as many times as you want?

    Maybe I should install some fixes for OS3.1 to get rid of those issues. I just tried a pure 3.1 install, without any additional patches, and that gives me these problems. I'm just afraid to release it like this until I understand why those 2 problems happen on my WinUAE setup.

    I will also try UAE on OS4 just to see if there are differences.

     
  • tomsmart1

    tomsmart1 - 2014-05-04

    @kas1e

    Yes, but I use a stack size of 8192. If I set it to only 4096, which is the OS3.1 default, then it crashes with many hits by the third or fourth start.

     
  • kas1e

    kas1e - 2014-05-04

    @Tom
    Hm.. I will check tomorrow. And if you set it to 4096, do you get fancy characters when you press "amiga+e"?

     
  • tomsmart1

    tomsmart1 - 2014-05-04

    No fancy characters when I press "amiga+e"; maybe this depends on the WinUAE settings. What settings do you use for testing: CPU, Chip RAM, Fast RAM, ...?

     
  • xenic

    xenic - 2014-05-04

    @kas1e
    I just checked the stack usage for OS3 Dopus5 running under OS4 emulation with Ranger and the stack used when opening the program and doing nothing else is 7516. Even a stacksize of 8192 may not be enough for heavy use. Tom should probably set his up to 16,384. I just checked the stack size in the icon included in the OS3 archive at dopus5.org and it's set to 80,000. I'm not sure how you guys are getting such low stacksizes unless you're using a different icon or running Dopus5 from a shell or script.
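A startup check along these lines could catch an undersized stack before it causes random corruption. This is a portable sketch only: on AmigaOS the two bounds would come from the current task's `tc_SPLower` and `tc_SPUpper` fields, while `stack_size_bytes`, `stack_is_sufficient` and the safety margin are assumptions for illustration, not code from Dopus5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of a stack given its lower and upper bounds (upper is exclusive). */
static size_t stack_size_bytes(uintptr_t sp_lower, uintptr_t sp_upper)
{
    assert(sp_upper >= sp_lower);
    return (size_t)(sp_upper - sp_lower);
}

/* Require the available stack to cover the measured usage times a margin. */
static int stack_is_sufficient(size_t available, size_t measured_use,
                               size_t margin_factor)
{
    return available >= measured_use * margin_factor;
}
```

With the 7516 bytes measured above and a margin factor of 2, stacks of 4096 and 8192 fail the check, while 16384 and 80000 pass.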

     
  • kas1e

    kas1e - 2014-05-05

    @Xenic,Tom

    Doh, everything is fine when I just set the stack to 80000 in the shell, i.e. I can run/exit as many times as I want, and there are no garbled characters in "amiga+e".

    Now the question is: does OS3 have some kind of "stack cookie" like OS4 has, so that even if I don't set a big enough stack manually, some default value (80000) will be used?

    Interesting that the sasc version works without setting the stack manually in the shell.

     
  • tomsmart1

    tomsmart1 - 2014-05-05

    @Xenic
    I use this low stack only for testing. Normally my shell stack is 32KB, and I use a little hack called minstack set to 32k, so all programs run with at least 32KB of stack.

    @kas1e
    As far as I know, the "stack cookie" is only supported on OS3.5 and up.

    The sasc version may use a little less stack than the GCC version at startup, but for full operation it needs the big stack too. Please make a clear statement in the ReadMe and docs that DOpus needs a stack of 80000 if it is started from the shell.

     
  • xenic

    xenic - 2014-05-05

    @kas1e
    Actually, SASC and LIBNIX have stack variables. The original SASC sets the stack at 8192. We still have those settings in Dopus5 but I think they need to be updated because Libnix may be stripping them out. I think 80000 is overdoing the stack, especially on OS3. OS4 needs a bigger stack than OS3. I'll take a look at the stack settings today and let you know what I find.

     