Native icon dragging has started to cause crashes

kas1e
2013-09-18
2013-10-12
  • kas1e
    2013-09-18

    Tested our latest SVN and a new crash has come up in the native icon handling: once we grab and drag any icon on the desktop and then release the mouse button, everything crashes with this stack trace:

    Stack trace:
    (0x682D89E0) native kernel module graphics.library.kmod+0x00076c40
    (0x682D8A40) native kernel module graphics.library.kmod+0x00076f84
    (0x682D8B10) [aos4_ppc_libstubs.c:341] libstub_L_DrawDragList()+0x1c (section 1 @ 0x47C08)
    (0x682D8B20) [backdrop_drag.c:220] backdrop_stop_drag()+0x270 (section 1 @ 0x1D19C)
    (0x682D8B50) [backdrop_buttons.c:397] backdrop_handle_button()+0x380 (section 1 @ 0x1811C)
    (0x682D8BA0) [backdrop_idcmp.c:255] backdrop_idcmp()+0x3e4 (section 1 @ 0x177BC)
    (0x682D8BF0) [event_loop.c:752] event_loop()+0xb84 (section 1 @ 0x11210)
    (0x682D8CD0) [main.c:81] main()+0x1c8 (section 1 @ 0x7694)
    (0x682D8D00) native kernel module newlib.library.kmod+0x000020a4
    (0x682D8D70) native kernel module newlib.library.kmod+0x00002d54
    (0x682D8F10) native kernel module newlib.library.kmod+0x00002ee8
    (0x682D8F50) _start()+0x170 (section 1 @ 0x16C)
    (0x682D8F90) native kernel module dos.library.kmod+0x00024ab4
    (0x682D8FC0) native kernel module kernel+0x0006aa5c
    (0x682D8FD0) native kernel module kernel+0x0006aadc
    

    The DAR register points at CCCCCCE0, which may mean a Node is being freed twice or something like that.

    backdrop_drag.c:220 is:

    DrawDragList(&GUI->drag_screen_rp,&info->window->WScreen->ViewPort,(info->flags&BDIF_CUSTOM_DRAG)?DRAGF_CUSTOM|DRAGF_REMOVE:0);
    

    First I rolled back commit #r595 (icon.module fix): didn't help.
    Then I also rolled back commit #583 (program/backdrop_render.c change): that didn't help either.
    Then I rolled back to commit #563 (Make sure the icons with alpha transparency won't get constantly overdrawn when the state isn't changed, the TRUE/FALSE change): didn't help.
    Then I rolled back to commit #562 (Fixed the size of the background rectangle when the native icons and icon borders are enabled, those 3 on 1 value changes): didn't help.
    Then I rolled back to commit #549 (icon.module: use DrawIconState, so I just disabled it and rebuilt the whole module): that didn't help either.

    wtf ..:( It worked for sure before. Maybe some special settings ..

    EDIT: OK, found something: when "custom icon dragging" is enabled, everything works. But if it is disabled, it doesn't: I can grab and move an icon freely, but once I release the mouse button => crash.
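To illustrate why the setting matters for the call at backdrop_drag.c:220: the conditional there passes DRAGF_CUSTOM|DRAGF_REMOVE only when BDIF_CUSTOM_DRAG is set, and 0 otherwise, so the crashing case is the one where DrawDragList() receives no flags. A minimal sketch of that selection follows. The bit values are made up for illustration (the real ones live in the dopus5 headers), and drag_list_flags and check_drag_list_flags are hypothetical helper names.

```c
#include <assert.h>

/* Bit values below are placeholders for illustration only;
   the real definitions are in the dopus5 sources. */
#define BDIF_CUSTOM_DRAG (1UL << 0)
#define DRAGF_CUSTOM     (1UL << 0)
#define DRAGF_REMOVE     (1UL << 1)

/* Hypothetical helper mirroring the third argument of the DrawDragList() call. */
static unsigned long drag_list_flags(unsigned long info_flags)
{
    /* '|' binds tighter than '?:', so both DRAGF_ bits belong to the "custom" branch. */
    return (info_flags & BDIF_CUSTOM_DRAG) ? (DRAGF_CUSTOM | DRAGF_REMOVE) : 0UL;
}

/* Usage check: custom dragging disabled means "no flags" are passed. */
static void check_drag_list_flags(void)
{
    assert(drag_list_flags(0UL) == 0UL);
    assert(drag_list_flags(BDIF_CUSTOM_DRAG) == (DRAGF_CUSTOM | DRAGF_REMOVE));
}
```

This only shows which flag set reaches DrawDragList() in each mode; it does not explain why the no-flags path crashes.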

     
    Last edit: kas1e 2013-09-18
  • kas1e
    2013-09-18

    @all

    Ok, it's all pretty interesting compared with the first look.

    I assume the problem was there from the beginning and it's an original/legacy bug (the same as the ones in filetype.module and in some places of dopus itself).

    It happens even without native icons, with just the original code, when "custom icon dragging" is DISABLED, i.e. when custom icon dragging is not used.

    The reason I noticed it only now is that just yesterday I put debug.kernel back on OS4, which can catch more bugs than the user one, and also added the "munge" options, which can catch some more problems (that 0xCCCCCCC in DAR shows exactly that the "munge" of debug.kernel works).
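For readers unfamiliar with memory munging, here is a minimal, portable sketch of the idea. It assumes the fill byte is 0xCC (suggested by the DAR value, not confirmed from kernel documentation) and uses a hypothetical node layout with 32-bit link fields. A stale pointer read from filled memory becomes 0xCCCCCCCC, and an access at offset 0x14 from it would touch 0xCCCCCCE0, the address seen in DAR. The offset is picked to match the log, not taken from any real structure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MUNGE_BYTE 0xCC  /* assumed fill pattern, see the note above */

/* Hypothetical node with 32-bit link fields, as on a 32-bit PPC target. */
struct fake_node {
    uint32_t succ;
    uint32_t pred;
};

/* Overwrite a block the way a munging allocator would. */
static void munge(void *mem, size_t size)
{
    assert(mem != NULL);
    memset(mem, MUNGE_BYTE, size);
}

/* Address a load would touch when following the stale succ link and
   reading a field at field_offset. Nothing is dereferenced here. */
static uint32_t fault_address(const struct fake_node *stale, uint32_t field_offset)
{
    assert(stale != NULL);
    return stale->succ + field_offset;
}
```

Without the fill, the same stale read would return whatever happened to be left in memory, which often still looks valid; that would fit the bug staying hidden on the user kernel.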

    So it went like this: before, "custom icon dragging" was always enabled in environment/icons_settings, and everything worked even on debug.kernel. Then, when we enabled native icons, I just saw that they worked and all was OK (but I didn't drag them without custom dragging). Then I put the user kernel on and did tests on it, and of course didn't notice any bugs, as the user kernel can't catch them. And today I just had debug.kernel with custom icon dragging disabled, and voilà: the "munge" of debug.kernel caught a bug and pointed at the line causing it.

    To prove that I am right, I built rev529 fully from scratch (i.e. with no native icons in use for OS4) and tested it with custom icon dragging disabled/enabled on the user kernel: no problems. Then I put debug.kernel on with the "munge" option enabled, rebooted and tested the same build: with custom icon dragging all is OK, without it (i.e. when disabled) the same crash. So it's an original bug which needs to be fixed, which we just missed before and caught only now.

     
  • BSzili
    2013-09-18

    On AROS non-custom dragging shows no icon, just a gray square, so there's definitely some problem with it.

     
  • kas1e
    2013-10-11

    @BSzili

    Today I found a pretty nasty bug related to native icons, which happens when you just try to create a new filetype. To reproduce:

    -- run dopus5
    -- Settings->Filetypes
    -- press "Add" => null_pointer crash.

    Stack trace:
    (0x64711E10) native kernel module newlib.library.kmod+0x0002fbf0
    (0x64711F90) [filetype_editor.c:44] FiletypeEditor()+0xf8 (section 1 @ 0x25560)
    (0x64711FC0) native kernel module kernel+0x0006acd0
    (0x64711FD0) native kernel module kernel+0x0006ad50
    

    The file filetype_editor.c is from configopus.module, and line 44 of it is in the native icons block. There:

        // Get icon image
    #ifdef USE_DRAWICONSTATE
        {
            char *path_copy;
            data->icon_image=NULL;
            // line 44:
            if ((path_copy=AllocMemH(0,strlen(data->type->icon_path)+1)))
            {
                // icon_path is guaranteed to have a .info at the end
                stccpy(path_copy,data->type->icon_path,strlen(data->type->icon_path)-4);
                data->icon_image=GetCachedDiskObject(path_copy,0);
                FreeMemH(path_copy);
            }
        }
    #else
        data->icon_image=OpenImage(data->type->icon_path,0);
    #endif
    

    DAR is 0x00000000, which means a null-pointer access, and one of the registers holds 0xDEADBEEF, which hints at memory trouble.
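Line 44 calls strlen() on data->type->icon_path, and the top frame of the trace is inside newlib, so one candidate (an assumption, not verified here) is that icon_path is NULL for a freshly added filetype. A defensive sketch of the suffix stripping in portable C follows. strip_info_suffix is a hypothetical helper, and malloc/memcpy stand in for AllocMemH/stccpy so the example builds outside the dopus5 tree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INFO_SUFFIX_LEN 5  /* strlen(".info") */

/* Hypothetical helper: returns a malloc'd copy of icon_path with the
   trailing ".info" removed, or NULL if there is nothing usable to copy.
   Like the original code, it trusts that the suffix really is ".info". */
static char *strip_info_suffix(const char *icon_path)
{
    size_t len;
    char *copy;

    if (icon_path == NULL)          /* the case suspected in the crash */
        return NULL;

    len = strlen(icon_path);
    if (len < INFO_SUFFIX_LEN)      /* too short to carry the suffix */
        return NULL;

    assert(len >= INFO_SUFFIX_LEN); /* the subtraction below cannot wrap */
    copy = malloc(len - INFO_SUFFIX_LEN + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, icon_path, len - INFO_SUFFIX_LEN);
    copy[len - INFO_SUFFIX_LEN] = '\0';
    return copy;
}
```

With a guard like this, a NULL icon_path would simply leave data->icon_image as NULL instead of crashing, but whether that is the right behaviour for a new filetype still depends on finding out why the pointer is NULL.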

    I didn't hit it before because I hadn't created new filetypes for a few weeks, so it has been there for quite some time. Commit 623 (where you added the change to configopus.module) was on 2013-09-22, so I assume since then.

     
    Last edit: kas1e 2013-10-11
  • BSzili
    2013-10-11

    I can't reproduce this on AROS, otherwise I wouldn't have committed the code above. You have to find out which variable holds the NULL pointer and why.

     
  • kas1e
    2013-10-11

    In the meantime there is a ticket for it: ticket #20.

     
  • kas1e
    2013-10-12

    @BSzili
    Didn't have time to find the root of the null pointer, so in the meantime I will just add an amigaos4 ifndef around that part, so at least it will not crash until we find out why it's a null pointer.
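Such a guard could look like the sketch below. It assumes the compiler predefines __amigaos4__ on AmigaOS 4 targets. icon_loader_name and check_icon_loader are hypothetical functions that only report and check which branch was compiled in, so the selection can be tried without the dopus5 sources.

```c
#include <assert.h>
#include <string.h>

/* Take the DrawIconState path only when it is requested
   and the target is not AmigaOS 4. */
#if defined(USE_DRAWICONSTATE) && !defined(__amigaos4__)
#define ICON_LOADER "GetCachedDiskObject"
#else
#define ICON_LOADER "OpenImage"
#endif

/* Hypothetical helper: names the loader that the guard selected. */
static const char *icon_loader_name(void)
{
    return ICON_LOADER;
}

/* Usage check: exactly one of the two known loaders is selected. */
static void check_icon_loader(void)
{
    const char *name = icon_loader_name();
    assert(strcmp(name, "OpenImage") == 0 ||
           strcmp(name, "GetCachedDiskObject") == 0);
}
```

The effect on OS4 is the same as building without USE_DRAWICONSTATE for this one block, while other platforms keep the cached disk object path.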

     
    Last edit: kas1e 2013-10-12
  • BSzili
    2013-10-12

    I suggest you compile without USE_DRAWICONSTATE instead.